ResearchHub | Open Science Community

"GrabCut"

Carsten Rother et al., Aug 1, 2004

The problem of efficient, interactive foreground/background segmentation in still images is of great practical importance in image editing. Classical image segmentation tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) information, e.g. Intelligent Scissors. Recently, an approach based on optimization by graph-cut has been developed which successfully combines both types of information. In this paper we extend the graph-cut approach in three respects. First, we have developed a more powerful, iterative version of the optimisation. Secondly, the power of the iterative algorithm is used to simplify substantially the user interaction needed for a given quality of result. Thirdly, a robust algorithm for "border matting" has been developed to estimate simultaneously the alpha-matte around an object boundary and the colours of foreground pixels. We show that for moderately difficult examples the proposed method outperforms competitive tools.

TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context

Jamie Shotton et al., Nov 30, 2007

A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors

Rick Szeliski et al., Apr 23, 2008

Among the most exciting advances in early vision has been the development of efficient energy minimization algorithms for pixel-labeling tasks such as depth or texture computation. It has been known for decades that such problems can be elegantly expressed as Markov random fields, yet the resulting energy minimization problems have been widely viewed as intractable. Algorithms such as graph cuts and loopy belief propagation (LBP) have proven to be very powerful: For example, such methods form the basis for almost all the top-performing stereo methods. However, the trade-offs among different energy minimization algorithms are still not well understood. In this paper, we describe a set of energy minimization benchmarks and use them to compare the solution quality and runtime of several common energy minimization algorithms. We investigate three promising methods (graph cuts, LBP, and tree-reweighted message passing) in addition to the well-known older iterated conditional modes (ICM) algorithm. Our benchmark problems are drawn from published energy functions used for stereo, image stitching, interactive segmentation, and denoising. We also provide a general-purpose software interface that allows vision researchers to easily switch between optimization methods. The benchmarks, code, images, and results are available at http://vision.middlebury.edu/MRF/.
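As a rough illustration of the kind of pixel-labeling energy discussed in this abstract, the following is a minimal sketch of iterated conditional modes (ICM) on a 1-D binary denoising problem with a Potts smoothness prior. The energy form, the weight `lam`, and the function names are assumptions made for illustration; none of them are taken from the paper or from the Middlebury MRF software.

```python
def energy(labels, obs, lam):
    """Data term (label differs from observation) plus Potts smoothness term."""
    data = sum(1 for l, o in zip(labels, obs) if l != o)
    smooth = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return data + lam * smooth


def local_cost(labels, obs, i, k, lam):
    """Energy contribution of assigning label k to site i, neighbours held fixed."""
    cost = 0.0 if k == obs[i] else 1.0
    if i > 0 and labels[i - 1] != k:
        cost += lam
    if i + 1 < len(labels) and labels[i + 1] != k:
        cost += lam
    return cost


def icm(obs, lam=1.5, num_labels=2, max_sweeps=10):
    """Greedy coordinate descent: move each site to a strictly better label."""
    labels = list(obs)  # start from the noisy observation
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(labels)):
            best = min(range(num_labels),
                       key=lambda k: local_cost(labels, obs, i, k, lam))
            if local_cost(labels, obs, i, best, lam) < local_cost(labels, obs, i, labels[i], lam):
                labels[i] = best
                changed = True
        if not changed:  # local minimum reached
            break
    return labels


# Hypothetical noisy binary signal with two isolated outliers.
noisy = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]
result = icm(noisy)
```

ICM only accepts moves that lower the energy, so it stops at a local minimum that depends on the starting labeling; this is the limitation that motivates comparing it against graph cuts and message-passing methods.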

PatchMatch Stereo - Stereo Matching with Slanted Support Windows

Michael Bleyer et al., Jan 1, 2011

Common local stereo methods match support windows at integer-valued disparities. The implicit assumption that pixels within the support region have constant disparity does not hold for slanted surfaces and leads to a bias towards reconstructing frontoparallel surfaces. This work overcomes this bias by estimating an individual 3D plane at each pixel onto which the support region is projected. The major challenge of this approach is to find a pixel’s optimal 3D plane among all possible planes whose number is infinite. We show that an ideal algorithm to solve this problem is PatchMatch [1] that we extend to find an approximate nearest neighbor according to a plane. In addition to PatchMatch’s spatial propagation scheme, we propose (1) view propagation where planes are propagated among left and right views of the stereo pair and (2) temporal propagation where planes are propagated from preceding and consecutive frames of a video when doing temporal stereo. Adaptive support weights are used in matching cost aggregation to improve results at disparity borders. We also show that our slanted support windows can be used to compute a cost volume for global stereo methods, which allows for explicit treatment of occlusions and can handle large untextured regions. In the results we demonstrate that our method reconstructs highly slanted surfaces and achieves impressive disparity details with sub-pixel precision. In the Middlebury table, our method is currently top-performer among local methods and takes rank 2 among approximately 110 competitors if sub-pixel precision is considered.
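To make the idea of a slanted support window concrete, here is a small sketch that evaluates a per-pixel disparity plane over a window. The parametrization `d = a*x + b*y + c` and the function names are assumptions chosen for illustration; the abstract itself only states that a 3D plane is estimated at each pixel.

```python
def plane_disparity(plane, x, y):
    """Disparity at pixel (x, y) under an assumed plane d = a*x + b*y + c."""
    a, b, c = plane
    return a * x + b * y + c


def window_disparities(plane, cx, cy, radius):
    """Disparity of every pixel in the square window centred at (cx, cy)."""
    return [
        [plane_disparity(plane, x, y) for x in range(cx - radius, cx + radius + 1)]
        for y in range(cy - radius, cy + radius + 1)
    ]


# A frontoparallel plane gives one constant disparity over the window,
# which is the implicit assumption of classical local methods.
flat = window_disparities((0.0, 0.0, 5.0), cx=4, cy=4, radius=1)

# A slanted plane gives a different, generally non-integer disparity per pixel.
slanted = window_disparities((0.5, 0.0, 2.0), cx=4, cy=4, radius=1)
```

With the slanted plane, each pixel in the window is matched at its own sub-pixel disparity instead of sharing the disparity of the centre pixel.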

Fast Cost-Volume Filtering for Visual Correspondence and Beyond

Asmaa Hosni et al., Aug 1, 2012

Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge-preserving filter. In this paper, we propose a generic and simple framework comprising three steps: 1) constructing a cost volume, 2) fast cost volume filtering, and 3) Winner-Takes-All label selection. Our main contribution is to show that with such a simple framework state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve 1) disparity maps in real time whose quality exceeds those of all other fast (local) approaches on the Middlebury stereo benchmark, and 2) optical flow fields which contain very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to leverage this framework to other application areas.
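The three steps named in this abstract can be sketched on a single 1-D scanline pair. This is only an illustration under stated assumptions: a plain box filter stands in for the fast edge-preserving filter (it does not preserve edges), the matching cost is a simple absolute intensity difference, and all function names and the `border_cost` value are hypothetical.

```python
def build_cost_volume(left, right, max_disp, border_cost=255.0):
    """Step 1: cost[d][x] = |left[x] - right[x - d]|, or border_cost off-image."""
    width = len(left)
    return [
        [abs(left[x] - right[x - d]) if x - d >= 0 else border_cost
         for x in range(width)]
        for d in range(max_disp + 1)
    ]


def box_filter(row, radius):
    """Step 2 (ASSUMPTION): window mean as a stand-in for an edge-preserving filter."""
    out = []
    for x in range(len(row)):
        window = row[max(0, x - radius): x + radius + 1]  # clipped at the ends
        out.append(sum(window) / len(window))
    return out


def winner_takes_all(volume):
    """Step 3: per pixel, pick the label (disparity) with the lowest filtered cost."""
    width = len(volume[0])
    return [min(range(len(volume)), key=lambda d: volume[d][x])
            for x in range(width)]


# Hypothetical scanlines: the left view is the right view shifted by 2 pixels.
right = [10, 50, 90, 30, 70, 20, 60, 40]
left = [0, 0, 10, 50, 90, 30, 70, 20]

volume = build_cost_volume(left, right, max_disp=3)
filtered = [box_filter(row, radius=1) for row in volume]
disparities = winner_takes_all(filtered)
```

Away from the left border, where part of the window falls outside the image, the selected disparity recovers the shift of 2 pixels. Replacing `box_filter` with a filter guided by the input image is what would align label transitions with color edges.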

Cosegmentation of Image Pairs by Histogram Matching - Incorporating a Global Constraint into MRFs

Carsten Rother et al., Jul 10, 2006

We introduce the term cosegmentation which denotes the task of segmenting simultaneously the common parts of an image pair. A generative model for cosegmentation is presented. Inference in the model leads to minimizing an energy with an MRF term encoding spatial coherency and a global constraint which attempts to match the appearance histograms of the common parts. This energy has not been proposed previously and its optimization is challenging and NP-hard. For this problem a novel optimization scheme which we call trust region graph cuts is presented. We demonstrate that this framework has the potential to improve a wide range of research: Object driven image retrieval, video tracking and segmentation, and interactive image editing. The power of the framework lies in its generality, the common part can be a rigid/non-rigid object (or scene), observed from different viewpoints or even similar objects of the same class.

DSAC — Differentiable RANSAC for Camera Localization

Eric Brachmann et al., Jul 1, 2017

RANSAC is an important algorithm in robust optimization and a central building block for many computer vision applications. In recent years, traditionally hand-crafted pipelines have been replaced by deep learning pipelines, which can be trained in an end-to-end fashion. However, RANSAC has so far not been used as part of such deep learning pipelines, because its hypothesis selection procedure is non-differentiable. In this work, we present two different ways to overcome this limitation. The most promising approach is inspired by reinforcement learning, namely to replace the deterministic hypothesis selection by a probabilistic selection for which we can derive the expected loss w.r.t. all learnable parameters. We call this approach DSAC, the differentiable counterpart of RANSAC. We apply DSAC to the problem of camera localization, where deep learning has so far failed to improve on traditional approaches. We demonstrate that by directly minimizing the expected loss of the output camera poses, robustly estimated by RANSAC, we achieve an increase in accuracy. In the future, any deep learning pipeline can use DSAC as a robust optimization component.
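The probabilistic selection described in this abstract can be illustrated with a toy example. The sketch below assumes that hypotheses are selected with softmax probabilities over their scores, and it treats the scores as free variables rather than outputs of a learned scoring function; the function names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Selection probabilities over hypotheses (ASSUMPTION: softmax of scores)."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_loss(scores, losses):
    """Expectation of the task loss under probabilistic hypothesis selection."""
    return sum(p * l for p, l in zip(softmax(scores), losses))


def expected_loss_grad(scores, losses):
    """Gradient of the expected loss w.r.t. each score: p_j * (loss_j - E[loss])."""
    probs = softmax(scores)
    mean = sum(p * l for p, l in zip(probs, losses))
    return [p * (l - mean) for p, l in zip(probs, losses)]


# Hypothetical scores and task losses (e.g. pose errors) of three hypotheses.
scores = [2.0, 0.5, -1.0]
losses = [0.1, 1.0, 3.0]

grad = expected_loss_grad(scores, losses)
```

A hard argmax over the scores has no useful gradient, whereas the expected loss is smooth in the scores: a gradient step lowers the score of hypotheses whose loss is above the current expectation and raises the others.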

Uncertainty-Driven 6D Pose Estimation of Objects and Scenes from a Single RGB Image

Eric Brachmann et al., Jun 1, 2016

In recent years, the task of estimating the 6D pose of object instances and complete scenes, i.e. camera localization, from a single input image has received considerable attention. Consumer RGB-D cameras have made this feasible, even for difficult, texture-less objects and scenes. In this work, we show that a single RGB image is sufficient to achieve visually convincing results. Our key concept is to model and exploit the uncertainty of the system at all stages of the processing pipeline. The uncertainty comes in the form of continuous distributions over 3D object coordinates and discrete distributions over object labels. We give three technical contributions. Firstly, we develop a regularized, auto-context regression framework which iteratively reduces uncertainty in object coordinate and object label predictions. Secondly, we introduce an efficient way to marginalize object coordinate distributions over depth. This is necessary to deal with missing depth information. Thirdly, we utilize the distributions over object labels to detect multiple objects simultaneously with a fixed budget of RANSAC hypotheses. We tested our system for object pose estimation and camera localization on commonly used data sets. We see a major improvement over competing systems.

Fast cost-volume filtering for visual correspondence and beyond

Christoph Rhemann et al., Jun 1, 2011

Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge preserving filter. In this paper we propose a generic and simple framework comprising three steps: (i) constructing a cost volume (ii) fast cost volume filtering and (iii) winner-take-all label selection. Our main contribution is to show that with such a simple framework state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve (i) disparity maps in real-time, whose quality exceeds those of all other fast (local) approaches on the Middlebury stereo benchmark, and (ii) optical flow fields with very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to leverage this framework to other application areas.

Optimizing Binary MRFs via Extended Roof Duality

Carsten Rother et al., Jun 1, 2007

Many computer vision applications rely on the efficient optimization of challenging, so-called non-submodular, binary pairwise MRFs. A promising graph cut based approach for optimizing such MRFs known as "roof duality" was recently introduced into computer vision. We study two methods which extend this approach. First, we discuss an efficient implementation of the "probing" technique introduced recently by Boros et al. (2006). It simplifies the MRF while preserving the global optimum. Our code is 400-700 times faster on some graphs than the implementation of the work of Boros et al. (2006). Second, we present a new technique which takes an arbitrary input labeling and tries to improve its energy. We give theoretical characterizations of local minima of this procedure. We applied both techniques to many applications, including image segmentation, new view synthesis, super-resolution, diagram recognition, parameter learning, texture restoration, and image deconvolution. For several applications we see that we are able to find the global minimum very efficiently, and considerably outperform the original roof duality approach. In comparison to existing techniques, such as graph cut, TRW, BP, ICM, and simulated annealing, we nearly always find a lower energy.
