This paper describes the DeepSea Stereo vision System which makes the use of high speed 3D images practical in many application domains. This system is based on the DeepSea processor, an ASIC, which computes absolute ...
详细信息
We propose a variational algorithm to jointly estimate the shape, albedo, and light configuration of a Lambertian scene from a collection of images taken from different vantage points. Our work can be thought of as ex...
详细信息
We propose a variational algorithm to jointly estimate the shape, albedo, and light configuration of a Lambertian scene from a collection of images taken from different vantage points. Our work can be thought of as extending classical multi-view stereo to cases where point correspondence cannot be established, or extending classical shape from shading to the case of multiple views with unknown light sources. We show that a first naive formalization of this problem yields algorithms that are numerically unstable, no matter how close the initialization is to the true geometry. We then propose a computational scheme to overcome this problem, resulting in provably stable algorithms that converge to (local) minima of the cost functional. Although we restrict our attention to Lambertian objects with uniform albedo, extensions of our framework are conceivable.
Recently, an optimization approach for fast visual tracking of articulated structures based on Stochastic Meta-Descent (SMD) has been presented. SMD is a gradient descent with local step size adaptation that combines ...
详细信息
A light field consists of images of a scene taken from different viewpoints. Light fields are used in computer graphics for image-based rendering and synthetic aperture photography, and in vision for recovering shape....
详细信息
A light field consists of images of a scene taken from different viewpoints. Light fields are used in computer graphics for image-based rendering and synthetic aperture photography, and in vision for recovering shape. In this paper, we describe a simple procedure to calibrate camera arrays used to capture light fields using a plane + parallax framework. Specifically, for the case when the cameras lie on a plane, we show (i) how to estimate camera positions up to an affine ambiguity, and (ii) how to reproject light field images onto a family of planes using only knowledge of planar parallax for one point in the scene. While planar parallax does not completely describe the geometry of the light field, it is adequate for the first two applications which, it turns out, do not depend on having a metric calibration of the light field. Experiments on acquired light fields indicate that our method yields better results than full metric calibration.
Williams and Titsias (2004) have shown how to carry out unsupervised greedy learning of multiple objects from images (GLOMO), building on the work of Jojic and Frey (2001). In this paper we show that the earlier work ...
详细信息
This paper describes how an ontology consisting of a ground truth schema and a set of annotated training sequences can be used to train the structure and parameters of Bayesian networks for event recognition. It is sh...
详细信息
This paper introduces "Flocks of Features," a fast tracking method for non-rigid and highly articulated objects such as hands. It combines KLT features and a learned foreground color distribution to facilita...
详细信息
We aim to infer 3D body pose directly from human silhouettes. Given a visual input (silhouette), the objective is to recover the intrinsic body configuration, recover the view point, reconstruct the input and detect a...
详细信息
We aim to infer 3D body pose directly from human silhouettes. Given a visual input (silhouette), the objective is to recover the intrinsic body configuration, recover the view point, reconstruct the input and detect any spatial or temporal outliers. In order to recover intrinsic body configuration (pose) from the visual input (silhouette), we explicitly learn view-based representations of activity manifolds as well as learn mapping functions between such central representations and both the visual input space and the 3D body pose space. The body pose can be recovered in a closed form in two steps by projecting the visual input to the learned representations of the activity manifold, i.e., finding the point on the learned manifold representation corresponding to the visual input, followed by interpolating 3D pose.
We propose an algorithm for articulated human motion segmentation that estimates parametric motions of body parts and segments images into moving regions accordingly. Our approach combines robust optical flow estimati...
详细信息
Markov random field models provide a robust and unified framework for early vision problems such as stereo, optical flow and image restoration. Inference algorithms based on graph cuts and belief propagation yield acc...
详细信息
Markov random field models provide a robust and unified framework for early vision problems such as stereo, optical flow and image restoration. Inference algorithms based on graph cuts and belief propagation yield accurate results, but despite recent advances are often still too slow for practical use. In this paper we present new algorithmic techniques that substantially improve the running time of the belief propagation approach. One of our techniques reduces the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel, which is important for problems such as optical flow or image restoration that have a large label set. A second technique makes it possible to obtain good results with a small fixed number of message passing iterations, independent of the size of the input images. Taken together these techniques speed up the standard algorithm by several orders of magnitude. In practice we obtain stereo, optical flow and image restoration algorithms that are as accurate as other global methods (e.g., using the Middlebury stereo benchmark) while being as fast as local techniques.
暂无评论