ISBN: 9781424439928 (print)
We present a focus-based method to recover the orientation of a textured planar surface patch from a single image. The method exploits the relationship between the orientation of equifocal (i.e. uniformly-blurred) contours in the image and the plane's tilt and slant angles. Compared to previous methods that determine planar orientation, we make fewer assumptions about the texture and remove the restriction that images must be acquired through a pinhole aperture. Our method estimates slant and tilt of an image patch in a single image, as compared to depth from defocus methods that require two or more input images. Experiments are performed using a large set of test images.
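As a rough illustration of the geometry involved (not the authors' estimator), the sketch below converts slant and tilt into a unit surface normal under the common convention that slant is the angle between the normal and the optical axis and tilt is the image-plane direction of steepest depth change. It also assumes that uniformly blurred contours coincide with constant-depth lines on the plane, which run perpendicular to the tilt direction. The function names are hypothetical.

```python
import math

def plane_normal(slant_deg, tilt_deg):
    """Unit surface normal in camera coordinates (z along the optical axis).

    Assumed convention: slant is measured from the optical axis,
    tilt is measured in the image plane from the x axis.
    """
    s = math.radians(slant_deg)
    t = math.radians(tilt_deg)
    return (math.sin(s) * math.cos(t), math.sin(s) * math.sin(t), math.cos(s))

def equifocal_direction(tilt_deg):
    """Image direction of constant depth on the plane (perpendicular to tilt)."""
    t = math.radians(tilt_deg)
    return (-math.sin(t), math.cos(t))

if __name__ == "__main__":
    print(plane_normal(30.0, 45.0))
    print(equifocal_direction(45.0))
```

Under these assumptions, measuring the orientation of an equifocal contour fixes the tilt up to a sign, while the slant has to come from how quickly blur changes across contours.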
ISBN: 9781424439928 (print)
We present a system that combines multiple visual navigation techniques to achieve GPS-denied, non-line-of-sight SLAM capability for heterogeneous platforms. Our approach builds on several layers of vision algorithms, including sparse frame-to-frame structure from motion (visual odometry), a Kalman filter for fusion with inertial measurement unit (IMU) data and a distributed visual landmark matching capability with geometric consistency verification. We apply these techniques to implement a tag-along robot, where a human operator leads the way and a robot autonomously follows. We show results for a real-time implementation of such a system with real field constraints on CPU power and network resources.
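The abstract does not give the filter's state or measurement models, so the following is only a toy scalar sketch of the general idea of Kalman fusion: an IMU-derived velocity drives the prediction, and a visual-odometry position estimate corrects it. All variable names and noise values are assumptions made for illustration.

```python
def kalman_step(x, p, u, z, q, r, dt):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : prior position estimate and its variance
    u    : velocity from the IMU (assumed input)
    z    : position measured by visual odometry (assumed measurement)
    q, r : process and measurement noise variances
    """
    # Predict: integrate the IMU velocity and inflate the uncertainty.
    x_pred = x + u * dt
    p_pred = p + q
    # Update: blend in the visual measurement according to the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new

if __name__ == "__main__":
    x, p = 0.0, 1.0
    for z in (1.2, 2.1, 2.9):
        x, p = kalman_step(x, p, u=1.0, z=z, q=0.01, r=0.5, dt=1.0)
        print(round(x, 3), round(p, 3))
```

A real system of the kind described would use a multi-dimensional state (pose, velocity, sensor biases) and matrix-valued covariances; the scalar form only shows how the two sources are weighted.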
ISBN: 9781424439928 (print)
We propose a novel approach to designing algorithms for object tracking based on fusing multiple observation models. As the space of possible observation models is too large for exhaustive on-line search, this work aims to select models that are suitable for the particular tracking task at hand. During an off-line training stage, observation models from various off-the-shelf trackers are evaluated. From this data, different methods of fusing the observers on-line are investigated, including parallel and cascaded evaluation. Experiments on test sequences show that this evaluation is useful for automatically designing and assessing algorithms for a particular tracking task. Results are shown for face tracking with a handheld camera and hand tracking for gesture interaction. We show that for these cases, combining a small number of observers in a sequential cascade results in efficient algorithms that are both robust and precise.
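To make the idea of a sequential cascade concrete, here is a minimal sketch under stated assumptions: each observer is a hypothetical scoring function over candidate states, and each stage keeps only its top-scoring candidates before the next observer runs. The abstract does not specify the observers or the pruning rule, so both are placeholders.

```python
def cascade(candidates, observers, keep):
    """Run observers in sequence, keeping the best keep[i] candidates after stage i.

    candidates : iterable of candidate states (here, plain numbers)
    observers  : list of scoring functions, higher score means better
    keep       : number of survivors after each stage
    """
    survivors = list(candidates)
    for observer, k in zip(observers, keep):
        survivors = sorted(survivors, key=observer, reverse=True)[:k]
    return survivors

if __name__ == "__main__":
    # Two toy observers that prefer positions near 10 and near 12 respectively.
    coarse = lambda c: -abs(c - 10)
    fine = lambda c: -abs(c - 12)
    print(cascade(range(21), [coarse, fine], keep=[5, 1]))
```

The efficiency argument is that later, typically more expensive, observers only see the few candidates that survived the earlier stages.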
ISBN: 9781424439942 (print)
Surveillance systems involving hundreds of cameras have become very popular. Because camera positions and orientations vary, object appearance changes dramatically across scenes. Traditional appearance-based object classification methods tend to fail in these situations. We approach the problem by designing an adaptive object classification framework that automatically adjusts to different scenes. First, a baseline object classifier is applied to a specific scene, generating training samples with extracted scene-specific features (such as object position). Based on these samples, a bilateral weighted LDA is trained under the guidance of sample confidence. Moreover, we propose a Bayesian-classifier-based method to detect and remove outliers, to cope with the generalization failures that can result from using high-confidence but incorrectly classified training samples. To validate these ideas, we implement the framework in an intelligent surveillance system. Experimental results demonstrate the effectiveness of this adaptive object classification framework.
ISBN: 9781424439928 (print)
Object localization and recognition are important problems in computer vision. However, in many applications, exhaustive search over all object models and image locations is computationally prohibitive. While several methods have been proposed to make either recognition or localization more efficient, few have dealt with both tasks simultaneously. This paper proposes an efficient method for concurrent object localization and recognition based on a data-dependent multi-class branch-and-bound formalism. Existing bag-of-features recognition techniques that can be expressed as weighted combinations of feature counts can be readily adapted to our method. We present experimental results that demonstrate the merit of our algorithm in terms of recognition accuracy, localization accuracy, and speed, compared to baseline approaches including exhaustive search, the implicit shape model (ISM), and efficient subwindow search (ESS). Moreover, we develop two extensions that consider non-rectangular bounding regions (composite boxes and polygons) and demonstrate their ability to achieve higher recognition scores than traditional rectangular bounding boxes.
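The branch-and-bound idea can be illustrated with a deliberately simplified, single-class, one-dimensional analogue in the spirit of efficient subwindow search; it is not the paper's multi-class formulation. The assumption is that the score of a region is a sum of per-position feature weights. A set of candidate intervals is then bounded from above by the positive weights of its largest member plus the negative weights of its smallest member. The function names are hypothetical.

```python
import heapq

def prefix(values):
    """Prefix sums with a leading zero, so sum(values[a:b]) = out[b] - out[a]."""
    out = [0.0]
    for v in values:
        out.append(out[-1] + v)
    return out

def best_interval(weights):
    """Best-first branch and bound for the max-scoring interval [lo, hi], inclusive."""
    n = len(weights)
    if n == 0:
        return None
    pos = prefix([max(w, 0.0) for w in weights])
    neg = prefix([min(w, 0.0) for w in weights])

    def bound(lo_a, lo_b, hi_a, hi_b):
        # Positive weights of the largest interval in the set ...
        upper = pos[hi_b + 1] - pos[lo_a]
        # ... plus negative weights of the smallest one, if it is non-empty.
        if lo_b <= hi_a:
            upper += neg[hi_a + 1] - neg[lo_b]
        return upper

    # Each heap entry is (-upper_bound, lo range, hi range).
    heap = [(-bound(0, n - 1, 0, n - 1), 0, n - 1, 0, n - 1)]
    while heap:
        neg_ub, lo_a, lo_b, hi_a, hi_b = heapq.heappop(heap)
        if lo_a == lo_b and hi_a == hi_b:
            # A single interval: its bound is its exact score, so it is optimal.
            return lo_a, hi_a, -neg_ub
        # Split the wider of the two ranges in half.
        if lo_b - lo_a >= hi_b - hi_a:
            mid = (lo_a + lo_b) // 2
            children = [(lo_a, mid, hi_a, hi_b), (mid + 1, lo_b, hi_a, hi_b)]
        else:
            mid = (hi_a + hi_b) // 2
            children = [(lo_a, lo_b, hi_a, mid), (lo_a, lo_b, mid + 1, hi_b)]
        for c in children:
            if c[0] <= c[3]:  # the set still contains an interval with lo <= hi
                heapq.heappush(heap, (-bound(*c), *c))
    return None

if __name__ == "__main__":
    print(best_interval([-1.0, 2.0, 3.0, -4.0, 1.0]))
```

Because the bound never underestimates the best score in a set and is exact for a single interval, the first single interval popped from the queue is a global maximizer. Searching over object classes at the same time, as the abstract describes, would add a class dimension to the candidate sets; that is not shown here.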
ISBN: 9781424439928 (print)
We address the problem of label assignment in computer vision: given a novel 3-D or 2-D scene, we wish to assign a unique label to every site (voxel, pixel, superpixel, etc.). To this end, the Markov Random Field framework has proven to be a model of choice as it uses contextual information to yield improved classification results over locally independent classifiers. In this work we adapt a functional gradient approach for learning high-dimensional parameters of random fields in order to perform discrete, multi-label classification. With this approach we can learn robust models involving high-order interactions better than the previously used learning method. We validate the approach in the context of point cloud classification and improve the state of the art. In addition, we successfully demonstrate the generality of the approach on the challenging vision problem of recovering 3-D geometric surfaces from images.
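For orientation, a random field over sites with unary, pairwise, and higher-order terms is commonly written as the energy below. The notation is an assumption made here for illustration (the abstract gives none): $y_i$ is the label of site $i$, $\mathcal{E}$ the set of neighboring site pairs, and $\mathcal{C}$ the set of higher-order cliques.

```latex
E(\mathbf{y} \mid \mathbf{x}) =
  \sum_{i} \phi_i(y_i, \mathbf{x})
  + \sum_{(i,j) \in \mathcal{E}} \phi_{ij}(y_i, y_j, \mathbf{x})
  + \sum_{c \in \mathcal{C}} \phi_c(\mathbf{y}_c, \mathbf{x}),
\qquad
\hat{\mathbf{y}} = \arg\min_{\mathbf{y}} E(\mathbf{y} \mid \mathbf{x})
```

Learning in this setting means fitting the potentials $\phi$ from labeled scenes; the third sum is where high-order interactions enter.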
ISBN: 9781424439928 (print)
Markov random field (MRF, CRF) models are popular in computer vision. However, in order to be computationally tractable, they are limited to incorporating only local interactions and cannot model global properties, such as connectedness, which is a potentially useful high-level prior for object segmentation. In this work, we overcome this limitation by deriving a potential function that enforces the output labeling to be connected and that can naturally be used in the framework of recent MAP-MRF LP relaxations. Using techniques from polyhedral combinatorics, we show that a provably tight approximation to the MAP solution of the resulting MRF can still be found efficiently by solving a sequence of max-flow problems. The efficiency of the inference procedure also allows us to learn the parameters of an MRF with global connectivity potentials by means of a cutting plane algorithm. We experimentally evaluate our algorithm on both synthetic data and on the challenging segmentation task of the PASCAL VOC 2008 data set. We show that in both cases the addition of a connectedness prior significantly reduces the segmentation error.
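The sketch below only illustrates the global property that the connectivity potential is meant to enforce, not the potential or the LP relaxation themselves. It checks whether the foreground of a binary labeling forms a single component, assuming 4-connectivity on a pixel grid (the abstract does not state the neighborhood). A local pairwise model cannot express this test, which is why it counts as a global prior.

```python
from collections import deque

def is_connected(mask):
    """Return True if the nonzero cells of a 2-D mask form one 4-connected component."""
    cells = {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}
    if not cells:
        return True  # an empty foreground is treated as trivially connected
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    # Breadth-first search from one foreground cell.
    while queue:
        r, c = queue.popleft()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(cells)

if __name__ == "__main__":
    print(is_connected([[1, 1, 0], [0, 1, 0]]))
    print(is_connected([[1, 0, 1]]))
```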
ISBN: 9781424439928 (print)
Extremely crowded scenes present unique challenges to video analysis that cannot be addressed with conventional approaches. We present a novel statistical framework for modeling the local spatio-temporal motion pattern behavior of extremely crowded scenes. Our key insight is to exploit the dense activity of the crowded scene by modeling the rich motion patterns in local areas, effectively capturing the underlying intrinsic structure they form in the video. In other words, we model the motion variation of local space-time volumes and their spatial-temporal statistical behaviors to characterize the overall behavior of the scene. We demonstrate that by capturing the steady-state motion behavior with these spatio-temporal motion pattern models, we can naturally detect unusual activity as statistical deviations. Our experiments show that local spatio-temporal motion pattern modeling offers promising results in real-world scenes with complex activities that are hard for even human observers to analyze.
ISBN: 9781424439928 (print)
Structured outputs such as multidimensional vectors or graphs are frequently encountered in real-world pattern recognition applications such as computer vision, natural language processing, or computational biology. This motivates the learning of functional dependencies between spaces with complex, interdependent inputs and outputs, as arise, for example, from images and their corresponding 3D scene representations. In this spirit, we propose a new structured learning method, Structured Output-Associative Regression (SOAR), that models not only the input-dependency but also the self-dependency of outputs, in order to provide an output re-correlation mechanism that complements the (more standard) input-based regressive prediction. The model is simple but powerful and, in principle, applicable in conjunction with any existing regression algorithm. SOAR can be kernelized to deal with non-linear problems, and learning is efficient via primal/dual formulations not unlike those used for kernel ridge regression or support vector regression. We demonstrate that the method outperforms weighted nearest neighbor and regression methods for the reconstruction of images of handwritten digits and for 3D human pose estimation from video in the HumanEva benchmark.
ISBN: 9781424439928 (print)
In factorization approaches to nonrigid structure from motion, the 3D shape of a deforming object is usually modeled as a linear combination of a small number of basis shapes. The original approach to simultaneously estimating the shape basis and nonrigid structure exploited orthonormality constraints for metric rectification. Recently it has been asserted that structure recovery through orthonormality constraints alone is inherently ambiguous and cannot result in a unique solution. This assertion has been accepted as conventional wisdom and is the justification for many remedial heuristics in the literature. Our key contribution is to prove that orthonormality constraints are in fact sufficient to recover the 3D structure from image observations alone. We characterize the true nature of the ambiguity in using orthonormality constraints for the shape basis and show that it has no impact on structure reconstruction. We conclude from our experimentation that the primary challenge in using a shape basis for nonrigid structure from motion is the difficulty of the optimization problem rather than the ambiguity in the orthonormality constraints.
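The shape-basis model and the orthonormality constraints mentioned above are usually written as follows. The notation is assumed here, since the abstract gives none: $S_t \in \mathbb{R}^{3 \times P}$ is the shape of $P$ points in frame $t$, $B_k$ are the $K$ basis shapes, $c_{tk}$ the per-frame coefficients, $R_t \in \mathbb{R}^{2 \times 3}$ the orthographic camera rotation, and $W_t$ the centered image measurements.

```latex
S_t = \sum_{k=1}^{K} c_{tk} B_k,
\qquad
W_t = R_t S_t = \sum_{k=1}^{K} c_{tk} R_t B_k,
\qquad
R_t R_t^{\top} = I_{2}
```

Stacking $W_t$ over all frames gives a measurement matrix of rank at most $3K$, which is what the factorization exploits; the last equation is the orthonormality constraint used for metric rectification.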