ISBN: (print) 0818672587
We present the Incremental Focus of Attention (IFA) architecture for adding robustness to software-based, real-time motion trackers. The framework provides a structure that, when given the entire camera image to search, efficiently focuses the attention of the system onto a narrow set of possible states that includes the target state. IFA offers a means for automatic tracking initialization, and for reinitialization when environmental conditions momentarily deteriorate and cause the system to lose track of its target. Systems based on the framework degrade gracefully as various assumptions about the environment are violated. In particular, multiple tracking algorithms are layered so that when one algorithm fails, a less precise algorithm takes over, allowing the system to keep returning approximate feature state information.
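The layered fallback described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two layers (`coarse_search`, `precise_track`), the 1-D "image" of brightness values, the brightness threshold, and the function `ifa_step` are all hypothetical names and choices made for illustration only.

```python
from typing import Callable, List, Optional, Tuple

Window = Tuple[int, int]  # inclusive pixel range believed to contain the target
Layer = Callable[[List[int], Optional[Window]], Optional[Window]]


def coarse_search(image: List[int], _window: Optional[Window]) -> Optional[Window]:
    """Lowest layer (hypothetical): search the whole image, return a wide window."""
    peak = max(range(len(image)), key=image.__getitem__)
    return (max(peak - 2, 0), min(peak + 2, len(image) - 1))


def precise_track(image: List[int], window: Optional[Window]) -> Optional[Window]:
    """Top layer (hypothetical): exact position, but only for a clearly bright target."""
    if window is None:
        return None
    lo, hi = window
    peak = max(range(lo, hi + 1), key=image.__getitem__)
    # Assumed failure test: the precise layer gives up on a dim target.
    return (peak, peak) if image[peak] >= 200 else None


def ifa_step(layers: List[Layer], image: List[int], level: int,
             window: Optional[Window]) -> Tuple[int, int, Optional[Window]]:
    """Process one frame. Returns (level_to_try_next, level_used, estimate)."""
    while level >= 0:
        estimate = layers[level](image, window)
        if estimate is not None:
            # Success: report this estimate and try one layer higher next frame.
            return min(level + 1, len(layers) - 1), level, estimate
        level -= 1  # Failure: fall back to the next less precise layer.
    return 0, -1, None  # Every layer failed; restart from whole-image search.
```

Called once per frame with `layers = [coarse_search, precise_track]`, the sketch starts from a whole-image search, climbs to the precise layer while it succeeds, and drops back to the coarse window (an approximate answer rather than none) when the target dims.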
Unconstrained illumination and pose variation lead to significant variation in photographs of faces and constitute a major hurdle preventing the widespread use of face recognition systems. The challenge is to generalize from a limited number of images of an individual to a broad range of conditions. Recently, advances in modeling the effects of illumination and pose have been made using three-dimensional (3-D) shape information coupled with reflectance models. Notable developments in understanding the effects of illumination include the nonexistence of illumination invariants, a characterization of the set of images of objects in fixed pose under variable illumination (the illumination cone), and the introduction of spherical harmonics and low-dimensional linear subspaces for modeling illumination. To generalize to novel conditions, either multiple images must be available to reconstruct 3-D shape or, if only a single image is accessible, prior information about the 3-D shape and appearance of faces in general must be used. The 3-D Morphable Model was introduced as a generative model to predict the appearances of an individual while using a statistical prior on shape and texture, allowing its parameters to be estimated from a single image. Based on these new understandings, face recognition algorithms have been developed to address the joint challenges of pose and lighting. In this paper, we review these developments and provide a brief survey of the resulting face recognition algorithms and their performance.
ISBN: (print) 0818672587
The problem of finding the closest point in high-dimensional spaces is common in computational vision. Unfortunately, the complexity of most existing search algorithms, such as k-d tree and R-tree, grows exponentially with dimension, making them impractical for dimensionality above 15. In nearly all applications, the closest point is of interest only if it lies within a user-specified distance ε. We present a simple and practical algorithm to efficiently search for the nearest neighbor within Euclidean distance ε. Our algorithm uses a projection search technique along with a novel data structure to dramatically improve performance in high dimensions. A complexity analysis is presented which can help determine ε in structured problems. Benchmarks clearly show the superiority of the proposed algorithm for high-dimensional search problems frequently encountered in machine vision, such as real-time object recognition.
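The general idea of a projection search for the nearest neighbor within ε can be sketched as follows. This is only an illustration under stated assumptions, not the paper's data structure: it sorts the points along the first coordinate, cuts a slab of half-width ε by binary search, trims the candidates one coordinate at a time, and finishes with an exhaustive Euclidean test. The helper names `build_index` and `nearest_within_eps` are hypothetical.

```python
from bisect import bisect_left, bisect_right
from math import dist


def build_index(points):
    """Sort point ids by their first coordinate (done once, offline)."""
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    keys = [points[i][0] for i in order]
    return keys, order


def nearest_within_eps(points, index, query, eps):
    """Return the id of the closest point within Euclidean distance eps, or None."""
    keys, order = index
    # Slab along coordinate 0, located by binary search on the sorted keys.
    lo = bisect_left(keys, query[0] - eps)
    hi = bisect_right(keys, query[0] + eps)
    candidates = order[lo:hi]
    # Trim the candidate list one coordinate at a time; the survivors lie
    # inside a hypercube of side 2 * eps centred on the query.
    for d in range(1, len(query)):
        candidates = [i for i in candidates if abs(points[i][d] - query[d]) <= eps]
        if not candidates:
            return None
    if not candidates:
        return None
    # Exhaustive Euclidean test on the remaining points: the hypercube
    # contains the eps-ball, so its corners must still be rejected.
    best = min(candidates, key=lambda i: dist(points[i], query))
    return best if dist(points[best], query) <= eps else None
```

A smaller ε leaves fewer candidates after the slab cuts, which is why the choice of ε governs the cost of a query in this kind of scheme.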
ISBN: (print) 0818672587
The paper presents an analysis of the stability of pose estimation. The investigated pose estimation technique is based on the orientations of three edge segments and provides the rotation part of the object pose. The analysis emphasizes how stability varies with viewpoint relative to an object. The stability investigation propagates the uncertainty in edge segment orientations to the resulting effect on the pose parameters. It is shown that noise sensitivity varies very strongly over the range of viewpoints, and that the viewpoints offering the highest robustness to noise can be determined in advance. Experiments on real images verify the theoretical results and show that, depending on viewpoint, pose parameter variance varies from 0.05 to 20 (degrees squared).
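One standard way to propagate measurement uncertainty to estimated parameters is first-order (linearized) covariance propagation. Whether the paper uses exactly this form is not stated in the abstract, and the symbols below are assumptions introduced here: θ collects the three edge segment orientations, p the rotation parameters, and J is the Jacobian of the pose estimate with respect to the orientations.

```latex
\Sigma_{p} \approx J \, \Sigma_{\theta} \, J^{\top},
\qquad
J = \frac{\partial p}{\partial \theta}
```

Under this assumption the diagonal of Σ_p gives the variance of each pose parameter, and because J depends on the viewpoint, the same orientation noise can produce very different pose variances at different viewpoints.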
ISBN: (print) 9781424439942
This paper introduces a new method for object recognition based on a recurrent neural network (RNN) trained in a supervised mode. The RNN takes 3-dimensional laser scanner data as sequential input, in the natural temporal order in which the laser returns arrive at the scanner. The method is illustrated on a two-class problem with real data.
ISBN: (print) 0818672587
This paper proposes a method for detecting obstacles on a runway by controlling their expected disparities. By approximating the runway by a planar surface, the initial model flow field (MFF) corresponding to an obstacle-free runway is described by the data from onboard sensors (OBS). The error variance of the initial MFF is computed and used to estimate the MFF. Obstacles are detected by comparing the expected residual flow disparities with the residual flow field (RFF) estimated after warping (or stabilizing) an image using the MFF. Expected temporal and spatial disparities are obtained from the use of the OBS. This allows us to control the residual disparities by increasing the temporal baseline and/or by utilizing the spatial baseline if distant objects cannot be detected for a given temporal baseline. Experimental results for two real flight image sequences are presented.
A complete scheme for totally unconstrained handwritten word recognition based on a single contextual hidden Markov model (HMM) is proposed. The scheme includes a morphology- and heuristics-based segmentation algorith...
ISBN: (print) 0780342364
The success of an intelligent robotic system depends on the performance of its vision system, which in turn depends to a great extent on the quality of its calibration. During the execution of a task, the vision system is subject to external influences such as vibrations and thermal expansion, which affect and possibly invalidate the initial calibration. Moreover, parameters of the vision system such as zoom or focus may be altered intentionally in order to perform specific vision tasks. This paper describes a technique for automatically maintaining the calibration of stereo vision systems over time without reusing any particular calibration apparatus. It uses all available information, i.e. both spatial and temporal data. Uncertainty is systematically manipulated and maintained. Synthetic and real data are used to validate the proposed technique, and the results compare very favourably with those given by classical calibration methods.
ISBN: (print) 0780342364
In many vision problems, we want to infer two (or more) hidden factors which interact to produce our observations. We may want to disentangle illuminant and object colors in color constancy; rendering conditions from surface shape in shape-from-shading; face identity and head pose in face recognition; or font and letter class in character recognition. We refer to these two factors generically as "style" and "content". Bilinear models offer a powerful framework for extracting the two-factor structure of a set of observations, and are familiar in computational vision from several well-known lines of research. This paper shows how bilinear models can be used to learn the style-content structure of a pattern analysis or synthesis problem, which can then be generalized to solve related tasks using different styles and/or content. We focus on three tasks: extrapolating the style of data to unseen content classes, classifying data with known content under a novel style, and translating data from novel content classes and style to a known style or content. We show examples from color constancy, face pose estimation, shape-from-shading, typography and speech.
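For reference, a two-factor bilinear model can be written in generic form as below. The notation is an assumption introduced here and is not taken from the abstract: a^s is a style vector, b^c is a content vector, w_{ijk} are interaction weights shared across all styles and content classes, and y^{sc} is the observation produced by style s and content class c.

```latex
y_k^{sc} = \sum_{i=1}^{I} \sum_{j=1}^{J} w_{ijk} \, a_i^{s} \, b_j^{c}
```

The observation is linear in the style vector when the content vector is held fixed, and linear in the content vector when the style vector is held fixed, which is what makes the two factors separable.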
ISBN: (print) 0780342364
We study occluding contour artifacts in area-based stereo matching: they are false responses of the matching operator to the occlusion boundary and cause objects to extend beyond their true boundaries in disparity maps. Most matching methods suffer from these artifacts; the effect is so strong that it cannot be ignored. We show what gives rise to the artifacts and design a matching criterion that accommodates the presence of occlusions, as opposed to methods that identify and remove the artifacts. This approach leads to the problem of measurement contamination studied in statistics. We show that such a problem is hard given finite computational resources, unless more independent measurements directly related to occluding contours are available. What can be achieved is a substantial reduction of the artifacts, especially for large matching templates. Reduced artifacts allow for easier hierarchical matching and for easy fusion of reconstructions from different viewpoints into a coherent whole.