ISBN: 0780342364 (print)
This paper presents a trainable object detection architecture that is applied to detecting people in static images of cluttered scenes. This problem poses several challenges. People are highly non-rigid objects with a high degree of variability in size, shape, color, and texture. Unlike previous approaches, this system learns from examples and does not rely on any a priori (handcrafted) models or on motion. The detection technique is based on the novel idea of the wavelet template that defines the shape of an object in terms of a subset of the wavelet coefficients of the image. It is invariant to changes in color and texture and can be used to robustly define a rich and complex class of objects such as people. We show how the invariant properties and computational efficiency of the wavelet template make it an effective tool for object detection.
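As a rough illustration of why wavelet coefficients can ignore uniform intensity changes, the sketch below computes Haar-style responses over non-overlapping 2x2 blocks of a small grayscale image. It is not the paper's wavelet template: the function name `haar_2x2`, the block size, and the toy images are assumptions made for this example only.

```python
def haar_2x2(image):
    """Return (vertical, horizontal, diagonal) Haar-style responses for each
    non-overlapping 2x2 block of a grayscale image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    coeffs = []
    for r in range(0, rows - 1, 2):
        row_out = []
        for c in range(0, cols - 1, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            vertical = (a + d) - (b + e)    # left column minus right column
            horizontal = (a + b) - (d + e)  # top row minus bottom row
            diagonal = (a + e) - (b + d)    # main diagonal minus anti-diagonal
            row_out.append((vertical, horizontal, diagonal))
        coeffs.append(row_out)
    return coeffs


# Toy image (hypothetical data): a dark band above a bright band.
dark = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [10, 10, 10, 10],
    [50, 50, 50, 50],
]
# The same scene with every pixel brightened by a constant offset.
bright = [[value + 100 for value in row] for row in dark]

# Each response is a difference of pixel sums, so the constant offset cancels.
print(haar_2x2(dark) == haar_2x2(bright))
```

Because every response is a difference of sums over the same number of pixels, a uniform brightness shift cancels out. Any further normalization the authors apply is not described in the abstract and is not modeled here.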
ISBN: 0780342364 (print)
This paper presents a new algorithm for detecting objects in images, one of the fundamental tasks of computer vision. The algorithm extends the representational efficiency of eigenimage methods to binary features, which are less sensitive to illumination changes than the gray-level values normally used with eigenimages. Binary features (square subtemplates) are automatically chosen on each training image. Using features rather than whole templates makes the algorithm more robust to background clutter and partial occlusions. Instead of representing the features with real-valued eigenvector principal components, we use binary vector quantization to avoid floating point computations. The object is detected in the image using a simple geometric hash table and Hough transform. On a test of 1000 images, the algorithm works on 99.3%. We present a theoretical analysis of the algorithm in terms of the receiver operating characteristic, which consists of the probabilities of detection and false alarm. We verify this analysis with the results of our 1000-image test, and we use the analysis as a principled way to select some of the algorithm's important operating parameters.
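The receiver operating characteristic mentioned above pairs a probability of detection with a probability of false alarm for each setting of a decision threshold. The sketch below estimates both empirically. The function name `roc_points`, the vote counts, and the thresholds are hypothetical and are not taken from the paper's 1000-image test.

```python
def roc_points(object_scores, clutter_scores, thresholds):
    """For each threshold, estimate the probability of detection (scores from
    true object locations that pass) and the probability of false alarm
    (scores from clutter that pass)."""
    points = []
    for t in thresholds:
        p_detect = sum(s >= t for s in object_scores) / len(object_scores)
        p_false = sum(s >= t for s in clutter_scores) / len(clutter_scores)
        points.append((t, p_detect, p_false))
    return points


# Hypothetical Hough-style vote counts at object and background locations.
object_votes = [9, 8, 8, 7, 5]
clutter_votes = [2, 3, 3, 6, 1]

curve = roc_points(object_votes, clutter_votes, thresholds=[4, 6, 8])
for threshold, p_detect, p_false in curve:
    print(threshold, p_detect, p_false)
```

Raising the threshold lowers both probabilities, which is the trade-off an operating parameter has to balance. The paper derives this curve analytically, whereas the sketch only counts outcomes on sample data.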
The problem of a moving robot tracking a moving object with its cameras, without requiring the ability to recognize the target to distinguish it from distracting surroundings, is examined. A novel aspect of the approa...
ISBN: 0818672587 (print)
We describe a novel technique for face recognition based on deformable intensity surfaces which incorporates both the shape and texture components of the 2D image. The intensity surface of the facial image is modeled as a deformable 3D mesh in (x, y, I(x, y)) space. Using an efficient technique for matching two surfaces (in terms of the analytic modes of vibration), we obtain a dense correspondence field (or 3D warp) between two images. The probability distributions of two classes of warps are then estimated from training data: intrapersonal and extrapersonal variations. These densities are then used in a Bayesian framework for image matching and recognition. Experimental results with facial data from the US Army FERET database demonstrate an increased recognition rate over the previous best methods.
ISBN: 0818672587 (print)
A method is described for determining the viewing parameters of randomly acquired projections of asymmetric objects. It extends the common lines algorithm by determining the relative orientation of projections from the location of lines of intersection among the Fourier transforms of the projections in three-dimensional Fourier space. A new technique for finding the lines of intersection in the presence of translational displacement, and for subsequently finding the translational displacement, is presented. A new technique for dealing with noise is also presented. The complete algorithm is described and its efficacy is demonstrated using real data. This technique may be applied to the three-dimensional reconstruction of viruses, molecules, and cells from in vivo images. It also has many other applications, including the reconstruction of underwater scenes, radioastronomy, geoseismic analysis, and portable radiography for medical diagnosis and industrial inspection.
The high-resolution field of view of the human eye only covers a tiny fraction of the total field of view, which allows for great economy in computational resources but forces the visual system to solve other problems...
ISBN: 0818672587 (print)
Despite many successful applications of robust statistics, they have yet to be completely adapted to many computer vision problems. Range reconstruction, particularly in unstructured environments, requires a robust estimator that not only tolerates a large outlier percentage but also tolerates several discontinuities, extracting multiple surfaces in an image region. Observing that random outliers and/or points from across discontinuities increase a hypothesized fit's scale estimate (standard deviation of the noise), our new operator, called MUSE (Minimum Unbiased Scale Estimator), evaluates a hypothesized fit over potential inlier sets via an objective function of unbiased scale estimates. MUSE extracts the single best fit from the data by minimizing its objective function over a set of hypothesized fits and can sequentially extract multiple surfaces from an image region. We show MUSE to be effective on synthetic data modelling small scale discontinuities and in preliminary experiments on complicated range data.
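The idea of scoring a hypothesized fit by the smallest scale estimate over candidate inlier sets can be sketched as follows. This is a simplified stand-in, not the MUSE estimator itself. It assumes constant-valued fits, Gaussian noise, a scale estimate formed by dividing the k-th smallest absolute residual by an approximate expected normal quantile, and a hypothetical minimum inlier count `k_min`. All names and data are invented for the example.

```python
from statistics import NormalDist


def min_scale_estimate(data, fit_value, k_min=3):
    """Score a constant fit by the smallest scale estimate over candidate
    inlier counts k, where the inliers are the k smallest absolute residuals."""
    residuals = sorted(abs(x - fit_value) for x in data)
    n = len(residuals)
    best = float("inf")
    for k in range(k_min, n + 1):
        # Approximate quantile of the k-th smallest |residual| of n Gaussian
        # samples; dividing by it turns the residual into a sigma estimate.
        quantile = 0.5 + 0.5 * k / (n + 1)
        expected = NormalDist().inv_cdf(quantile)
        best = min(best, residuals[k - 1] / expected)
    return best


def best_fit(data, hypotheses, k_min=3):
    """Return the hypothesized fit with the smallest scale estimate."""
    return min(hypotheses, key=lambda h: min_scale_estimate(data, h, k_min))


# Hypothetical range samples: one surface near 1.0, another near 5.0, one outlier.
samples = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 5.0, 5.1, 4.9, 9.0]
chosen = best_fit(samples, hypotheses=[1.0, 3.0, 5.0])
print(chosen)
```

Points from the second surface and the outlier inflate the scale estimate once k grows past the true inlier count, so the minimum over k rewards fits that explain a tight subset of the data. Removing the inliers of the chosen fit and repeating would mirror the sequential extraction the abstract describes.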
Two morphological methods for edge detection in range images are proposed. The first method uses the opening and closing residues of structuring elements in orthogonal directions to detect roof and crease edges, and i...
ISBN: 0818672587 (print)
This paper presents a formal model of an active recognition system that can be programmed by learning. At each time step the system decides between producing an action to generate new data and stopping to issue the name of the object observed. The actions can be directed either towards the external environment or towards the internal perceptual system of the agent. The decision strategy is based on a quantitative evaluation of the system's learning experience. The problem studied is the recognition of chess pieces using a moving camera and a multiscale feature detector. The recognition is difficult because the objects are complex (neither polyhedral nor smooth) and rather similar between classes, especially in certain view configurations. The system uses the information obtained by observing internal state transitions when the camera is moved or when the feature detector scale is changed. A simulation of the agent and the environment is used for experimental measures of the model's performance.
The authors examine how estimates of three-dimensional scene structure, as encoded in a scene disparity map, can be improved by the analysis of the original monocular imagery. They describe the utilization of surface ...