ISBN: 0769512720 (print)
We show how a novel, non-linear representation of edge structure can be used to improve the performance of model matching algorithms and object verification/recognition tasks. Rather than represent the image structure using intensity values or gradients, we use a measure which indicates the orientation of structures at each pixel, together with an indication of how reliable the orientation estimate is. Orientations in flat, noisy regions tend to be penalised whereas those near strong edges are favoured. We demonstrate that this representation leads to more accurate and reliable matching between models and new images, and to better recognition/verification of faces in an access control task.
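One way to picture an orientation estimate paired with a reliability weight is sketched below. The abstract does not give the actual measure, so the weighting function `|g| / (|g| + g0)` and the default choice of `g0` as the mean gradient magnitude are assumptions made purely for illustration, as is the function name.

```python
import numpy as np

def orientation_with_reliability(image, g0=None):
    """Return unit orientation (ux, uy) and a reliability weight in [0, 1) per pixel.

    Hypothetical helper: the weighting function is an assumption, not
    necessarily the measure used in the paper.
    """
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)      # derivatives along rows (y) and columns (x)
    mag = np.hypot(gx, gy)           # gradient magnitude
    if g0 is None:
        g0 = mag.mean()              # assumed soft threshold separating weak from strong edges
    # Non-linear weight: close to 0 in flat regions, approaching 1 near strong edges.
    weight = mag / (mag + g0 + 1e-12)
    safe = np.where(mag > 0, mag, 1.0)   # avoid division by zero where the gradient vanishes
    return gx / safe, gy / safe, weight
```

With this weighting, a pixel whose gradient is well below `g0` contributes little to a match score, while pixels on strong edges contribute almost fully regardless of the exact edge contrast.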
ISBN: 9539676940 (print)
A novel approach to face recognition based on a multi-pose image sequence is presented in this paper. In this approach, faces are represented by their pattern vectors (projections onto eigenfaces) in eigenspace. Instead of recognising a face from a single view, a sequence of images showing face movement (from the left to the right profile) is used for recognition. Pattern vectors corresponding to multiple poses build a trajectory in eigenspace, where each trajectory belongs to one face sequence (profile to profile). In the training phase, sequences of poses construct prototype trajectories, and in the recognition phase, an unknown face trajectory is compared with the prototypes. New matching models are presented and analysed, as well as the influence of some parameters on the recognition ratio.
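The trajectory idea can be sketched as follows. The abstract does not specify the matching models, so the distance used here (mean Euclidean distance between corresponding trajectory points, for sequences of equal length) is an assumption standing in for them, and all function names are hypothetical.

```python
import numpy as np

def fit_eigenspace(images, k):
    """images: (n, d) array of flattened faces. Returns the mean (d,) and k eigenfaces (k, d)."""
    X = np.asarray(images, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions (eigenfaces) of the centred data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def trajectory(sequence, mean, eigenfaces):
    """Project each frame of a pose sequence (t, d) to a pattern vector, giving (t, k)."""
    return (np.asarray(sequence, dtype=float) - mean) @ eigenfaces.T

def trajectory_distance(a, b):
    """Assumed matching model: mean point-wise Euclidean distance, equal-length trajectories."""
    return float(np.linalg.norm(a - b, axis=1).mean())

def recognise(unknown, prototypes):
    """prototypes: dict mapping a label to its prototype trajectory. Returns the nearest label."""
    return min(prototypes, key=lambda name: trajectory_distance(unknown, prototypes[name]))
```

Sequences of different lengths would need resampling or an alignment step before comparison; that is left out of this sketch.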
ISBN: 0769512720 (print)
This paper concerns the segmentation of successive frames of a video sequence. Traditional methods, treating each frame in isolation, are computationally expensive, ignore potentially useful information derived from previous frames, and can lead to instabilities over the sequence. The approach developed here, based on the Region Competition algorithm (Zhu and Yuille, IEEE Trans. PAMI, 1996), employs a mesh of active contour primitives supervised by an MDL energy criterion. Temporal extensions, namely Boundary Momentum, Region Memory, and Optical Boundary Flow, are developed to ease the transition between successive frames. Further enhancements are made by incorporating mechanisms to accommodate the topological discontinuities that can arise during the sequence. The algorithm is demonstrated using a number of synthetic and real video sequences and is shown to provide an efficient method of segmentation which encourages stability across frames and preserves the quality of the original segmentation over the sequence.
ISBN: 0769512720 (print)
Given a collection of images (matrices) representing a "class" of objects we present a method for extracting the commonalities of the image space directly from the matrix representations (rather than from the vectorized representation, as one would normally do in a PCA approach, for example). The general idea is to consider the collection of matrices as a tensor and to look for an approximation of its tensor-rank. The tensor-rank approximation is designed such that the SVD decomposition emerges in the special case where all the input matrices are the repetition of a single matrix. We evaluate the coding technique both in terms of regression, i.e., the efficiency of the technique for functional approximation, and classification. We find that for regression the tensor-rank coding, as a dimensionality reduction technique, significantly outperforms other techniques like PCA. As for classification, the tensor-rank coding is at its best when the number of training examples is very small.
ISBN: 0769512720 (print)
We propose a new method for unsupervised face recognition from time-varying sequences of face images obtained in real-world environments. Two types of forces, attraction and repulsion, operate across the spatio-temporal facial manifolds to autonomously organize the data without relying on any category-specific information provided in advance. Experiments with real-world data gathered over a period of several months and including both frontal and side-view faces were used to evaluate the method, and encouraging results were obtained. The proposed method can be used in video surveillance systems or for content-based information retrieval.
ISBN: 0769512720 (print)
For motion picture special effects, it is often necessary to take a source image of an actor, segment the actor from the unwanted background, and then composite over a new background. The standard approach requires the unwanted background to be a blue screen. While this technique is capable of handling areas where the foreground blends into the background, the physical requirements present many practical problems. This paper presents an algorithm that requires minimal human interaction to segment motion picture resolution images and image sequences. We show that it can be used not only to segment badly lit or noisy blue screen images, but also to segment actors where the background is more varied.
ISBN: 0769512720 (print)
One of the most popular methods to extract useful information from an image sequence is the template matching approach. In this well-known method, the tracking of a certain feature or target over time is based on the comparison of the content of each image with a sample template. We propose a 3D template matching algorithm that is able to track targets corresponding to the projection of 3D surfaces. With only a few hundred subtractions and multiplications per frame, our algorithm provides, in real time, an estimation of the 3D surface pose. The key idea is to compute the difference between the current image content and the visual aspect of the target under the predicted spatial attitude. This difference image is converted into corrections on the 3D location parameters.
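For orientation, the classical 2D template matching that the abstract takes as its starting point can be sketched as an exhaustive sum-of-squared-differences search. This is only the baseline comparison of image content with a sample template, not the proposed 3D algorithm, and the function name is hypothetical.

```python
import numpy as np

def match_template_ssd(image, template):
    """Exhaustive 2D search; returns ((row, col), score) of the lowest-SSD placement."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    ih, iw = image.shape
    th, tw = template.shape
    best_score, best_pos = np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            # Difference between the image patch at (r, c) and the template.
            diff = image[r:r + th, c:c + tw] - template
            score = float(np.sum(diff * diff))
            if score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

The exhaustive search costs one full patch comparison per candidate position, which is the expense that difference-driven correction schemes such as the one described above aim to avoid.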
ISBN: 0769512720 (print)
We develop a view-normalization approach to multi-view face and gait recognition. An image-based visual hull (IBVH) is computed from a set of monocular views and used to render virtual views for tracking and recognition. We determine canonical viewpoints by examining the 3D structure, appearance (texture), and motion of the moving person. For optimal face recognition, we place virtual cameras to capture frontal face appearance; for gait recognition we place virtual cameras to capture a side-view of the person. Multiple cameras can be rendered simultaneously, and camera position is dynamically updated as the person moves through the workspace. Image sequences from each canonical view are passed to an unmodified face or gait recognition algorithm. We show that our approach provides greater recognition accuracy than is obtained using the unnormalized input sequences, and that integrated face and gait recognition provides improved performance over either modality alone. Canonical view estimation, rendering, and recognition have been efficiently implemented and can run at near real-time speeds.
ISBN: 0769512720 (print)
Maximization of mutual information is a powerful method for registering images (and other data) captured with different sensors or under varying conditions, since the technique is robust to variations in the image formation process. On the other hand, the high level of robustness allows false positives when matching over a large search space and also makes it difficult to formulate an efficient search strategy for this case. We describe techniques to overcome these problems by aligning image entropies, which are robust to illumination variation and can be applied to multi-sensor registration. This results in a lower rate of false positives and a more efficient method to search an image for the matching position. The techniques are applied to real imagery and compared to methods based on mutual information and gradients to demonstrate their effectiveness.
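The mutual information baseline that the abstract compares against can be estimated from a joint intensity histogram, as in the minimal sketch below. The bin count and the function name are assumptions, and the sketch covers only the mutual information score, not the entropy-alignment technique the paper proposes.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Histogram estimate of mutual information (in nats) between two equal-sized images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()                 # joint intensity distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of b, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```

Because the score depends only on how consistently intensities co-occur, an image and its contrast-inverted copy still score highly, which illustrates the robustness to image formation that the abstract describes. Registration would evaluate this score over candidate alignments and keep the maximum.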
ISBN: 0769512720 (print)
There have been important recent advances in object recognition through the matching of invariant local image features. However, the existing approaches are based on matching to individual training images. This paper presents a method for combining multiple images of a 3D object into a single model representation. This provides for recognition of 3D objects from any viewpoint, the generalization of models to non-rigid changes, and improved robustness through the combination of features acquired under a range of imaging conditions. The decision of whether to cluster a training image into an existing view representation or to treat it as a new view is based on the geometric accuracy of the match to previous model views. A new probabilistic model is developed to reduce the false positive matches that would otherwise arise due to loosened geometric constraints on matching 3D and non-rigid models. A system has been developed based on these approaches that is able to robustly recognize 3D objects in cluttered natural images in sub-second times.