ISBN: (print) 0769501494
Capturing real motions from video sequences is a powerful method for automatic building of facial articulation models. In this paper, we propose an explanation-based facial motion tracking algorithm based on a piecewise Bezier volume deformation model (PBVD). The PBVD is a suitable model both for the synthesis and the analysis of facial images. It is linear and independent of the facial mesh structure. With this model, basic facial movements, or action units, are interactively defined. By changing the magnitudes of these action units, animated facial images are generated. The magnitudes of these action units can also be computed from real video sequences using a model-based tracking algorithm. However, in order to customize the articulation model for a particular face, the predefined PBVD action units need to be adaptively modified. In this paper, we first briefly introduce the PBVD model and its application in facial animation. Then a multiresolution PBVD-based motion tracking algorithm is presented. Finally, we describe an explanation-based tracking algorithm that takes the predefined action units as the initial articulation model and adaptively improves them during the tracking process to obtain a more realistic articulation model. Experimental results on PBVD-based animation, model-based tracking, and explanation-based tracking are shown in this paper.
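The linearity claimed for the PBVD can be illustrated with a minimal sketch of a single Bezier volume. It assumes the standard trivariate Bernstein formulation, in which the displacement of an embedded point is a weighted sum of control-point displacements, and it treats an action unit as a table of control-point displacements scaled by a magnitude. The function names, the degree-1 volume, and the dictionary layout are illustrative assumptions, not the authors' implementation.

```python
from math import comb


def bernstein(n, i, t):
    """Bernstein basis polynomial B_i^n(t) for t in [0, 1]."""
    return comb(n, i) * (t ** i) * ((1 - t) ** (n - i))


def control_weights(u, v, w, degree=(1, 1, 1)):
    """Weight of every control point for one embedded point (u, v, w)."""
    n, m, l = degree
    return {
        (i, j, k): bernstein(n, i, u) * bernstein(m, j, v) * bernstein(l, k, w)
        for i in range(n + 1)
        for j in range(m + 1)
        for k in range(l + 1)
    }


def displace(point_uvw, control_displacements, degree=(1, 1, 1)):
    """Displacement of an embedded point: linear in the control displacements."""
    out = [0.0, 0.0, 0.0]
    for idx, weight in control_weights(*point_uvw, degree=degree).items():
        d = control_displacements.get(idx, (0.0, 0.0, 0.0))
        for axis in range(3):
            out[axis] += weight * d[axis]
    return tuple(out)


def blend_action_units(point_uvw, action_units, magnitudes, degree=(1, 1, 1)):
    """Sum the displacements of several action units, each scaled by its magnitude."""
    out = [0.0, 0.0, 0.0]
    for unit, magnitude in zip(action_units, magnitudes):
        d = displace(point_uvw, unit, degree=degree)
        for axis in range(3):
            out[axis] += magnitude * d[axis]
    return tuple(out)
```

Because the weights depend only on the embedded point's (u, v, w) coordinates, the same control-point displacements can drive any mesh placed inside the volume, which is one way to read the statement that the model is independent of the facial mesh structure.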
ISBN: (print) 0769501494
This paper presents a novel approach for detection and segmentation of generic shapes in cluttered images. The underlying assumption is that generic man-made objects frequently have surfaces which closely resemble standard model shapes such as rectangles, semicircles, etc. Due to the perspective transformations of optical imaging systems, a model shape may appear differently in the image with various orientations and aspect ratios. The set of possible appearances can be represented compactly by a few vectorial eigenbases that are derived from a small set of model shapes which are affine transformed in a wide parameter range. Instead of the regular boundary of standard models, we apply a vectorial boundary which improves robustness to noise, background clutter and partial occlusion. The detection of generic shapes is realized by detecting local peaks of a similarity measure between the image edge map and an eigenspace combined set of the appearances. At each local maximum, a fast search approach based on a novel representation by an angle space is employed to determine the best matching between models and the underlying subimage. We find that angular representation in multidimensional search corresponds better to Euclidean distance than conventional projection and yields improved classification of noisy shapes. Experiments are performed under various interfering distortions, and robust detection and segmentation are achieved.
ISBN: (print) 0769501494
This paper describes computer vision algorithms to assist in retinal laser surgery, which is widely used to treat leading blindness-causing conditions but only has a 50% success rate, mostly due to a lack of spatial mapping and reckoning capabilities in current instruments. The novel technique described here automatically constructs a composite (mosaic) image of the retina from a sequence of incomplete views. This mosaic will be useful to ophthalmologists for both diagnosis and surgery. The new technique goes beyond published methods in both the medical and computer vision literatures because it is fully automated, models the patient-dependent curvature of the retina, handles large interframe motions, and does not require calibration. At the heart of the technique is a 12-parameter image transformation model derived by modeling the retina as a quadratic surface and assuming a weak perspective camera and rigid motion. Estimating the parameters of this transformation model requires robustness to unmatchable image features and mismatches between features caused by large interframe motions. The described estimation technique is a hierarchy of models and methods: the initial match set is pruned based on a 0th order transformation estimated using a similarity-weighted histogram; a 1st order, affine transformation is estimated using the reduced match set and least-median of squares; and the final, 2nd order, 12-parameter transformation is estimated using an M-estimator initialized from the 1st order results. Initial experimental results show the method to be robust and accurate in accounting for the unknown retinal curvature in a fully automatic manner, while preserving image details.
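As a rough illustration of what a 12-parameter, 2nd order image transformation can look like, the sketch below maps a point (x, y) through a 2 x 6 coefficient matrix applied to the monomials x^2, xy, y^2, x, y, 1. This monomial layout is an assumption made for the example; the abstract states only the parameter count and the quadratic-surface derivation. The second function shows the least-median-of-squares criterion (the median of squared residuals over a match set) used to score a candidate transformation; it is the scoring step only, not the authors' search or their M-estimator.

```python
from statistics import median


def quadratic_basis(x, y):
    """Monomials up to 2nd order (assumed ordering): x^2, xy, y^2, x, y, 1."""
    return [x * x, x * y, y * y, x, y, 1.0]


def apply_transform(theta, x, y):
    """Map (x, y) through a 2 x 6 coefficient matrix theta (12 parameters)."""
    basis = quadratic_basis(x, y)
    return tuple(sum(c * b for c, b in zip(row, basis)) for row in theta)


def median_squared_residual(theta, matches):
    """Least-median-of-squares score: median squared error over all matches.

    matches is a list of ((x, y), (u, v)) point pairs between two frames.
    """
    residuals = []
    for (x, y), (u, v) in matches:
        px, py = apply_transform(theta, x, y)
        residuals.append((px - u) ** 2 + (py - v) ** 2)
    return median(residuals)


# An affine (1st order) transformation is the special case whose first three
# columns are zero; here it is the identity mapping.
IDENTITY = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
]
```

Scoring by the median rather than the sum means that a minority of gross mismatches does not change the score of a good candidate, which matches the abstract's emphasis on robustness to mismatched features.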
ISBN: (print) 0769501494
Variations in the projection of objects on a 2D image, e.g., due to occlusion and articulation, lead to edge maps which are noisy, contain gaps and spurious elements, and which are deformed. These variations in turn cause variations in the edge map which are typically regularized by the use of a salience measure for each edge element. The use of edge salience, however, is typically faced with two drawbacks. First, salience measures take advantage of boundary continuity, but not of shape continuity, which includes continuity of the interior. Second, while each edge element can only belong to one object boundary, in the computation of salience measures, it often freely contributes to the salience of edges in competing grouping hypotheses as well. We identify both drawbacks with the lack of an explicit intermediate representation between the edge map and grouped object boundaries. We propose that (i) a symmetry map can fully represent the initial edge map so that both boundary and regional continuities can be represented via skeletal/shock continuity; (ii) a re-organization of the edge map in the form of completing gaps, discarding spurious elements, smoothing, and partitioning a contour (grouped set of edge elements) can be represented by transformations on the symmetry map; (iii) the optimal grouping corresponds to the least action path consisting of a sequence of symmetry transforms. The focus of this paper is to define transformations on the symmetry map and illustrate results for them. Specifically, we illustrate how spurious elements can be removed, gaps completed, and parts computed despite significant noise.
ISBN: (print) 0769501494
We show how to learn a concise, interpretable model of scene activity directly from optical flow. The model represents the principal routes and modes of movement in complex scenes such as pedestrian plazas and traffic intersections, and supports a variety of inferences about the observed activities, including annotation, prediction, and anomaly detection. The model takes the form of a novel hidden Markov model generalization that observes a variable number of datapoints per frame (time step). A monotonic entropy-optimizing algorithm determines the parameters and structure of this model, exploiting the duality between learning and compression to produce highly predictive and interpretable models. This approach discovers minimal models of coherent motions and their switching dynamics - without tracking or prior knowledge about the spatial or temporal structure of the scene.
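One way to picture a hidden Markov model that observes a variable number of datapoints per frame is the forward-algorithm sketch below. It rests on assumptions the abstract does not state: datapoints are discrete symbols, and the datapoints within a frame are conditionally independent given the hidden state, so a frame's likelihood is the product of its datapoints' emission probabilities (and an empty frame has likelihood 1). All names are hypothetical, and the entropy-optimizing learning algorithm itself is not reproduced here.

```python
from math import log


def frame_likelihood(state_emission, datapoints):
    """Probability of all datapoints in one frame under one hidden state.

    Assumes conditional independence; an empty frame yields 1.0.
    """
    p = 1.0
    for symbol in datapoints:
        p *= state_emission[symbol]
    return p


def log_likelihood(initial, transition, emission, frames):
    """Scaled forward algorithm over frames holding any number of datapoints.

    initial[i]       : probability of starting in state i
    transition[i][j] : probability of moving from state i to state j
    emission[i][s]   : probability that state i emits symbol s
    frames           : list of lists of symbols, one list per time step
    Raises ValueError (log of zero) if the sequence has zero probability.
    """
    n = len(initial)
    alpha = [initial[i] * frame_likelihood(emission[i], frames[0]) for i in range(n)]
    total = 0.0
    for t in range(len(frames)):
        if t > 0:
            alpha = [
                sum(alpha[i] * transition[i][j] for i in range(n))
                * frame_likelihood(emission[j], frames[t])
                for j in range(n)
            ]
        scale = sum(alpha)
        total += log(scale)          # accumulate log-probability of this frame
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return total
```

A per-frame log-probability of this kind is the sort of quantity that could support the anomaly detection mentioned in the abstract, since frames that the model explains poorly receive low scores.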
Tomasi and Kanade (1992) introduced the factorization method for recovering 3D structure from 2D video. In their formulation, the 3D shape and 3D motion are computed by using an SVD to approximate a matrix that is rank 3 in a noiseless situation. In this paper we reformulate the problem using the fact that the x and y coordinates of each feature are known from their projection onto the image plane in frame 1. We show how to compute the 3D shape, i.e., the relative depths z, and the 3D motion by a simple factorization of a matrix that is rank 1 in a noiseless situation. This allows the use of very fast algorithms even when using a large number of features and large number of frames. We also show how to accommodate confidence weights for the feature trajectories. This is done without additional computational cost by rewriting the problem as the factorization of a modified matrix.
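To show why a rank-1 matrix admits very fast factorization, the sketch below factors a matrix M into an outer product of two vectors using power iteration, which needs only repeated matrix-vector products rather than a full SVD. Power iteration is a technique chosen here for illustration; the abstract does not say which algorithm the authors use, and the construction of their rank-1 matrix from feature trajectories is not reproduced.

```python
def rank1_factorize(matrix, iterations=50):
    """Approximate matrix by the outer product a b^T using power iteration.

    Returns (a, b) with a of unit length and b carrying the scale.
    Assumes the all-ones start vector is not orthogonal to the dominant
    right singular vector (otherwise the norm below would be zero).
    """
    rows, cols = len(matrix), len(matrix[0])
    b = [1.0] * cols
    a = [0.0] * rows
    for _ in range(iterations):
        # a <- M b, normalized to unit length
        a = [sum(matrix[r][c] * b[c] for c in range(cols)) for r in range(rows)]
        norm = sum(x * x for x in a) ** 0.5
        a = [x / norm for x in a]
        # b <- M^T a
        b = [sum(matrix[r][c] * a[r] for r in range(rows)) for c in range(cols)]
    return a, b


def outer(a, b):
    """Outer product a b^T as a list of rows."""
    return [[x * y for y in b] for x in a]
```

For an exactly rank-1 (noiseless) matrix the product of the returned vectors reproduces the input up to rounding; with noise it gives the dominant rank-1 approximation. Each iteration costs time proportional to the number of matrix entries, which is consistent with the abstract's remark that large numbers of features and frames remain tractable.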
We present an algorithm to solve the sensor planning problem for a trinocular, active vision system. This algorithm uses an iterative optimization method to first solve for the translation between the three cameras and then uses this result to solve for parameters such as the pan and tilt angles of the cameras and the zoom setting.
Signatures can be acquired with a camera-based system with enough resolution to perform verification. This paper presents the performance of a visual-acquisition signature verification system, emphasizing the importance of the parameterisation of the signature in order to achieve good classification results. A technique for estimating the generalization error of the algorithm despite the lack of examples is also described.
The task of multicamera surveillance is to reconstruct the paths taken by all moving objects that are temporally visible from multiple non-overlapping cameras. We present a Bayesian formalization of this task, where the optimal solution is the set of object paths with the highest posterior probability given the observed data. We show how to efficiently approximate the maximum a posteriori solution by linear programming and present initial experimental results.
This paper addresses the problem of constructing the scale-space aspect graph of a solid of revolution whose surface is the zero set of a polynomial volumetric density undergoing a Gaussian diffusion process. Equations for the associated visual event surfaces are derived, and polynomial curve tracing techniques are used to delineate these surfaces. An implementation and examples are presented, and limitations as well as extensions of the proposed approach are discussed.