Capturing real motions from video sequences is a powerful method for automatic building of facial articulation models. In this paper, we propose an explanation-based facial motion tracking algorithm based on a piecewise Bezier volume deformation model (PBVD). The PBVD is a suitable model both for the synthesis and the analysis of facial images. It is linear and independent of the facial mesh structure. With this model, basic facial movements, or action units, are interactively defined. By changing the magnitudes of these action units, animated facial images are generated. The magnitudes of these action units can also be computed from real video sequences using a model-based tracking algorithm. However, in order to customize the articulation model for a particular face, the predefined PBVD action units need to be adaptively modified. In this paper, we first briefly introduce the PBVD model and its application in facial animation. Then a multiresolution PBVD-based motion tracking algorithm is presented. Finally, we describe an explanation-based tracking algorithm that takes the predefined action units as the initial articulation model and adaptively improves them during the tracking process to obtain a more realistic articulation model. Experimental results on PBVD-based animation, model-based tracking, and explanation-based tracking are shown in this paper.
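The linearity claimed for the PBVD model can be illustrated with a minimal sketch. It assumes a single Bezier volume whose control points carry 3D displacements; the function names and the nested-list layout are hypothetical and not taken from the paper. The displacement of a mesh point is a Bernstein-weighted sum of control-point displacements, so scaling an action unit's magnitude scales the resulting displacement by the same factor.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_i^n(t) for t in [0, 1]."""
    return comb(n, i) * (t ** i) * ((1 - t) ** (n - i))

def bezier_volume_displacement(control_disp, u, v, w):
    """Displacement of the point with volume coordinates (u, v, w).

    control_disp[i][j][k] is the (dx, dy, dz) displacement of control
    point (i, j, k). The result is linear in these displacements.
    """
    n = len(control_disp) - 1
    m = len(control_disp[0]) - 1
    l = len(control_disp[0][0]) - 1
    out = [0.0, 0.0, 0.0]
    for i in range(n + 1):
        bu = bernstein(n, i, u)
        for j in range(m + 1):
            bv = bernstein(m, j, v)
            for k in range(l + 1):
                weight = bu * bv * bernstein(l, k, w)
                for axis in range(3):
                    out[axis] += weight * control_disp[i][j][k][axis]
    return out

def apply_magnitude(control_disp, magnitude):
    """Scale an action unit (a set of control displacements) by a magnitude."""
    return [[[tuple(magnitude * c for c in d) for d in row]
             for row in plane] for plane in control_disp]
```

Because the mapping is linear, the magnitudes of several action units could in principle be recovered from observed displacements by linear least squares; that step is not shown here.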
We treat the problem of edge detection as one of statistical inference. Local edge cues, implemented by filters, provide information about the likely positions of edges which can be used as input to higher-level models. Different edge cues can be evaluated by the statistical effectiveness of their corresponding filters, measured on a dataset of 100 presegmented images. We use information theoretic measures to determine the effectiveness of a variety of different edge detectors working at multiple scales on black and white and color images. Our results give quantitative measures for the advantages of multi-level processing, for the use of chromaticity in addition to greyscale, and for the relative effectiveness of different detectors.
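The abstract does not name the information theoretic measures used. As one plausible illustration, the sketch below scores a filter by the Chernoff information between its response histogram on edge pixels and its histogram off edges; the choice of this measure, the function names, and the grid search are all assumptions. A larger value means the two response distributions are easier to tell apart.

```python
from math import log

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete histograms.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def chernoff_information(p, q, steps=1000):
    """Chernoff information between histograms p and q.

    Computed as the maximum over lam in (0, 1) of
    -log(sum_i p_i^lam * q_i^(1 - lam)), using a simple grid search.
    Assumes p and q overlap on at least one bin.
    """
    best = 0.0
    for s in range(1, steps):
        lam = s / steps
        z = sum((pi ** lam) * (qi ** (1 - lam)) for pi, qi in zip(p, q))
        best = max(best, -log(z))
    return best
```

For example, `chernoff_information(on_edge_hist, off_edge_hist)` could be computed for each candidate filter and the filters ranked by the result, where both histograms are normalized to sum to one.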
This paper presents a framework for integrating multiple sensory data, sparse range data and dense depth maps from shape from shading in order to improve the 3D reconstruction of visible surfaces of 3D objects. The integration process is based on propagating the error difference between the two data sets by fitting a surface to that difference and using it to correct the visible surface obtained from shape from shading. A feedforward neural network is used to fit a surface to the sparse data. We also study the use of the extended Kalman filter for supervised learning and compare it with the backpropagation algorithm. A performance analysis is done to obtain the best neural network architecture and learning algorithm. It is found that the integration of sparse depth measurements has greatly enhanced the 3D visible surface obtained from shape from shading in terms of metric measurements.
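The integration step described above can be sketched in one dimension. This is a simplified illustration: a least-squares line stands in for the feedforward neural network, and the function names and data layout are hypothetical. The difference between the sparse range measurements and the shape-from-shading depths is fitted, and the fitted difference is then added to the dense depth profile.

```python
def fit_line(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def integrate(sfs_depth, sparse_range):
    """Correct a dense depth profile using sparse range measurements.

    sfs_depth: list of depths from shape from shading, indexed by position.
    sparse_range: dict mapping a few positions to measured depths
                  (at least two distinct positions are needed).
    """
    positions = sorted(sparse_range)
    # Error difference between the two data sets at the sparse positions.
    diffs = [sparse_range[x] - sfs_depth[x] for x in positions]
    slope, intercept = fit_line(positions, diffs)
    # Propagate the fitted difference to every position.
    return [d + slope * x + intercept for x, d in enumerate(sfs_depth)]
```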
For 3D reconstruction, polynocular stereo based on multiple image fusion is a promising method. We developed a convolver-based nine-eye stereo machine called SAZAN. It performs real-time acquisition of dense depth maps at 20 MDPS (Million Depth-pixels Per Second). The reduction of matching ambiguities, which is the most crucial part in stereo matching, is effectively performed by filtering operations of 2D convolver LSI. Several new ideas and capabilities including a nonlinear data reduction of LoG outputs, an efficient geometric calibration and a subpixel disparity are also implemented in hardware. Considering the hardware size and the various factors that have an influence on the final processing quality, the computational performance is compared with existing stereo systems including the CMU stereo machine.
Background estimation and removal based on the joint use of range and color data produces results superior to those achievable with either data source alone. This is increasingly relevant as inexpensive, real-time, passive range systems become more accessible through novel hardware and increased CPU processing speeds. Range is a powerful signal for segmentation which is largely independent of color and hence not affected by the classic color segmentation problems of shadows and objects with color similar to the background. However, range alone is also not sufficient for good segmentation: depth measurements are rarely available at all pixels in the scene, and foreground objects may be indistinguishable in depth when they are close to the background. Color segmentation is complementary in these cases. Surprisingly, little work has been done to date on joint range and color segmentation. We describe and demonstrate a background estimation method based on a multidimensional (range and color) clustering at each image pixel. Segmentation of the foreground in a given frame is performed via comparison with background statistics in range and normalized color. Important implementation issues such as treatment of shadows and low confidence measurements are discussed in detail.
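The per-pixel comparison against background statistics might look like the following sketch. It is an illustration under assumptions: the thresholds, the function names, and the rule for combining the two cues are hypothetical, and the clustering that estimates the background is not shown. Normalized color (chromaticity) is largely unchanged by a shadow, and a missing depth value simply leaves the decision to color.

```python
from math import hypot

def chromaticity(rgb):
    """Normalized color (r, g) with r + g + b = 1; None for a black pixel."""
    r, g, b = rgb
    total = r + g + b
    if total == 0:
        return None  # chromaticity is undefined, treat as low confidence
    return (r / total, g / total)

def is_foreground(rgb, depth, bg_rgb, bg_depth,
                  depth_tol=0.3, chroma_tol=0.05):
    """Classify one pixel against its background model.

    depth and bg_depth may be None when no range measurement exists.
    depth_tol and chroma_tol are hypothetical thresholds.
    """
    range_says_fg = False
    if depth is not None and bg_depth is not None:
        range_says_fg = abs(depth - bg_depth) > depth_tol

    color_says_fg = False
    c, bg_c = chromaticity(rgb), chromaticity(bg_rgb)
    if c is not None and bg_c is not None:
        color_says_fg = hypot(c[0] - bg_c[0], c[1] - bg_c[1]) > chroma_tol

    # Either cue is allowed to flag the pixel; each covers the other's gaps.
    return range_says_fg or color_says_fg
```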
ISBN: 0818684976 (print)
This work considers the problem of discovering areas of convergence of line-like shapes in an image. The motivating application is to use the convergence of the blood vessel network to automatically locate the optic nerve in an ocular fundus image. A fuzzy segment model is proposed, based on a conjecture that line-like shapes only contribute to a perception of convergence in their near neighborhood. Using this model, a voting-type method is described to compute a convergence image, which can be searched for one absolute, or one or more relative, strongest points of convergence. Results are presented for twenty ocular fundus images, with a 65% success rate for finding the optic nerve.
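A voting-type computation of a convergence image can be sketched as follows. This is a simplified stand-in for the fuzzy segment model, not the paper's formulation: each segment casts one crisp vote per cell along its own line, extended by a limited `reach` beyond both endpoints to reflect the near-neighborhood conjecture. The names, the `reach` parameter, and the unit vote weight are assumptions.

```python
from math import hypot

def convergence_image(segments, width, height, reach=10.0, step=0.5):
    """Accumulate votes from line segments given as (x0, y0, x1, y1)."""
    acc = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1 in segments:
        length = hypot(x1 - x0, y1 - y0)
        if length == 0:
            continue
        ux, uy = (x1 - x0) / length, (y1 - y0) / length
        voted = set()  # each segment votes at most once per cell
        t = -reach
        while t <= length + reach:
            cx = int(round(x0 + ux * t))
            cy = int(round(y0 + uy * t))
            if 0 <= cx < width and 0 <= cy < height and (cx, cy) not in voted:
                voted.add((cx, cy))
                acc[cy][cx] += 1.0
            t += step
    return acc

def strongest_point(acc):
    """Return (x, y) of the cell with the most votes."""
    best, best_xy = -1.0, None
    for y, row in enumerate(acc):
        for x, votes in enumerate(row):
            if votes > best:
                best, best_xy = votes, (x, y)
    return best_xy
```

Searching the accumulator for its absolute maximum corresponds to the single strongest point of convergence; relative maxima could be found with a local-maximum test instead.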
ISBN: 0818684976 (print)
Two important problems in camera control are how to keep a moving camera fixated on a target point, and how to precisely aim a camera, whose approximate pose is known, towards a given 3D position. This paper describes how electronic image alignment techniques can be used to solve these problems, as well as provide other benefits such as stabilized video. Hence, stabilized fixated imagery is obtained despite large latencies in the control loop, even for simple control strategies. These techniques have been tested using an airborne camera and real-time affine image alignment.
ISBN: 0818684976 (print)
We offer a novel strategy to adapt the perceptual organization process to an object and its context in a scene. Given a set of training images of an object in context, a learning process decides on the relative importance of the basic Gestalt relationships such as proximity, parallelness, similarity, symmetry, closure, and common region towards segregating the object from the background. This learning is accomplished using a team of stochastic automata in an N-player cooperative game framework. The grouping process, which is based on graph partitioning, is fast and is able to form large groups from relationships defined over a small set of primitives. We demonstrate the robust performance of the grouping system on a variety of real images. Among the interesting conclusions is the significant role of photometric attributes in grouping and the ability to perform figure-ground segmentation from a set of local relations, each defined over a small number of primitives.
Current systems for content filtering, browsing, and retrieval rely on low-level image descriptors which are unintuitive for most users. In this paper, we propose an alternative framework that exploits the structured nature of most content sources to achieve semantic content characterization, and lead to much more meaningful user interaction. Computationally, this framework is based on the principles of Bayesian inference and can be implemented efficiently with Bayesian networks. As an illustration of its potential we apply it to the domain of movie databases.
ISBN: 0818684976 (print)
The problem considered in this paper is that of estimating the projective transformation between two images in situations where the image motion is large and feature-matching is not aided by a proximity heuristic. The overall algorithm designed is based on a multiresolution, multihypothesis scheme, and similarities between tracking and matching through multiple resolution levels are exploited. Two major tools are developed in this paper: (i) a Bayesian framework for incorporating similarity measures of feature correspondences in regression to specify the different levels of confidence in the correspondences; and (ii) a Bayesian version of RANSAC, which is able to utilise prior estimates and matching probabilities. The algorithm was tested on a number of real images with large image motion and promising results were obtained.
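One way matching probabilities can enter a RANSAC loop is sketched below. To stay short, the sketch estimates a pure 2D translation rather than a projective transformation, so a single correspondence is a minimal sample. The function name, the tolerance, and the scoring rule (sum of inlier probabilities) are assumptions for illustration and are not the paper's Bayesian formulation. Correspondences with a higher matching probability are sampled more often and count for more when a hypothesis is scored.

```python
import random

def weighted_ransac_translation(matches, probs, iterations=100,
                                tol=1.0, seed=0):
    """Estimate a translation from correspondences with match probabilities.

    matches: list of ((x1, y1), (x2, y2)) point correspondences.
    probs:   matching probability for each correspondence.
    Returns (translation, score) for the best hypothesis found.
    """
    rng = random.Random(seed)
    best_t, best_score = None, -1.0
    for _ in range(iterations):
        # Sample a minimal set, favouring likely correspondences.
        (x1, y1), (x2, y2) = rng.choices(matches, weights=probs, k=1)[0]
        tx, ty = x2 - x1, y2 - y1
        # Score the hypothesis by the probability mass of its inliers.
        score = 0.0
        for ((ax, ay), (bx, by)), p in zip(matches, probs):
            if abs(bx - ax - tx) <= tol and abs(by - ay - ty) <= tol:
                score += p
        if score > best_score:
            best_t, best_score = (tx, ty), score
    return best_t, best_score
```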