ISBN: 0819435929 (print)
Model-based image coding has received extensive attention because of its high subjective image quality and low bit rates. However, estimating object motion parameters remains a difficult problem, and there is no proper error criterion for quality assessment that is consistent with visual properties. This paper presents an algorithm for facial motion parameter estimation based on feature point correspondence and gives motion parameter error criteria. The facial motion model comprises three parts: the global 3-D rigid motion of the head, the non-rigid translational motion in the jaw area, and the local non-rigid expression motion in the eye and mouth areas. The feature points are selected automatically by a function of edges, brightness and end-nodes outside the blocks of the eyes and mouth. The number of feature points is adjusted adaptively. The jaw translation is tracked through changes in the positions of the jaw feature points. The areas of non-rigid expression motion can be rebuilt using a block-pasting method. An approach to estimating motion parameter error based on the quality of the reconstructed image is suggested, and an area error function and an error function of the contour transition-turn rate are used as quality criteria. The criteria properly reflect the geometric image distortion caused by errors in the estimated motion parameters.
ISBN: 0819431249 (print)
Model-based image coding is a well-known solution for image communication at very low bit rates. However, these systems involve very complex techniques and a large amount of computation. It is especially difficult to automatically extract the Facial Definition Parameters (FDPs) and Facial Animation Parameters (FAPs) defined in MPEG-4 from 2D images to represent 3D moving objects. In this paper, an algorithm that uses intra- and inter-frame information to estimate feature parameters is proposed. It utilizes spatial information (edge information) as well as the temporal difference between successive frames. Combining the two kinds of information makes the system more robust. Physiological symmetry and proportion are a further kind of knowledge used here to make the system less computationally intensive.
A three-dimensional muscle-based facial expression synthesizer is proposed. The proposed synthesizer will compute the contraction of 19 muscles and rotation of the jaw from 22 feature points estimated by the analyzer, then apply the muscle contraction model to modify the 3D head model. The head motion, including translation and orientation, is also derived from a set of three feature points. Currently the analyzer is implemented on a PC with a camcorder with near real-time performance, and the synthesizer receives the feature points from the analyzer and synthesizes facial expressions on an SGI Indigo in real time.
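The abstract does not say which three feature points are used or how translation and orientation are derived from them. As a minimal sketch under assumed conventions, one common construction takes the centroid of the three points as the head position and builds an orthonormal frame from them as the orientation. The function name `head_pose` and the choice of axes are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def head_pose(p0, p1, p2):
    """Return (position, (x_axis, y_axis, normal)) for three non-collinear 3-D points.

    Assumed convention, not taken from the paper: position is the centroid,
    the x axis runs from p0 to p1, and the normal is perpendicular to the
    plane through the three points.
    """
    position = tuple((a + b + c) / 3.0 for a, b, c in zip(p0, p1, p2))
    x_axis = _normalize(_sub(p1, p0))
    normal = _normalize(_cross(_sub(p1, p0), _sub(p2, p0)))
    y_axis = _cross(normal, x_axis)  # completes a right-handed frame
    return position, (x_axis, y_axis, normal)
```

Comparing the pose of the same three points in two frames gives the relative head translation and rotation between those frames.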
Lightness algorithms, which have been proposed as a model for human vision, aim to recover surface reflectance to a close approximation. They attempt to separate reflectance data from illumination data by thresholding a spatial derivative of image intensity. This, however, works reliably only in a world of plane Mondrians. An extension of the classical lightness approach of Land and McCann to curved surfaces is presented in this paper. Assuming smooth surfaces with Lambertian reflection properties and leaving aside occlusions and cast shadows, the separation of those components of the intensity gradient due to reflectance from those due to irradiance is posed as a constrained minimization problem. To do so, two classification operators are introduced which identify potential reflectance and irradiance data using a scale-space filtering approach. Two exemplary applications of the proposed extended lightness algorithm in the field of visual telecommunications are presented: i) the simulation of more uniformly illuminated videophone portrait scenes to give dynamic-range-compressed images with a highly realistic appearance and ii) the synthesis of videophone portrait images from model-based coded data with a correct illumination effect. In both applications, the extended lightness algorithm is employed for estimating the reflectance functions at facial surfaces. Results obtained by applying the extended lightness algorithm are compared with results obtained by conventional methods known from the literature.
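The classical thresholding step that this abstract starts from can be illustrated in one dimension. The sketch below is not the paper's extended algorithm: it only shows the baseline idea of discarding small derivatives of log intensity (assumed to come from slowly varying illumination) and re-integrating the rest. The function name `lightness_1d` and the threshold value are assumptions.

```python
import math

def lightness_1d(intensity, threshold):
    """Recover relative reflectance along one scan line.

    Small steps in log intensity are treated as illumination and dropped;
    large steps are treated as reflectance edges and kept. The result is
    defined only up to a constant factor (the first sample is set to 1.0).
    """
    logs = [math.log(v) for v in intensity]  # intensities must be positive
    log_reflectance = [0.0]
    for prev, cur in zip(logs, logs[1:]):
        step = cur - prev
        if abs(step) < threshold:
            step = 0.0  # assumed to be an illumination gradient
        log_reflectance.append(log_reflectance[-1] + step)
    return [math.exp(v) for v in log_reflectance]

# Two flat patches (reflectance 1 and 2) under slowly increasing illumination.
row = [1.0, 1.05, 2.2, 2.32]
recovered = lightness_1d(row, threshold=0.1)
```

On a curved surface the shading gradient is no longer small everywhere, which is why this simple threshold is reliable only for flat Mondrian-like scenes, as the abstract notes.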
ISBN: 0819424358 (print)
This paper presents a simple color segmentation technique that could be used in model-based very low bit-rate coding approaches for videophone applications, in which the delimitation of the speaker's face is required. This work attempts to segment the speaker's face using color cues. To take better advantage of the color content of images, the color segmentation is carried out in HSI (Hue, Saturation, Intensity) space, with the three components used in two steps. The original image is first split into two groups of regions, one with higher saturation values and the other with lower saturation values, using an adaptive threshold applied to the saturation histogram. In the high-saturation regions, the hue component can furnish useful references for further segmentation, while in the low-saturation regions the intensity component can play a similar role. For each group of regions, a multi-thresholding technique based on either the hue or the intensity component is then proposed for the subsequent segmentation. After both groups of regions are segmented, the two segmentation results are combined to give the final segmented image. Experiments with images taken from typical "head-and-shoulders" videophone sequences are carried out and some results are presented.
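The first step described above, choosing hue or intensity as the segmentation cue depending on saturation, can be sketched as follows. This is an illustration under assumptions: it uses the standard textbook RGB-to-HSI conversion, and the saturation threshold is passed in as a fixed number, whereas the paper derives it adaptively from the saturation histogram. The names `rgb_to_hsi` and `choose_cue` are hypothetical.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert RGB values in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0
    saturation = 1.0 - min(r, g, b) / intensity
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0:
        hue = 0.0  # hue is undefined for gray pixels; 0 is a convention
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))
        angle = math.degrees(math.acos(ratio))
        hue = 360.0 - angle if b > g else angle
    return hue, saturation, intensity

def choose_cue(pixel, sat_threshold):
    """Return which component would drive further segmentation for this pixel."""
    hue, saturation, intensity = rgb_to_hsi(*pixel)
    if saturation >= sat_threshold:
        return "hue", hue
    return "intensity", intensity
```

The split matters because hue becomes numerically unstable as saturation approaches zero, so low-saturation pixels are better separated by intensity.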
ISBN: 081941638X (print)
A new scheme for fast extraction of facial feature points, based on a priori face knowledge and a shift template method, is presented in this paper. Fairly good accuracy and speed in detecting the feature points of the eyebrows, eyes, nose and mouth are achieved with a pair of complementary templates. Computer simulation shows that the scheme is well suited to very low bit-rate model-based image coding in real-time applications.
This paper addresses a method of extracting human facial features from the head-and-shoulders images used in videophone communications. The basic idea behind the scheme is to make full use of the information in both the temporal and spatial domains to detect facial features. The new approach to facial feature extraction consists of a temporal and a spatial eye-mouth finder (EMF). The temporal technique eliminates the influence of the background and gives a rough but reliable position estimate of the objects of interest, e.g. the eyes and mouth, while the spatial technique produces the accurate result needed in videophone communications. Another characteristic of this method is its simplicity. Some basic operators can be used twice, i.e., not only in the temporal domain but also in the spatial domain. The results on the test image sequences are satisfactory.
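The abstract does not specify the operators used by the temporal finder. As a minimal sketch of the general idea, thresholded frame differencing suppresses the static background and yields a rough bounding box around whatever moved. The function name `moving_region` and the threshold are assumptions, and frames are plain lists of rows of gray levels.

```python
def moving_region(prev_frame, next_frame, threshold):
    """Return (x_min, y_min, x_max, y_max) of pixels that changed, or None.

    A pixel counts as changed when its absolute gray-level difference
    between the two frames exceeds the threshold.
    """
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(prev_frame, next_frame)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if abs(a - b) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing moved between the two frames
    return min(xs), min(ys), max(xs), max(ys)
```

A spatial search for the eyes and mouth could then be restricted to this box, which is the kind of coarse-to-fine split the abstract describes.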
The construction of an accurate 3-D scene model is a fundamental aspect of any model-based image coding scheme. This contribution describes the generation of a triangular facet surface representation from the data acquired by a calibrated binocular (stereo) camera system.
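The abstract gives no detail on how the facets are generated. As a hedged illustration only, the sketch below assumes a rectified pinhole stereo pair, for which depth follows from disparity as Z = f * B / d, and a regular grid of depth samples that is split into two triangles per grid cell. Both function names and the grid layout are assumptions.

```python
def depth_from_disparity(disparity_px, focal_px, baseline):
    """Depth for a rectified stereo pair: Z = f * B / d (d must be nonzero)."""
    return focal_px * baseline / disparity_px

def grid_triangles(rows, cols):
    """Split a rows x cols vertex grid into triangles of vertex indices.

    Vertices are numbered row by row; each grid cell yields two facets.
    """
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            i = r * cols + c  # top-left vertex of the current cell
            triangles.append((i, i + 1, i + cols))
            triangles.append((i + 1, i + cols + 1, i + cols))
    return triangles
```

A practical surface model would normally adapt the facet density to the local surface shape; the uniform grid here is only the simplest starting point.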
This paper addresses the issue of 3-D motion estimation in model-based facial image coding. A new approach to estimating the motion of the head and the facial expressions is presented, with the following characteristics: 1) An affine nonrigid motion model is set up. Specific knowledge about facial shape and facial expression is formulated by this model in the form of parameters. This affine motion model is especially suitable for nonrigid motion such as facial expressions. 2) Based on the affine model, we present a direct method of estimating the two-view motion parameters. Because this method requires neither solving the correspondence problem nor computing optical flow, motion parameters can be recovered simply and reliably. 3) Based on the reasonable assumption that the 3-D motion of the face is almost smooth in the time domain, we propose several approaches to predicting the motion of the next frame. In this way, the temporal motion information in the image sequence is fully exploited. With a good motion predictor, the error arising from treating motion by a linear method is reduced. 4) Using a 3-D model, the new approach is characterized by a feedback loop connecting computer vision and computer graphics. Embedding the synthesis techniques into the analysis phase greatly improves the performance of motion estimation. Our simulations and experiments with long image sequences of real-world scenes indicate that the method not only greatly reduces computational complexity but also substantially improves estimation accuracy. The image sequence synthesized from the estimated motion parameters, a 3-D model of the face, and one frame of textured image looks very natural.
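Two of the ideas above can be made concrete with a small sketch. An affine motion model maps each 3-D point p to A p + t, and a temporally smooth motion can be extrapolated from the previous two frames. The abstract does not state which predictors the authors use, so the constant-velocity predictor below is only one simple assumption, and the function names are hypothetical.

```python
def apply_affine(A, t, p):
    """Apply the affine motion p' = A p + t to a 3-D point.

    A is a 3x3 matrix given as a list of rows; t and p are 3-vectors.
    """
    return tuple(sum(A[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def predict_next(prev_params, curr_params):
    """Constant-velocity extrapolation of motion parameters.

    Assumes each parameter changes by the same amount in the next frame
    as it did in the last one: next = curr + (curr - prev).
    """
    return [2 * c - p for p, c in zip(prev_params, curr_params)]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
moved = apply_affine(identity, (1, 0, 0), (1, 2, 3))
```

Starting the estimation from a predicted motion keeps the remaining frame-to-frame change small, which is the regime in which a linearized motion estimate is most accurate.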
A method for adapting a generic 3-D face model to an actual face in a head-and-shoulders scene is discussed, with application to video-telephony. The adaptation is carried out both on a global scale, to reposition and resize the wire-frame, and on a local scale, to mimic individual physiognomy. To this end, a hierarchical scheme is developed to extract the semantic features in the head-and-shoulders scene, such as the silhouette, face, eyes and mouth, using a knowledge-based selection mechanism. These algorithms, which are to be an integral part of a general model-based image coder, are tested on typical videophone sequences.