ISBN: 0819424358 (print)
A number of object-oriented coding algorithms have been proposed for coding video sequences at low bit rates. Instead of estimating the motion of pixel blocks, these algorithms segment each image into regions of uniform motion and estimate the motion of these regions. Estimating the segmentation and computing the motion parameters are evidently closely related. Most algorithms iteratively compute complex motion parameters and segmentation estimates, and are typically computationally intensive. Image intensity segmentations have also been used instead of motion field segmentations, based on the hypothesis that adjacent pixels with similar luminance values are part of the same object and therefore share common motion parameters. We previously proposed a simple two-stage algorithm in which 1) a translational block-motion field is used to compute a translational motion field and its segmentation, and 2) an optical flow field is then used to compute affine motion parameters for each segmented region. In this paper, we propose to replace the translational block-motion field with another translational motion field, which assigns a motion vector to each region of an image intensity segmentation. This approach combines the advantages of both intensity and motion field segmentations, generates motion field segmentations that match the scene content more closely with 15% to 25% fewer objects, and therefore reduces the side bit rate required for coding the motion field segmentation.
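The second stage described above (affine motion parameters computed from an optical flow field for each segmented region) can be illustrated with a minimal sketch. The abstract does not say which estimator is used; the ordinary least-squares fit below, the function name `fit_affine_motion`, and the array layout are assumptions made for illustration only.

```python
import numpy as np


def fit_affine_motion(flow, mask):
    """Least-squares affine fit to the flow vectors inside one region.

    Hypothetical helper, not the authors' estimator.
    flow: (H, W, 2) array holding the (u, v) flow vector at each pixel.
    mask: (H, W) boolean array, True for pixels belonging to the region.
    Returns a 2x3 matrix A such that [u, v]^T ~= A @ [x, y, 1]^T.
    """
    ys, xs = np.nonzero(mask)
    if xs.size < 3:
        raise ValueError("need at least 3 pixels to fit 6 affine parameters")
    # One row [x, y, 1] per region pixel.
    design = np.column_stack([xs, ys, np.ones_like(xs)]).astype(float)
    # Matching (u, v) targets, shape (N, 2).
    targets = flow[ys, xs, :]
    # Solve design @ params ~= targets for both flow components at once.
    params, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return params.T
```

A region whose pixels all lie on one line leaves the fit underdetermined; `lstsq` then returns the minimum-norm solution, so a real coder would likely fall back to a purely translational model for such regions.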
Authors:
Lavagetto, F.; Curinga, S.
DIST, Department of Communication, Computer and Systems Science, University of Genova, Via Opera Pia 11a, Genova I-16145, Italy
This paper describes a new approach to very low bit-rate interpersonal visual communication based on a suitable scene model, i.e. a flexible structure adapted to the specific characteristics of the speaker's face. The face model is dynamically adapted to time-varying facial expressions by means of a few parameters, estimated from the analysis of the real image sequence, which are used to apply knowledge-based deformation rules to a simplified muscle structure. Facial muscles are distributed in correspondence with the primary facial features and can be activated through the direct stimulation of each individual fiber or, indirectly, by interaction with adjacent stimulated fibers. The analysis algorithms performed at the transmitter to estimate the model parameters are based on feature-oriented operators aimed at segmenting the real incoming frames and at extracting the primary facial descriptors. The analysis/synthesis algorithms have been developed on a Silicon Graphics workstation and have been tested on various 'head-and-shoulder' sequences. The results obtained are very promising for applications both in videophone coding and in picture animation, where the facial expressions of a synthetic actor are reproduced according to the parameters extracted from a real speaking face.
A novel motion-based object-oriented codec for video transmission at very low bit-rates is proposed. The object motion is modeled by a quadratic transform with coefficients estimated via a nonlinear quasi-Newton method. The segmentation problem is posed as a constrained optimization problem which interacts with the motion estimation process in the course of region growing. A context-based shape coding method, which takes into account the image synthesis error as well as the geometric distortion, is also proposed. Quantitative and subjective performance results of the codec on various test sequences illustrate the superior performance of the method. (C) 2000 Elsevier Science B.V. All rights reserved.
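The abstract does not state the form of the quadratic transform. As an assumed illustration only, a commonly used 12-parameter quadratic motion model maps a pixel $(x, y)$ of a region $R$ to $(x', y')$, and its coefficients can be estimated by minimizing a squared prediction error between frames. The symbols $a_i$, $b_i$, $I_t$ and $E$ below are introduced here and do not come from the source.

```latex
\begin{aligned}
x' &= a_1 + a_2 x + a_3 y + a_4 x^2 + a_5 y^2 + a_6 x y, \\
y' &= b_1 + b_2 x + b_3 y + b_4 x^2 + b_5 y^2 + b_6 x y, \\
E(a, b) &= \sum_{(x, y) \in R} \bigl[ I_t(x, y) - I_{t-1}(x', y') \bigr]^2 .
\end{aligned}
```

Under this assumption, $E$ is nonlinear in the coefficients because they enter through the image intensities, which is the kind of objective an iterative quasi-Newton method is suited to.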
Envisioned advanced multimedia video services include arbitrarily shaped (AS) image segments as well as regular rectangular images. Image segments of the TV weather report produced by the chroma-key technique [1] and image segments produced by video analysis and image segmentation [2-4] are typical examples of AS image segments. This paper explores efficient intraframe transform coding techniques for general two-dimensional (2D) AS image segments, treating traditional rectangular images as a special case. In particular, we focus on the transform coding of the partially defined image blocks along the boundary of the AS image segments. We recognize two different approaches: the brute force transform coding approach and the shape-adaptive transform coding approach. The former fills the uncovered area with the optimal redundant data such that the resulting transform spectrum is compact. A simple but efficient mirror image extension technique is proposed. Once augmented into full image blocks, these boundary blocks can be processed by traditional block-based transform techniques such as the popular discrete cosine transform (DCT). In the second approach, we change either the transform basis or the coefficient calculation process adaptively based on the shape of the AS image segment. We propose an efficient shape-projected problem formulation to reduce the dimension of the problem. Existing coding algorithms, such as the orthogonal transform by Gilge [5] and the iterative coding by Kaup and Aach [6], can be interpreted intuitively. We also propose a new adaptive transform based on the same principle as that used in deriving the DCT from the optimal Karhunen-Loeve transform (KLT). We analyze the tradeoff between compression performance, computational complexity, and codec complexity for different coding schemes. Simulation results show that complicated algorithms (e.g., iterative, adaptive) can improve the quality by 5-10 dB at some computational or hardware
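The first approach above (fill the undefined part of a boundary block, then apply an ordinary block DCT) can be sketched as follows. The abstract does not specify how the mirror image extension is constructed, so the row-then-column reflection rule and the helper names `mirror_fill_1d`, `mirror_extend_block`, `dct_matrix` and `dct2` are assumptions for illustration, not the authors' technique.

```python
import numpy as np


def mirror_fill_1d(values, defined):
    """Fill undefined samples by reflecting about the nearest defined sample.

    If the reflected position is itself undefined or out of range, the
    nearest defined sample is repeated instead.
    """
    out = np.asarray(values, dtype=float).copy()
    defined = np.asarray(defined, dtype=bool)
    known = np.flatnonzero(defined)
    if known.size == 0:
        return out  # nothing to reflect from; leave the line unchanged
    for p in np.flatnonzero(~defined):
        d = known[np.argmin(np.abs(known - p))]  # nearest defined sample
        q = 2 * d - p                            # reflection of p about d
        inside = 0 <= q < out.size and defined[q]
        out[p] = out[q] if inside else out[d]
    return out


def mirror_extend_block(block, mask):
    """Pad a partially defined block: along rows first, then along columns."""
    block = np.asarray(block, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    filled = block.copy()
    row_has_data = mask.any(axis=1)
    for r in np.flatnonzero(row_has_data):
        filled[r] = mirror_fill_1d(block[r], mask[r])
    # Rows without any defined pixel are filled from the completed rows.
    for c in range(block.shape[1]):
        filled[:, c] = mirror_fill_1d(filled[:, c], row_has_data)
    return filled


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0] /= np.sqrt(2.0)
    return m


def dct2(block):
    """Separable 2D DCT-II of a full (already padded) block."""
    block = np.asarray(block, dtype=float)
    return dct_matrix(block.shape[0]) @ block @ dct_matrix(block.shape[1]).T
```

A boundary block would be coded as `dct2(mirror_extend_block(block, mask))`. The decoder discards the padded pixels using the transmitted shape, so the padding only needs to keep the spectrum compact; it carries no information of its own.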