Research in very low bit-rate coding has made significant advances in the past few years. Most recently, the introduction of the MPEG-4 proposal has motivated a wide variety of approaches aimed at achieving a new level of video compression. In this paper we review progress in very low bit-rate video (VLBV) coding, categorized into three main areas: (1) waveform coding, (2) 2D content-based coding, and (3) model-based coding. Where appropriate, we also describe proposals to the MPEG-4 committee in each of these areas.
This paper presents an overview of research activities in Japan in the field of very low bit-rate video coding. Related research based on the concept of "intelligent image coding" started in the mid-1980s. Although this concept originated from the consideration of a new type of image coding, it can also be applied to other areas such as human interfaces and psychology. Since the beginning of the 1990s, research on improving waveform coding has also been actively pursued to realize very low bit-rate video coding. The key techniques employed here are improved motion compensation and the adoption of region segmentation. In addition, we propose new concepts that have the potential to open up new aspects of image coding, e.g. interactive image coding, integrated 3-D visual communication, and the coding of multimedia information that takes into account the mutual relationships among various media.
We present a novel and practical way to integrate techniques from computer vision into low bit-rate coding systems for video teleconferencing applications. Our focus is to locate and track the faces and selected facial features of persons in typical head-and-shoulders video sequences, and to exploit the location information in a 'classical' video coding/decoding system. The motivation is to enable the system to encode various image areas selectively and to produce perceptually pleasing coded images where faces are sharper. We refer to this approach, a mix of classical waveform coding and model-based coding, as model-assisted coding. We propose two fully automatic algorithms which, respectively, detect a head outline and identify an 'eyes-nose-mouth' region, both from downsampled binary thresholded edge images. The algorithms operate accurately and robustly, even in cases of significant head rotation or partial occlusion by moving objects. We show how the information about face and facial feature location can be advantageously exploited by low bit-rate waveform-based video coders. In particular, we describe a method of object-selective quantizer control in a standard coding system based on the motion-compensated discrete cosine transform, namely CCITT Recommendation H.261. The approach is based on two novel algorithms, buffer rate modulation and buffer size modulation. By forcing the rate control algorithm to transfer a fraction of the total available bit-rate from the coding of the non-facial area to that of the facial area, the coder produces images with better-rendered facial features, i.e. coding artefacts in the facial area are less pronounced and eye contact is preserved. The improvement was found to be perceptually significant on video sequences coded at the ISDN rate of 64 kbps, with 48 kbps for the input (color) video signal in QCIF format.
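The bit-rate transfer described in this abstract can be illustrated with a small sketch. This is not the paper's buffer rate modulation or buffer size modulation algorithm, whose details the abstract does not give; it only shows the bookkeeping of moving a fraction of the total budget from the non-facial to the facial area. The function name `split_bit_budget`, the area-proportional baseline, and the example values for `face_area_fraction` and `transfer_fraction` are all assumptions made for illustration. Only the 48 kbps video budget comes from the abstract.

```python
def split_bit_budget(total_bits, face_area_fraction, transfer_fraction):
    """Return (face_bits, background_bits) for one bit budget.

    Hypothetical helper, not the paper's algorithm.
    Assumed baseline: bits are shared in proportion to image area.
    Then `transfer_fraction` of the TOTAL budget is moved from the
    non-facial area to the facial area, capped so the background
    allocation never goes negative.
    """
    if not 0.0 <= face_area_fraction <= 1.0:
        raise ValueError("face_area_fraction must be in [0, 1]")
    if not 0.0 <= transfer_fraction <= 1.0:
        raise ValueError("transfer_fraction must be in [0, 1]")

    face_bits = total_bits * face_area_fraction
    background_bits = total_bits - face_bits

    # Move a fraction of the total budget, but no more than the
    # background actually has.
    moved = min(total_bits * transfer_fraction, background_bits)
    return face_bits + moved, background_bits - moved


if __name__ == "__main__":
    # 48 kbps video budget (from the abstract); the face is assumed to
    # cover 25% of the frame and 10% of the budget is transferred.
    face, background = split_bit_budget(48_000, 0.25, 0.10)
    print(f"face: {face:.0f} bit/s, background: {background:.0f} bit/s")
```

In a real coder the two allocations would then drive the quantizer step size for each region; that mapping is coder-specific and is left out here.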
We present a novel and practical way to integrate techniques from computer vision into low bit-rate coding systems for video teleconferencing applications. Our focus is to locate and track the faces of persons in typical head-and-shoulders video sequences, and to exploit the face location information in a 'classical' video coding/decoding system. The motivation is to enable the system to selectively encode various image areas and to produce psychologically pleasing coded images where faces are sharper. We refer to this approach as model-assisted coding. We propose a fully automatic, low-complexity algorithm which robustly performs face detection and tracking. A priori assumptions regarding sequence content are minimal, and the algorithm operates accurately even in cases of partial occlusion by moving objects. Face location information is exploited by a low bit-rate 3D subband-based video coder which uses both a novel model-assisted pixel-based motion compensation scheme and model-assisted dynamic bit allocation with object-selective quantization. By transferring a small fraction of the total available bit-rate from the non-facial to the facial area, the coder produces images with better-rendered facial features. The improvement was found to be perceptually significant on video sequences coded at 96 kbps for an input luminance signal in CIF format. The technique is applicable to any video coding scheme that allows for fine-grain quantizer selection (e.g. MPEG, H.261), and can maintain full decoder compatibility.
Known coding techniques for transmitting moving images at very low bit rates are explained by the source models on which these coding techniques are based. It is shown that with motion-compensated hybrid coding, object-based analysis-synthesis coding, knowledge-based coding and semantic coding, there is a consistent development of source models. In consequence, these coding techniques can be combined in a layered coding system. From experimental results obtained for object-based analysis-synthesis coding, estimates for the coding efficiency of such a layered coding system are derived using head-and-shoulders video telephone test sequences. It is shown that an additional compression factor of about 3 can be expected with such a complex layered coding system, when compared to block-based hybrid coding.
We describe an approach for modelling a person's face for model-based coding. The goal is to estimate the 3D shape by combining contour analysis and shading analysis of the human face image in order to increase the quality of the estimated 3D shape. The motivation for combining contour and shading cues comes from the observation that the shading cue leads to severe errors near the occluding boundary, while the occluding contour cue provides incomplete surface information in regions away from contours. To this end, we use the deformable model as the common level of integration, such that the higher-quality measurement dominates the depth estimate. The feasibility of our approach is demonstrated using a real facial image.
ISBN: (print) 081941638X
This paper describes a novel intelligent low bit-rate coding technique for motion pictures, named Video Sequence Quantizer (VSQ). VSQ is a form of semantic coding. The concept is very simple. The encoder extracts motion information about target objects and transmits it as a few control parameters. The decoder holds several video sequences in its database and outputs them selectively according to the transmitted parameters. It can reproduce natural movements with a simple operation, compared with another semantic coding scheme called computer graphics model-based coding. We clarify the concept of VSQ and also apply it to a very low bit-rate TV phone system. A computer simulation of the TV phone was carried out using twelve video sequences. It can naturally reproduce the speaker's movements at 80 bit/s. VSQ is a very simple but basic concept for video coding. It will also be useful for mobile and multimedia communications.
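The select-from-a-database idea behind VSQ can be sketched as follows. The abstract does not specify what the control parameters contain, so this sketch assumes the simplest possible case: the only parameter sent is the index of a stored sequence. The names `bits_per_index`, `selections_per_second`, `encode_selection` and `VsqDecoder`, and the sequence labels, are hypothetical. The twelve sequences and the 80 bit/s rate are the figures given in the abstract.

```python
def bits_per_index(num_sequences):
    """Bits needed for a fixed-length index into the sequence database."""
    if num_sequences < 1:
        raise ValueError("database must hold at least one sequence")
    return max(1, (num_sequences - 1).bit_length())


def selections_per_second(bit_rate, num_sequences):
    """Whole index selections per second that fit in `bit_rate` bit/s.

    Assumes nothing but fixed-length indices is transmitted.
    """
    return bit_rate // bits_per_index(num_sequences)


def encode_selection(motion_label, labels):
    """Encoder side: map an extracted motion label to a database index."""
    return labels.index(motion_label)


class VsqDecoder:
    """Decoder side: holds stored sequences and plays back the chosen one."""

    def __init__(self, database):
        self.database = list(database)

    def decode(self, index):
        return self.database[index]


if __name__ == "__main__":
    # Twelve placeholder sequence labels standing in for stored video clips.
    labels = [f"sequence_{i}" for i in range(12)]
    decoder = VsqDecoder(labels)

    index = encode_selection("sequence_7", labels)
    print(decoder.decode(index))
    print(bits_per_index(12), "bits per selection")
    print(selections_per_second(80, 12), "selections per second at 80 bit/s")
```

Under these assumptions, twelve sequences need a 4-bit index, so an 80 bit/s channel could carry up to 20 selections per second. The actual parameter format used in the paper may differ.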
Recently, studies aiming at the next generation of visual communication services which support better human communication have been carried out intensively in Japan. The principal motive of these studies is to develop new services which are not restricted to a conventional communication framework based on the transmission of waveform signals. This paper focuses on three important key words in these studies: "intelligent," "real," and "distributed and collaborative," and describes recent research activities. The first key word, "intelligent," relates to intelligent image coding. As a particular example, model-based coding of moving facial images is discussed in detail. In this method, the shape change and motion of the human face are described by a small number of parameters. This feature leads to the development of new applications such as very low bit-rate transmission of moving facial images, analysis and synthesis of facial expressions, human interfaces, and so on. The second key word, "real," relates to communication with realistic sensations and virtual space teleconferencing. Among the various component technologies, real-time reproduction of 3-D human images and a cooperative work environment with virtual space are discussed in detail. The last key word, "distributed and collaborative," relates to collaborative work in a distributed work environment. The importance of visual media in collaborative work, the concept of CSCW, and the requirements for realizing a distributed collaborative environment are discussed. Then, four examples of CSCW systems are briefly outlined.