We use speech information to improve the quality of audio-visual communications such as video telephony and videoconferencing. We show that the marriage of speech analysis and image processing can solve problems related to lip synchronization. We present a technique called speech-assisted frame-rate conversion and apply it to the coding of talking-head video. Demonstration sequences are presented. Extensions and other applications are outlined.
A compression method is presented which combines the distinct advantage of being fixed-length with the visual picture quality of transform-based coding methods. This method compresses the Karhunen-Loeve transform coefficients of 8×8 pixel blocks of the picture and attains a compression ratio of 15.37:1 for color images. Compared with absolute moment block truncation coding, the other known fixed-length compression method, our method improves the compression ratio by 68% and significantly improves the visual quality of the reconstructed picture.
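The figures quoted above can be turned into rough per-pixel numbers. The sketch below is only an arithmetic illustration: it assumes a 24-bit RGB source (the abstract does not state the source bit depth) and reads "improves the compression ratio by 68%" as a ratio 1.68 times that of the baseline, which is one possible interpretation.

```python
# Figures quoted in the abstract.
COMPRESSION_RATIO = 15.37   # for color images
GAIN_OVER_AMBTC = 0.68      # "improves the compression ratio by 68%"

# Assumption (not stated in the abstract): 24-bit RGB source pixels.
SOURCE_BITS_PER_PIXEL = 24.0


def bits_per_pixel(ratio, source_bits=SOURCE_BITS_PER_PIXEL):
    """Average coded bits per pixel at a given compression ratio."""
    return source_bits / ratio


def implied_baseline_ratio(ratio, gain):
    """Baseline ratio, assuming `ratio` equals (1 + gain) times the baseline."""
    return ratio / (1.0 + gain)


coded_bpp = bits_per_pixel(COMPRESSION_RATIO)
ambtc_ratio = implied_baseline_ratio(COMPRESSION_RATIO, GAIN_OVER_AMBTC)
print(f"{coded_bpp:.2f} bits/pixel, implied AMBTC ratio {ambtc_ratio:.2f}:1")
```

Under these assumptions the coder spends about 1.56 bits per pixel, and the implied baseline ratio is about 9.15:1.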
This paper investigates the development of a new image quality criterion based on the psychovisual model of tuned channels and, more particularly, on the phenomenon of masking. The masking model parameters have been evaluated by psychovisual tests, assuming a logarithmic relationship between the visibility threshold and the contrast value of the background for a given perceptual channel. Because of masking, only part of the noise in a noisy image is actually visible and affects the visual quality of the image. The idea of the proposed criterion is to evaluate the image quality after cancelling the masked noise, defined as the invisible noise. The criterion has been used to compare different coders and different post-processings of noisy images.
Motion estimation is an important part of most video coding schemes because it enables us to exploit the high degree of temporal redundancy present. Though block matching algorithms (BMA) yield coarse and piecewise-constant fields, they are very popular due to their simplicity and low bit overhead. In this paper, we propose to use a more advanced gradient-based technique to overcome the disadvantages of BMA. A dense motion field is estimated and compressed using a hierarchical finite element (HFE) representation, leading to an efficient, highly parallel, iterative, multiresolution optimization algorithm. The scheme also uses multiresolution measurements and a coarse-to-fine strategy to estimate large displacements. At comparable bit rates, the motion fields are much smoother and more natural than those produced by BMA. Coding gains of about 0.6 dB were obtained on Claire. More importantly, substantial visual improvements were obtained, mainly due to improved performance near the edges.
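For context, the block matching baseline that the abstract compares against can be sketched as follows. This is a generic exhaustive-search matcher using the sum of absolute differences, not the gradient-based HFE method proposed in the paper; the block size, search range, and function names are illustrative assumptions.

```python
def block_sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between a current block and a displaced reference block."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_match(cur, ref, n=4, search=2):
    """Return one (dy, dx) vector per n-by-n block: a piecewise-constant motion field."""
    h, w = len(cur), len(cur[0])
    field = {}
    for by in range(0, h - n + 1, n):
        for bx in range(0, w - n + 1, n):
            best, best_cost = (0, 0), None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates whose reference block leaves the frame.
                    if by + dy < 0 or by + dy + n > h:
                        continue
                    if bx + dx < 0 or bx + dx + n > w:
                        continue
                    cost = block_sad(cur, ref, by, bx, dy, dx, n)
                    if best_cost is None or cost < best_cost:
                        best, best_cost = (dy, dx), cost
            field[(by, bx)] = best
    return field


# Synthetic frames: the current frame is the reference shifted by (1, 1).
ref = [[3 * y * y + 5 * x * x + x * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 1, 7)] for x in range(8)] for y in range(8)]
field = block_match(cur, ref)
```

Every pixel in a block shares one vector, which is why such fields are coarse and piecewise-constant, while the overhead stays at one vector per block.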
We present a modified version of an embedded wavelet coding scheme, first suggested by Shapiro (see IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3445-3462, 1993), that improves the performance of the original algorithm in terms of subjective visual distortion. We preserve the features of Shapiro's original embedded coder. It is possible to choose a fixed target bit rate, because the information needed to represent an image coded at some rate always contains the information needed for the same image coded at lower rates. Therefore, the decoder can stop decoding the bit stream at any point, yielding the image coded at the lower rate that corresponds to the truncated bit stream. We also introduce some perceptual improvements by adopting different (more regular) filters than the original QMF pyramid filters proposed by Simoncelli, Hingorani et al. (1987) and used by Shapiro. These filters are synthesized using a "wavelet approach" instead of a "subband approach", which leads to better control over their regularity properties, together with better perceptual performance.
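The embedded property described above, where any prefix of the bit stream decodes to a coarser version of the same data, can be illustrated with a plain bit-plane coder. This is a deliberately simplified sketch: it is not Shapiro's zerotree algorithm, it handles only non-negative integer coefficients, and the function names are hypothetical.

```python
def encode_bitplanes(coeffs, num_planes):
    """Emit coefficient bits plane by plane, most significant plane first."""
    bits = []
    for plane in range(num_planes - 1, -1, -1):
        for c in coeffs:
            bits.append((c >> plane) & 1)
    return bits


def decode_prefix(bits, count, num_planes):
    """Rebuild `count` coefficients from any prefix of the embedded bit stream."""
    values = [0] * count
    for i, bit in enumerate(bits):
        plane = num_planes - 1 - i // count
        values[i % count] |= bit << plane
    return values


coeffs = [13, 2, 7, 0]
stream = encode_bitplanes(coeffs, num_planes=4)

full = decode_prefix(stream, count=4, num_planes=4)        # all 16 bits
coarse = decode_prefix(stream[:8], count=4, num_planes=4)  # top two planes only
```

Here `full` recovers the coefficients exactly, while `coarse` keeps only the two most significant bit planes, mimicking a decoder that stops early at a lower rate.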
We propose a rate conversion method based on re-quantization, in which MPEG-coded video at a high bit rate is converted into an MPEG bitstream at a lower bit rate without decoding to obtain the reconstructed picture. The quantization step required for re-quantization is determined by the local and global quantization steps, which are closely related to the activities in the pixel domain. The simulation results show that the rate-distortion curves are very similar to those of transcoding, and the difference in SNR is relatively small: about 0.4 dB at 1 Mbit/s in MPEG-1 using a master bitstream at 4 Mbit/s, and 1 dB at 3 Mbit/s in MPEG-2 using a master at 9 Mbit/s. Since the proposed method is very simple and requires much less hardware implementation cost than transcoding, it has a significant advantage as a rate conversion tool.
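The core re-quantization step can be sketched as below: quantized levels are scaled back to coefficient values and re-quantized with a coarser step, without any inverse transform. The sketch assumes a plain uniform quantizer with integer steps and ignores MPEG quantization matrices and dead zones; it also does not reproduce the paper's rule for choosing the new step from the local and global quantization steps.

```python
def requantize(levels, q_in, q_out):
    """Map levels quantized with step q_in to levels for a coarser step q_out."""
    out = []
    for level in levels:
        coeff = level * q_in                          # dequantized coefficient
        magnitude = (abs(coeff) + q_out // 2) // q_out  # round to nearest level
        out.append(magnitude if coeff >= 0 else -magnitude)
    return out


# Hypothetical levels of one block, originally quantized with step 4.
master_levels = [10, -3, 1, 0]
coarse_levels = requantize(master_levels, q_in=4, q_out=10)
```

With these example values, small coefficients collapse to zero under the coarser step, which is where the bit-rate reduction comes from.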
In this paper, we present an analog VLSI retina for visual feature extraction based on spatio-frequency analysis, dedicated to real-time stereo vision. In this retina, local extreme points in the DoG-filtered image are extracted as pertinent visual features. A 128-pixel line-based prototype chip is presented with experimental results. A processing speed of 100k lines/s has been obtained (exposure time excluded). The chip will be used in an integrated analog stereo vision system for real-time obstacle detection.
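A software analogue of the feature extraction described above, on a single image line, might look like the following. It is only a sketch of difference-of-Gaussians (DoG) filtering followed by local extremum detection; the sigma values, kernel radius, and edge handling are assumptions, and nothing here models the analog circuit itself.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(line, kernel):
    """Convolve a line with a symmetric kernel, replicating the edge samples."""
    radius = len(kernel) // 2
    last = len(line) - 1
    out = []
    for i in range(len(line)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * line[j]
        out.append(acc)
    return out


def dog_extrema(line, sigma_narrow=1.0, sigma_wide=2.0, radius=6):
    """Return (index, kind) for strict local maxima and minima of the DoG response."""
    narrow = smooth(line, gaussian_kernel(sigma_narrow, radius))
    wide = smooth(line, gaussian_kernel(sigma_wide, radius))
    dog = [a - b for a, b in zip(narrow, wide)]
    extrema = []
    for i in range(1, len(dog) - 1):
        if dog[i] > dog[i - 1] and dog[i] > dog[i + 1]:
            extrema.append((i, "max"))
        elif dog[i] < dog[i - 1] and dog[i] < dog[i + 1]:
            extrema.append((i, "min"))
    return extrema


# A 32-pixel line with a single bright pixel at index 16.
line = [0.0] * 32
line[16] = 1.0
features = dog_extrema(line)
```

For this test line, the bright pixel produces a DoG maximum at its own position, flanked by minima from the negative lobes of the filter.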
This paper presents a VLSI implementation of discrete wavelet transform (DWT). The architecture is systolic in nature and performs both high-pass and low-pass coefficient calculations with only one set of multipliers, in contrast to the approaches presented in the literature. The architecture is simple, modular, and cascadable, and has been implemented in VLSI. Experimental results show that real-time coefficient calculation on a 512×512 monochrome video input can be achieved with 1.2 μm technology.
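As a loose software analogy to sharing one set of multipliers, the sketch below computes both the low-pass and high-pass outputs of a one-level DWT with a single multiply-accumulate routine that is called alternately with each tap set. The Haar taps and all names are assumptions chosen for brevity; the abstract does not specify the filters, and this does not model the systolic architecture.

```python
import math

S = 1.0 / math.sqrt(2.0)
LOW_TAPS = (S, S)     # Haar low-pass taps (assumed for illustration)
HIGH_TAPS = (S, -S)   # Haar high-pass taps (assumed for illustration)


def mac(samples, taps):
    """The single shared multiply-accumulate routine."""
    return sum(x * t for x, t in zip(samples, taps))


def dwt_level(signal):
    """One decomposition level: alternate the shared MAC between the two tap sets."""
    low, high = [], []
    for i in range(0, len(signal) - 1, 2):
        pair = signal[i:i + 2]
        low.append(mac(pair, LOW_TAPS))
        high.append(mac(pair, HIGH_TAPS))
    return low, high


low, high = dwt_level([4.0, 2.0, 6.0, 6.0])
```

Because each level halves the number of low-pass samples, the same routine can be applied again to `low` to cascade further levels.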
ISBN: 081941543X (print)
The task of image coding is to improve the efficiency of visual communication channels. This entails minimizing the amount of data required to transmit the information about the radiance field. We assess this task in the context of visual communication channel design, including image gathering, coding, and Wiener restoration, which results in channel designs with significantly improved performance. Conventional assessments are limited to the digital transmission channel, beginning at the output of the image-gathering device and ending at the input to the image-display device. Our end-to-end assessment additionally incorporates these two devices. This assessment combines Shannon's communication theory with Wiener's restoration filter and with the critical design factors of the image-gathering and display devices. This provides the metrics needed to quantify and optimize the end-to-end performance of the visual communication channel. The results are described.
ISBN: 081941638X (print)
The guiding principle of this study is to find an optimum way to simplify the contours produced by a second-generation coding scheme based on morphological segmentation. For this purpose, existing methods for contour simplification are evaluated first. Motivated by properties of human vision, a new nonlinear filter based on a majority operation is then designed to simplify the contours, in order to obtain an optimum compromise between the cost of contour coding and visual quality. Applications to region-based still image coding and video coding are demonstrated. Experimental results show an average reduction of 20% in the bits needed for contour coding while keeping good visual quality.
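A majority operation on a segmentation label map can be sketched as follows. The abstract does not give the window size or the tie-breaking rule, so the 3×3 window and the choice to keep the current label on ties are assumptions, and the function name is hypothetical.

```python
def majority_filter(labels):
    """Replace each label by the majority label of its 3x3 neighbourhood."""
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for y in range(h):
        for x in range(w):
            counts = {}
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    label = labels[ny][nx]
                    counts[label] = counts.get(label, 0) + 1
            best = max(counts.values())
            # Keep the current label unless another label is strictly more frequent.
            if counts[labels[y][x]] < best:
                out[y][x] = max(counts, key=counts.get)
    return out


# An isolated one-pixel region is absorbed by its surroundings.
speckle = [[0] * 5 for _ in range(5)]
speckle[2][2] = 1
cleaned = majority_filter(speckle)

# A straight vertical boundary between two regions is left unchanged.
edge = [[0, 0, 1, 1] for _ in range(4)]
kept = majority_filter(edge)
```

Small protrusions and isolated regions vanish while straight boundaries survive, so the contours that remain are shorter and smoother to code.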