ISBN: (print) 0819417572
This parallel surface rendering algorithm, based on the dividing cubes algorithm and implemented on the MasPar MP-1 SIMD machine, addresses the problems of load balancing and image composition. We divide a 3D array of Nx x Ny x Nz volume data into Nx x Ny columns, each Nz deep. Each processor in the mesh receives a subvolume of such data columns. All processors synchronously traverse their subvolumes to determine the voxels that intersect the isosurface; these voxels are called isovoxels. Partial load balancing distributes the isovoxels contained in a row of processors evenly among the processors in that row, which reduces the network traffic and the complexity of the rendering phase. Each isovoxel is subdivided into point primitives using the dividing cubes algorithm. The rendering algorithm transforms the surface points and their normals and projects them onto the view plane.
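The two steps described above, isovoxel detection and row-wise balancing, can be sketched sequentially as follows. This is an illustrative sketch only, not the MasPar implementation: the function names `find_isovoxels` and `balance_row`, the nested-list volume layout, and the straddle test (minimum corner value <= iso <= maximum corner value) are all assumptions made for the example.

```python
from itertools import product

def find_isovoxels(volume, iso):
    """Return (x, y, z) of every cell whose 8 corner samples straddle iso.

    volume is a nested list indexed as volume[x][y][z] (assumed layout).
    """
    nx, ny, nz = len(volume), len(volume[0]), len(volume[0][0])
    hits = []
    for x, y, z in product(range(nx - 1), range(ny - 1), range(nz - 1)):
        corners = [volume[x + dx][y + dy][z + dz]
                   for dx, dy, dz in product((0, 1), repeat=3)]
        # Assumed intersection test: the isovalue lies within the corner range.
        if min(corners) <= iso <= max(corners):
            hits.append((x, y, z))
    return hits

def balance_row(per_processor):
    """Redistribute isovoxel lists within one processor row.

    After balancing, list lengths differ by at most one.
    """
    pool = [v for lst in per_processor for v in lst]
    base, extra = divmod(len(pool), len(per_processor))
    balanced, start = [], 0
    for i in range(len(per_processor)):
        size = base + (1 if i < extra else 0)
        balanced.append(pool[start:start + size])
        start += size
    return balanced
```

On the real machine the redistribution would be done with mesh communication within each row rather than by pooling in one place; the sketch only shows the resulting even split.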
Speech recognition methods that fuse visual and auditory information have recently been studied. This paper describes a study of the mouth shape images suitable for such fusion. Mouth shape features extracted from gray-level and binary images are adopted, and speech recognition is performed using a linear combination method. Based on the recognition results, we examine which mouth shape features are effective for fusing visual and auditory information. The effectiveness of using the two kinds of mouth shape features together is also confirmed.
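One common reading of a "linear combination method" is a weighted sum of per-word scores from the two modalities. The sketch below assumes that form; the abstract does not give the actual formula, and the function name `fuse_scores`, the weight `audio_weight`, and the score dictionaries are hypothetical.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight):
    """Pick the word with the highest weighted sum of audio and visual scores.

    Assumed fusion rule: score = w * audio + (1 - w) * visual.
    """
    fused = {
        word: audio_weight * audio_scores[word]
              + (1.0 - audio_weight) * visual_scores[word]
        for word in audio_scores
    }
    return max(fused, key=fused.get)
```

With `audio_weight` set to 1.0 the recognizer ignores the mouth shape features entirely, which gives a convenient audio-only baseline for comparison.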
We utilize speech information to improve the quality of audio-visual communications such as video telephony and videoconferencing. We show that the marriage of speech analysis and image processing can solve problems related to lip synchronization. We present a technique called speech-assisted frame-rate conversion, and apply it to coding of talking head video. Demonstration sequences are presented. Extensions and other applications are outlined.
A compression method is presented which combines the distinct advantage of being fixed length with the visual picture quality of transform-based coding methods. This method compresses the Karhunen-Loeve transform coefficients of 8×8 pixel blocks of the picture and attains a compression ratio of 15.37:1 for color images. Compared to absolute moment block truncation coding, the other known fixed-length compression method, our method improves the compression ratio by 68% and significantly improves the visual quality of the reconstructed picture.
This paper investigates the development of a new image quality criterion based on the psychovisual model of tuned channels, and more particularly on the phenomenon of masking. The masking model parameters have been evaluated by psychovisual tests, assuming a logarithmic relationship between the visibility threshold and the contrast value of the background for a given perceptual channel. As a consequence of masking, only part of the noise in a noisy image is actually visible and affects its visual quality. The idea of the proposed criterion is to evaluate the image quality after cancelling the masked noise, defined as the invisible noise. The criterion has been used to compare different coders and different post-processings of noisy images.
Motion estimation is an important part of most video coding schemes because it enables us to exploit the high degree of temporal redundancy present. Though block matching algorithms (BMA) yield coarse and piecewise-constant fields, they are very popular due to their simplicity and low bit overhead. In this paper, we propose to use a more advanced gradient-based technique to overcome the disadvantages of BMA. A dense motion field is estimated and compressed using a hierarchical finite element (HFE) representation, leading to an efficient, highly parallel, iterative, multiresolution optimization algorithm. The scheme also uses multiresolution measurements and a coarse-to-fine strategy to estimate large displacements. At comparable bit rates, the motion fields are much smoother and more natural than those produced by BMA. Coding gains of about 0.6 dB were obtained on Claire. More importantly, substantial visual improvements were obtained, mainly due to improved performance near the edges.
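For reference, the block matching baseline that the abstract compares against can be sketched as an exhaustive search. This is not the proposed gradient-based HFE method; the function name `block_match`, the sum of absolute differences (SAD) cost, and the list-of-rows frame layout are assumptions made for illustration.

```python
def block_match(cur, ref, bx, by, n, search):
    """Full-search block matching for one n x n block.

    cur, ref: frames as lists of rows. (bx, by): top-left corner of the
    block in cur. Returns (dx, dy, cost) for the displacement into ref
    with the lowest SAD within +/- search pixels.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sum(
                abs(cur[by + j][bx + i] - ref[by + j + dy][bx + i + dx])
                for j in range(n) for i in range(n)
            )
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Each block gets a single vector, which is why the resulting field is coarse and piecewise constant, the limitation the dense gradient-based estimate is meant to address.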
We present a modified version of an embedded wavelet coding scheme, first suggested by Shapiro (see IEEE Transactions on Signal Processing, vol. 41, no. 12, p. 3445-3462, 1993), that improves the performance of the original algorithm in terms of subjective visual distortion. We preserve the features of Shapiro's original embedded coder. It is possible to choose a fixed target bit rate, because the information needed to represent an image coded at some rate always contains the information needed for the same image coded at lower rates. The decoder can therefore stop decoding the bit stream at any point, yielding the image that would have been coded at the lower rate corresponding to the truncated bit stream. We also introduce some perceptual improvements by adopting different (more regular) filters in place of the original QMF pyramid filters proposed by Simoncelli, Hingorani et al. (1987) and used by Shapiro. These filters are synthesized using a "wavelet approach" instead of a "subband approach", which gives better control over their regularity properties together with better perceptual performance.
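The embedded property described above (any prefix of the bit stream decodes to a coarser version of the same data) can be illustrated with plain bit-plane coding. This is a deliberately simplified sketch and not Shapiro's coder: it has no zerotrees, no entropy coding, and handles only non-negative integers. The names `encode_bitplanes` and `decode_bitplanes` are hypothetical.

```python
def encode_bitplanes(coeffs, planes):
    """Emit bits most significant plane first, one bit per coefficient."""
    bits = []
    for plane in range(planes - 1, -1, -1):
        for c in coeffs:
            bits.append((c >> plane) & 1)
    return bits

def decode_bitplanes(bits, count, planes):
    """Rebuild coefficients from any prefix of the bit stream.

    Bits that were never received are treated as zero, so a shorter
    prefix gives a coarser approximation.
    """
    values = [0] * count
    for idx, bit in enumerate(bits):
        plane = planes - 1 - idx // count
        values[idx % count] |= bit << plane
    return values
```

Decoding all 16 bits of a four-coefficient, four-plane stream recovers the input exactly, while decoding only the first 8 bits recovers each value rounded down to a multiple of 4.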
We propose a rate conversion method based on re-quantization, in which MPEG coded video at a high bit rate is converted into an MPEG bitstream at a lower bit rate without decoding all the way to the reconstructed picture. The quantization step required for re-quantization is determined from the local and global quantization steps, which are closely related to the activities in the pixel domain. The simulation results show rate-distortion curves very similar to those of transcoding, and the difference in SNR is relatively small: about 0.4 dB at 1 Mbit/s in MPEG1 using a master bitstream at 4 Mbit/s, and 1 dB at 3 Mbit/s in MPEG2 using a master at 9 Mbit/s. Since the proposed method is very simple and requires much less hardware implementation cost than transcoding, it has a significant advantage as a rate conversion tool.
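The core re-quantization operation can be sketched as follows: quantized levels are scaled back to coefficient values with the incoming step and re-rounded with a coarser outgoing step, with no inverse transform. The sketch assumes a plain uniform quantizer with integer steps; real MPEG quantization also involves weighting matrices and dead zones, and the rule the authors use to choose the new step is not reproduced here. The name `requantize` is hypothetical.

```python
def requantize(levels, q_in, q_out):
    """Map levels quantized with step q_in to levels for step q_out.

    Assumes a uniform quantizer with integer steps; magnitudes are
    rounded half up and the sign is restored afterwards.
    """
    out = []
    for level in levels:
        value = level * q_in                       # reconstructed coefficient
        magnitude = (abs(value) + q_out // 2) // q_out
        out.append(magnitude if value >= 0 else -magnitude)
    return out
```

Small coefficients collapse to zero under the coarser step, which is where most of the bit rate reduction comes from.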
In this paper, we present an analog VLSI retina for visual feature extraction based on spatio-frequency analysis, dedicated to real-time stereo vision. In this retina, local extreme points in the DoG-filtered image are extracted as pertinent visual features. A 128-pixel line-based prototype chip is presented with experimental results. A processing speed of 100k lines/s has been obtained (exposure time excluded). The chip will be used in an integrated analog stereo vision system for real-time obstacle detection.
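A software analogue of the feature extraction step, for one image line, might look like the sketch below. It is not a model of the analog circuit: the Gaussian widths, the kernel truncation at three standard deviations, the edge clamping, and the small threshold that suppresses numerical noise are all assumptions, as are the function names.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian truncated at 3 sigma (assumed truncation)."""
    radius = int(3 * sigma)
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(line, sigma):
    """Convolve one line with a Gaussian, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(line) - 1
    out = []
    for i in range(len(line)):
        acc = 0.0
        for k in range(-radius, radius + 1):
            j = min(max(i + k, 0), last)
            acc += kernel[k + radius] * line[j]
        out.append(acc)
    return out

def dog_extrema(line, sigma_fine=1.0, sigma_coarse=2.0, threshold=1e-9):
    """Return (index, 'max' or 'min') for local extrema of the DoG response."""
    dog = [a - b for a, b in zip(blur(line, sigma_fine), blur(line, sigma_coarse))]
    features = []
    for i in range(1, len(dog) - 1):
        if abs(dog[i]) <= threshold:
            continue  # ignore flat regions and rounding noise
        if dog[i] > dog[i - 1] and dog[i] > dog[i + 1]:
            features.append((i, "max"))
        elif dog[i] < dog[i - 1] and dog[i] < dog[i + 1]:
            features.append((i, "min"))
    return features
```

A single bright pixel on a dark 128-pixel line produces a maximum at that pixel, while a uniform line produces no features.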
This paper presents a VLSI implementation of the discrete wavelet transform (DWT). The architecture is systolic in nature and performs both high-pass and low-pass coefficient calculations with only one set of multipliers, in contrast to the approaches presented in the literature. The architecture is simple, modular, and cascadable, and has been implemented in VLSI. Experimental results show that real-time coefficient calculation on a 512×512 monochrome video input can be achieved with 1.2 μm technology.
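To make the low-pass and high-pass coefficient calculations concrete, the sketch below computes one DWT level in software. The Haar filter pair is an assumption chosen for brevity, since the abstract does not name the filters, and the code does not model the systolic data flow or the shared multipliers. The names `haar_dwt` and `haar_idwt` are hypothetical.

```python
import math

def haar_dwt(signal):
    """One analysis level: return (low-pass, high-pass) coefficient lists.

    Assumes an even-length input and the orthonormal Haar filter pair.
    """
    scale = 1.0 / math.sqrt(2.0)
    approx, detail = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        approx.append((a + b) * scale)   # low-pass output
        detail.append((a - b) * scale)   # high-pass output
    return approx, detail

def haar_idwt(approx, detail):
    """Invert haar_dwt, reconstructing the original samples."""
    scale = 1.0 / math.sqrt(2.0)
    signal = []
    for lo, hi in zip(approx, detail):
        signal.append((lo + hi) * scale)
        signal.append((lo - hi) * scale)
    return signal
```

Applying `haar_dwt` again to the low-pass output gives the next level of the decomposition, which mirrors the cascadable structure mentioned in the abstract.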