ISBN: (print) 9781424404810
We introduce an efficient algorithm for real-time compression of temporally consistent dynamic 3D meshes. The algorithm uses mesh connectivity to determine the order in which vertex locations are compressed within a frame. Compression is performed frame by frame, using only the last decoded frame and the partly decoded current frame for prediction. Following the predictive coding paradigm, local temporal and local spatial dependencies between vertex locations are exploited. Within this framework we present a novel angle-preserving predictor and evaluate its performance against other state-of-the-art predictors. The proposed algorithm improves on the current state of the art for compression of temporally consistent dynamic 3D meshes by up to 25%.
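The closed-loop structure described in this abstract (prediction from the last decoded frame plus the already decoded part of the current frame) can be illustrated with a minimal sketch. The predictor below simply averages the motion of already decoded neighbours; it is an assumed stand-in, not the angle-preserving predictor of the paper, and the names `predict`, `encode_frame`, `decode_frame` and the uniform quantizer step are hypothetical.

```python
def predict(prev_dec, dec, nbrs, v):
    """Predict vertex v from the last decoded frame and decoded neighbours.

    Assumed predictor: previous position plus mean neighbour motion.
    """
    done = [n for n in nbrs if n in dec]
    if not done:
        return prev_dec[v]  # purely temporal fallback
    motion = [sum(dec[n][i] - prev_dec[n][i] for n in done) / len(done)
              for i in range(3)]
    return tuple(prev_dec[v][i] + motion[i] for i in range(3))


def encode_frame(prev_dec, cur, neighbors, order, step):
    """Return quantized residuals and the frame as the decoder will see it."""
    dec, residuals = {}, []
    for v in order:  # traversal order would come from mesh connectivity
        pred = predict(prev_dec, dec, neighbors[v], v)
        q = tuple(round((c - p) / step) for c, p in zip(cur[v], pred))
        residuals.append(q)
        # Reconstruct exactly as the decoder does, so both stay in sync.
        dec[v] = tuple(p + qi * step for p, qi in zip(pred, q))
    return residuals, [dec[v] for v in range(len(cur))]


def decode_frame(prev_dec, residuals, neighbors, order, step):
    """Mirror of encode_frame: rebuild vertex positions from residuals."""
    dec = {}
    for v, q in zip(order, residuals):
        pred = predict(prev_dec, dec, neighbors[v], v)
        dec[v] = tuple(p + qi * step for p, qi in zip(pred, q))
    return [dec[v] for v in range(len(prev_dec))]
```

Because the encoder predicts from reconstructed rather than original positions, quantization error does not accumulate across vertices or frames.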
ISBN: (print) 9781424404810
In this paper, a method for lossless and near-lossless compression of large digital mammograms is proposed. The method is based on a predictive coder that uses integer-to-integer operations and generates a two-layer embedded bit stream (i.e., a near-lossless layer and a refinement layer). The maximum tolerated pixel distortion is guaranteed in the near-lossless component. Comparisons with other proposed approaches are based on a database of high-resolution 12 bits/pixel digital mammograms. The simulation results indicate that our lossless compression method offers average bit rates 39.5%, 24%, 7%, and 2.5% better than PNG, JPEG 2000, JPEG-lossless, and LOCO, respectively. In addition, for the same images, the proposed method offers near-lossless compression at relatively low computational cost, providing an average bit rate of 2.57 bits/pixel and a PSNR of 47.77 dB.
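A guaranteed maximum pixel distortion in a predictive coder is commonly obtained by quantizing the prediction residual with step 2δ+1 inside the prediction loop, as in JPEG-LS. The sketch below shows that idea together with a refinement layer that restores losslessness. It is an illustration under assumptions, not the coder from the paper: the previous-pixel predictor, the zero initial prediction, and the function names are hypothetical, and entropy coding is omitted.

```python
def encode_two_layer(pixels, delta):
    """Split a pixel row into a near-lossless layer and a refinement layer.

    Returns (indices, refinements); every refinement lies in [-delta, delta].
    """
    step = 2 * delta + 1
    indices, refinements = [], []
    prev = 0  # assumed initial prediction
    for x in pixels:
        err = x - prev
        # Integer-only uniform quantization, symmetric about zero.
        if err >= 0:
            q = (err + delta) // step
        else:
            q = -((delta - err) // step)
        recon = prev + q * step
        indices.append(q)
        refinements.append(x - recon)
        prev = recon  # predict from the reconstruction, as the decoder will
    return indices, refinements


def decode_two_layer(indices, delta, refinements=None):
    """Decode the near-lossless layer; add refinements for exact recovery."""
    step = 2 * delta + 1
    out, prev = [], 0
    for i, q in enumerate(indices):
        recon = prev + q * step
        prev = recon
        out.append(recon if refinements is None else recon + refinements[i])
    return out
```

Decoding only `indices` gives a reconstruction within δ of every pixel; adding the refinement layer recovers the original values exactly.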
ISBN: (print) 9781424404964
In this work, we propose a new technique to model the fixed codebook in low-rate code-excited linear prediction coders. The multi-track codebook presented in this paper reduces the coder bit rate while maintaining high speech quality. Unlike the algebraic codebook adopted in the 8 kb/s G.729 standard, the multi-track representation uses a pulse density of five pulses per 10 ms speech frame and permits various pulse magnitudes. Four tracks of disjoint pulse-position sets are constructed and evaluated for each analysis frame. However, only the pulses of the optimal track are quantized and transmitted to the receiver. While the search for the best individual pulse positions is limited to one segment interval of 2 ms, the pulse magnitudes are optimized over the whole analysis frame. The excitation signal in the multi-track codebook is encoded at 2.2 kb/s. Informal listening tests and objective measures show that, at the same bit allocation, the multi-track codebook yields significantly higher objective and subjective coded speech quality than the algebraic codebook.
ISBN: (print) 9781424404810
Recent developments in the compression of dynamic meshes or mesh sequences have shown that the statistical dependencies within a mesh sequence can be exploited well by predictive coding approaches. Coders introduced so far use experimentally determined or heuristic thresholds to tune the algorithms. In video coding, rate-distortion (RD) optimization is often used to avoid fixed thresholds and to select a coding mode. We apply these ideas and present an RD-optimized mesh coder. It includes different prediction modes as well as an RD cost computation that controls the mode selection across all possible spatial partitions of a mesh, finding the clustering structure together with the associated prediction modes. The structure of the RD-optimized D3DMC coder is presented, followed by comparative results for mesh sequences at different resolutions.
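RD-optimized mode selection is usually expressed as minimizing a Lagrangian cost J = D + λR over the candidate modes, and a partition is kept split when the summed cost of its parts is lower than the cost of coding it whole. The following sketch shows that decision on a toy cluster tree. The tree layout, the mode names in the usage example, and the omission of any signalling cost for the split decision are assumptions made for illustration; this is not the D3DMC implementation.

```python
def best_mode(modes, lam):
    """Pick the mode with the lowest Lagrangian cost J = D + lam * R.

    modes maps a mode name to (distortion, rate_in_bits).
    """
    return min((d + lam * r, name) for name, (d, r) in modes.items())


def optimize(node, lam):
    """Return (cost, decision) for a cluster tree node.

    node is {"modes": {...}, "children": [...]}; "children" is optional.
    decision is a mode name (code the cluster whole) or a list of
    child decisions (split the cluster).
    """
    cost, mode = best_mode(node["modes"], lam)
    children = node.get("children", [])
    if not children:
        return cost, mode
    results = [optimize(child, lam) for child in children]
    split_cost = sum(c for c, _ in results)
    if split_cost < cost:
        return split_cost, [decision for _, decision in results]
    return cost, mode
```

For example, with a parent offering `{"static": (10.0, 1), "delta": (2.0, 8)}` and two children each offering `{"delta": (0.5, 3)}`, a small λ favours splitting, while a large λ favours the cheap whole-cluster mode.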
Pattern recognition and audio processing are important aspects of the control and behavior of mobile robots. A mobile robot's actions depend on the recognition of visual and audio stimuli in order to exhibit intelligent behavior. This work presents two recognition systems, developed using morphological operations and linear predictive coding (LPC) with backpropagation neural networks (BNN), to process visual and audio data respectively. The objective is to design a person-tracking system for a mobile robot with text-dependent pitch recognition and a visual pattern recognition mechanism. The BNN wakes the robot from its idle state, while the visual stimulus is used to track the person who gave the command.
To reduce computational complexity, an approximated correlation matrix of the vocal impulse response is proposed for algebraic code-excited linear prediction (ACELP) coders. By exploiting statistical characteristics, only a small portion of the correlation coefficients needs to be calculated before the ACELP search procedure. When this is further combined with a pulse position prediction algorithm, both the arithmetic complexity of pre-computing the autocorrelation matrix and the number of pulse position combinations are reduced, with imperceptible degradation in speech quality. The proposed scheme can be applied to all ACELP coders, such as ITU G.729 and G.723.1.
The presence of background noise in a helicopter's sound signals can cause unacceptable degradation in the performance of the detection method. A recent detection method uses artificial neural networks (ANNs) in combination with parametric spectral representation techniques to detect these targets. Feature encoding techniques based on linear prediction coefficients (LPC) have been applied to obtain a spectral estimate of the acoustic signals. This paper investigates signal enhancement as a preprocessing step prior to feature extraction, using different wavelet transform functions.
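Linear prediction coefficients of the kind used for such spectral features are typically obtained with the autocorrelation method and the Levinson-Durbin recursion. The sketch below is a generic textbook version, not the feature extractor of any paper listed here; it omits windowing and pre-emphasis, and it assumes the convention that sample x[n] is predicted as the sum of a[k]·x[n−k] for k = 1..p.

```python
def autocorr(x, p):
    """Autocorrelation lags r[0..p] of a frame x (no windowing applied)."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(p + 1)]


def levinson_durbin(r, p):
    """Solve the normal equations for order-p LPC coefficients.

    Returns (a, err): a[k-1] multiplies x[n-k]; err is the final
    prediction error energy. Assumes r[0] > 0 (a non-silent frame).
    """
    a = [0.0] * (p + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, p + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for order i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

A typical call is `levinson_durbin(autocorr(frame, 10), 10)` for a tenth-order model; the order 10 is only an example value.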
In speech recognition, the LPC cepstrum (based on LPC) and MFCC (based on a Mel-frequency filter bank) are widely used for feature extraction, which largely determines recognition performance. However, these are not regarded as the best possible features. In this paper, we apply complex speech analysis of an analytic speech signal to HMM speech recognition. Complex speech analysis can estimate the speech spectrum more accurately at low frequencies; as a result, it is expected to perform well as a feature extractor in speech recognition. MMSE-based time-varying complex AR speech analysis is adopted, and the estimated complex parameters are converted to LPCCs and MFCCs as feature vectors for HTK (the HMM Toolkit) in order to perform HMM speech recognition. In continuous speech recognition experiments with the converted LPCCs and MFCCs, the complex speech analysis method did not perform better than the real-valued one.
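For the conventional real-valued case, LPC coefficients are converted to LPC cepstral coefficients (LPCCs) with a standard recursion. The sketch below assumes the all-pole model H(z) = 1 / (1 − Σ a[k] z^−k) and ignores the gain term c[0]; it does not reproduce the complex-valued conversion used in the paper, and the function name is hypothetical.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients a[0..p-1] (for lags 1..p) to c[1..n_ceps].

    Uses c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k},
    with a_n taken as 0 for n > p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c
```

For a single-pole model with coefficient a, the recursion gives c_n = a^n / n, which matches the series expansion of −ln(1 − a z^−1).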
To deliver real-time, high-quality voice services in packet-based voice systems (e.g., voice over Internet protocol, VoIP), system designers must tackle inherent quality problems related to possible packet loss. To combat the inevitable deterioration in speech quality caused by the loss of transmitted speech packets, techniques that estimate the lost information needed by the speech recovery process are of considerable interest. Furthermore, in VoIP systems employing speech coders based on linear predictive coding (LPC), a significant percentage of the coded speech information represents the values of the LPC coefficients, and thus a new approach for estimating missing LPC filter coefficients is presented in this paper. The approach employs a new formulation of the LSP recovery system architecture in which evolving fuzzy rule-based models, in particular so-called evolving Takagi-Sugeno models, are deployed to generate the required estimates of missing LSPs. The proposed technique for estimating missing parameters is generic, and initial experimental results demonstrate its considerable potential for improving the quality of LPC-based decoded speech in VoIP applications.