A new linear prediction analysis method for multichannel signals was devised with the goal of enhancing the compression performance of the MPEG-4 Audio Lossless Coding (ALS) compliant encoder and decoder. The multichannel coding tool for this standard carries out an adaptively weighted subtraction of the residual signals of the coding channel from those of the reference channel, both of which are produced by independent linear prediction. Our linear prediction method tries to directly minimize the amplitude of the predicted residual signal after subtraction of the signals of the coding channel, and the method has been implemented in the MPEG-4 ALS codec software. The results of a comprehensive evaluation show that this method reduces the size of a compressed file. The maximum improvement in the compression ratio is 14.6%, which is achieved at the cost of a small increase in computational complexity at the encoder and without any increase in decoding time. This is a practical method because the compressed bitstream remains compliant with the MPEG-4 ALS standard.
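The weighted inter-channel subtraction described in this abstract can be illustrated with a minimal sketch. It assumes a single weight per block, chosen by least squares, and assumes that the scaled reference-channel residual is subtracted from the coding-channel residual; the function names are hypothetical, and neither the standard's actual multichannel tool nor the paper's joint prediction analysis is reproduced here.

```python
def optimal_weight(e_coding, e_ref):
    """Least-squares weight g minimizing sum((e_coding - g * e_ref)^2)."""
    num = sum(c * r for c, r in zip(e_coding, e_ref))
    den = sum(r * r for r in e_ref)
    return num / den if den > 0.0 else 0.0


def subtract_weighted(e_coding, e_ref, weight):
    """Remove the scaled reference residual from the coding-channel residual."""
    return [c - weight * r for c, r in zip(e_coding, e_ref)]


def energy(x):
    return sum(v * v for v in x)


# Toy residuals (illustrative values only): the coding channel is partly
# correlated with the reference channel.
e_ref = [1.0, -2.0, 3.0, -1.0, 0.5]
e_coding = [0.7, -0.9, 1.4, -0.6, 0.1]

g = optimal_weight(e_coding, e_ref)
e_out = subtract_weighted(e_coding, e_ref, g)
print(g, energy(e_coding), energy(e_out))
```

With a least-squares weight, the energy of the output residual cannot exceed that of the input residual, which is why the subtraction can only help when the two channels' residuals are correlated.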
A digital filter interpolation of decoded line spectral frequencies (LSFs) that significantly outperforms linear interpolation for large vocabulary distributed continuous speech recognition systems is presented. Experiments were conducted using linear predictive coding (LPC) and LSF-derived speech recognition features, CDHMM acoustic models, triphone units and trigram language models for Brazilian Portuguese.
Objective: To study the feasibility of using acoustic signatures in snore signals for the diagnosis of obstructive sleep apnea (OSA). Methods: Snoring sounds of 30 apneic snorers (24 males; 6 females; apnea-hypopnea index, AHI = 46.9 ± 25.7 events/h) and 10 benign snorers (6 males; 4 females; AHI = 4.6 ± 3.4 events/h) were captured in a sleep laboratory. The recorded snore signals were preprocessed to remove noise and subsequently modeled using a linear predictive coding (LPC) technique. Formant frequencies (F1, F2, and F3) were extracted from the LPC spectrum for analysis. The accuracy of this approach was assessed using receiver operating characteristic curves and notched box plots. The relationship between AHI and F1 was further explored via regression analysis. Results: Quantitative differences in formant frequencies between apneic and benign snores are found in same- or both-gender snorers. Apneic snores exhibit higher formant frequencies than benign snores, especially F1, which can be related to the pathology of OSA. This study yields a sensitivity of 88%, a specificity of 82%, and a threshold value of F1 = 470 Hz that best differentiates apneic snorers from benign snorers (both genders combined). Conclusion: Acoustic signatures in snore signals carry information for OSA diagnosis, and snore-based analysis might potentially be a non-invasive and inexpensive diagnostic approach for mass screening of OSA. (c) 2007 Elsevier B.V. All rights reserved.
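Extracting formants from an LPC spectrum, as this abstract describes, can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes the autocorrelation method with the Levinson-Durbin recursion, picks formant candidates as local maxima of the LPC envelope, and uses a synthetic single-resonance signal in place of a recorded snore. All function names and the test signal are hypothetical, and the noise removal step mentioned in the abstract is omitted.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate frame (e.g. all zeros): stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lpc_spectrum_peaks(a, fs, n_points=512):
    """Frequencies (Hz) of local maxima of the LPC envelope 1/|A(e^jw)|."""
    freqs = [0.5 * fs * i / n_points for i in range(n_points)]
    mags = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs
        denom = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        mags.append(1.0 / abs(denom))
    return [freqs[i] for i in range(1, n_points - 1)
            if mags[i - 1] < mags[i] >= mags[i + 1]]


def synth_resonator(freq, radius, fs, n):
    """Impulse response of a two-pole resonator (stand-in for a real frame)."""
    theta = 2.0 * math.pi * freq / fs
    c1, c2 = 2.0 * radius * math.cos(theta), -radius * radius
    x = []
    for i in range(n):
        s = 1.0 if i == 0 else 0.0
        if i >= 1:
            s += c1 * x[i - 1]
        if i >= 2:
            s += c2 * x[i - 2]
        x.append(s)
    return x


fs = 8000
signal = synth_resonator(500.0, 0.95, fs, 400)
a = levinson_durbin(autocorrelation(signal, 2), 2)
peaks = lpc_spectrum_peaks(a, fs)
print(peaks)  # a single peak close to the 500 Hz resonance
```

For real recordings, a higher model order and a windowed frame would be needed to resolve F1, F2, and F3; the order of 2 here is chosen only to match the single synthetic resonance.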
ISBN (print): 9781615673780
In CELP coders, the past excitation signal used to build the adaptive codebook is known to be the main source of error propagation when a frame is lost. This paper presents a novel resynchronization technique that uses very low bit rate side information to correct the past excitation signal after a frame erasure. The novelty is that the correction is computed in a closed-loop fashion, based on the actual error introduced by the concealment. Subjective test results show that this approach is a promising direction for future research on frame loss recovery.
ISBN (print): 9781424414833
Modifications to IP-based packet network protocols are examined that would make the network tolerant of bit errors in packet payloads or headers. These modifications are tested with communication-quality MELP voice traffic. As measured by PESQ score, improvements in the perceptual quality of the speech are noted, and these are maximized when error checking is disabled for the entire packet.
ISBN (print): 9780769531199
In this paper, a non-linear spectral estimation method for noise reduction is presented, which is approximated and implemented by double Radial Basis Function (RBF) networks. The simulation results indicate that the method can greatly improve the quality and intelligibility of speech, and that it has other advantages such as a widely applicable signal-to-noise ratio (SNR) range and a lower computational load. In particular, the method preserves the accuracy of the speech waveform, and the quality of the speech signals is clearly improved.
ISBN (print): 9781424414833
A delay-free audio coding scheme based on ADPCM with adaptive pre- and post-filtering is presented. The pre-/post-filters are realized as a cascade of shelving filters, designed to match the characteristics of human perception. The pre- and post-filters are adapted by dynamic compression of the respective sub-bands. The adaptation is backward-adaptive, i.e., it is fed by the reconstructed signal, which eliminates the need to transmit the filter coefficients and allows delay-free operation. This pre- and post-filtering significantly improves the audio quality compared to a plain ADPCM codec, as underlined by objective measurements. Since the base ADPCM used is also delay-free, the resulting coding system works without any algorithmic delay.
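The backward-adaptive principle mentioned in this abstract can be sketched with a toy ADPCM codec. This is an illustration under stated assumptions, not the paper's scheme: the fixed first-order predictor, the 7-level quantizer, and the step-size multipliers are hypothetical choices, and the shelving pre- and post-filters are not modeled. The point is only that the decoder re-derives every adaptive parameter from the transmitted codes, sample by sample, so no side information or lookahead is needed.

```python
import math

MAX_CODE = 3         # codes lie in [-3, 3] (hypothetical 7-level quantizer)
PRED_COEFF = 0.9     # fixed first-order predictor (assumption)
INITIAL_STEP = 0.05  # both sides must start from the same state


def quantize(error, step):
    """Map a prediction error to an integer code, clipped to the code range."""
    return max(-MAX_CODE, min(MAX_CODE, round(error / step)))


def adapt_step(step, code):
    """Backward adaptation: depends only on the code, which the decoder also has."""
    if abs(code) == MAX_CODE:
        step *= 1.5  # quantizer overloaded: grow the step
    elif code == 0:
        step *= 0.8  # error below resolution: shrink the step
    return min(1.0, max(1e-3, step))


def encode(samples):
    codes, recon = [], []
    prev, step = 0.0, INITIAL_STEP
    for s in samples:
        pred = PRED_COEFF * prev
        code = quantize(s - pred, step)
        prev = pred + code * step  # local reconstruction drives the predictor
        step = adapt_step(step, code)
        codes.append(code)
        recon.append(prev)
    return codes, recon


def decode(codes):
    out = []
    prev, step = 0.0, INITIAL_STEP
    for code in codes:
        pred = PRED_COEFF * prev
        prev = pred + code * step  # same update as the encoder's local decoder
        step = adapt_step(step, code)
        out.append(prev)
    return out


x = [0.5 * math.sin(2.0 * math.pi * i / 40.0) for i in range(200)]
codes, recon = encode(x)
decoded = decode(codes)
print(decoded == recon)
```

Because each output sample depends only on the current and past codes, the codec introduces no algorithmic delay, and the decoder output matches the encoder's local reconstruction.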
ISBN (print): 9781424421787
Voiced/Unvoiced (V/U) classification is an important parameter in low bit-rate speech coding algorithms. An algorithm that recovers the V/U classification from the linear prediction coding (LPC) coefficients and the gain in the speech decoder is proposed. Two Gaussian mixture models (GMMs) are employed to model the joint probability of these parameters and to perform the V/U estimation. Experiments show the performance improvements of the proposed algorithm over the V/U classifier used in the mixed excitation LPC vocoder (MELP). The proposed algorithm operates only at the receiving end and saves all the bits originally used for V/U quantization.
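A decision between two Gaussian mixture models, as this abstract describes, can be sketched as a log-likelihood comparison. This is a minimal illustration only: it assumes diagonal covariances and equal class priors, the two-dimensional feature vector and all mixture parameters below are made-up toy values rather than trained models, and the function names are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(x, gmm):
    """Log-likelihood under a mixture given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify_voicing(x, gmm_voiced, gmm_unvoiced):
    """Pick the class whose mixture explains the feature vector better."""
    voiced = gmm_log_likelihood(x, gmm_voiced)
    unvoiced = gmm_log_likelihood(x, gmm_unvoiced)
    return "voiced" if voiced > unvoiced else "unvoiced"


# Toy mixtures over a hypothetical 2-D feature (one spectral parameter, log gain).
# These numbers are placeholders for illustration, not values from the paper.
GMM_VOICED = [
    (0.6, [-0.8, 2.0], [0.05, 0.5]),
    (0.4, [-0.5, 1.5], [0.10, 0.5]),
]
GMM_UNVOICED = [
    (0.5, [0.3, -1.0], [0.10, 0.5]),
    (0.5, [0.6, -0.5], [0.10, 0.5]),
]

print(classify_voicing([-0.7, 1.8], GMM_VOICED, GMM_UNVOICED))
print(classify_voicing([0.4, -0.8], GMM_VOICED, GMM_UNVOICED))
```

In a decoder-side setting like the one described, the feature vector would be built from the received LPC coefficients and gain, so the decision needs no transmitted V/U bits.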
ISBN (print): 9781424421787
This paper examines the efficient quantization of LSP parameters for very low bit rate vocoders operating below 300 bps. A new quantization scheme called variable dimension matrix quantization (VDMQ) is presented. In the VDMQ scheme, the extracted LSP parameter matrix, which has a variable dimension, is quantized directly without dimension conversion. Based on a distance measure defined between low LSP matrices of different dimensions, the optimal codeword is deduced. Theoretical analysis and experimental results show that the VDMQ scheme performs better than the segment quantization and matrix quantization schemes at very low bit rates. In addition, the codebook storage is reduced by almost 90%. The VDMQ scheme provides a new and effective approach to efficient LSP parameter quantization at very low bit rates.
ISBN (print): 9781424424085
In this paper, new feature extraction methods that utilize reduced-order linear predictive coding (LPC) coefficients for speech recognition are proposed. The coefficients are derived from speech frames decomposed using the Discrete Wavelet Transform (DWT). In the literature, it is assumed that a speech frame of 10 ms to 30 ms is stationary; in practice, however, different parts of the speech signal may convey different amounts of information (and hence may not be perfectly stationary). LPC coefficients derived from the subband decomposition of a speech frame provide a better representation than modeling the frame directly. It has been shown experimentally that the proposed approaches provide effective (better recognition rate) and efficient (reduced feature vector dimension) features. A speech recognition system using the continuous Hidden Markov Model (HMM) has been implemented. The proposed algorithms are evaluated using the NIST TI-46 isolated-word database.