Previous studies of nonlinear prediction of speech have mostly focused on short-term prediction. This paper presents long-term nonlinear prediction based on second-order Volterra filters. It is shown that the presented predictor can outperform conventional linear prediction techniques in terms of prediction gain and yields "whiter" residuals.
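As a rough illustration of what a second-order Volterra long-term predictor computes, the sketch below evaluates a prediction from samples one pitch lag in the past and measures prediction gain in dB. The names `volterra_predict`, `prediction_gain_db`, `lag`, `h1` and `h2` are assumptions made for this example, as is the choice to store only the upper triangle of the quadratic kernel. How the paper estimates its kernels is not stated in the abstract and is not shown here.

```python
import math

def volterra_predict(x, n, lag, h1, h2):
    """Predict x[n] from the samples x[n-lag], x[n-lag-1], ... (hypothetical layout).

    h1 is the linear kernel, h2 the quadratic kernel; only h2[i][j] with j >= i is used.
    """
    taps = len(h1)
    if n - lag - (taps - 1) < 0:
        raise ValueError("not enough past samples for this lag and filter length")
    past = [x[n - lag - i] for i in range(taps)]
    linear = sum(h1[i] * past[i] for i in range(taps))
    quadratic = sum(h2[i][j] * past[i] * past[j]
                    for i in range(taps) for j in range(i, taps))
    return linear + quadratic

def prediction_gain_db(signal, residual):
    """Prediction gain: signal energy over residual energy, in dB."""
    sig_energy = sum(v * v for v in signal)
    err_energy = sum(v * v for v in residual)
    return 10.0 * math.log10(sig_energy / err_energy)
```

A higher prediction gain means the residual carries less energy than the input, which is the figure of merit the abstract refers to.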
To reduce the false alarm rate and missed detection rate of a Loose Parts Monitoring System (LPMS) for nuclear power plants, a new hybrid method that combines linear predictive coding (LPC) and a Support Vector Machine (SVM) to discriminate the loose part signal is proposed. The alarm process is divided into two stages. The first stage detects the weak burst signal to reduce the missed detection rate: the signal is whitened to improve the SNR, and the weak burst signal is then detected by checking the short-term root mean square (RMS) of the whitened signal. The second stage identifies the detected burst signal to reduce the false alarm rate: taking the signal's LPC coefficients as its features, an SVM determines whether the signal was generated by the impact of a loose part. The experiment shows that whitening the signal in the first stage makes it possible to detect a loose part burst signal even at very low SNR and thus significantly reduces the missed detection rate. In the second stage, the loose parts' burst signal can be distinguished from pulse disturbance by the SVM. Even when the SNR is -15 dB, the system still achieves a 100% recognition rate.
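The first alarm stage can be sketched as follows: fit an LPC model to the signal, use its prediction residual as the whitened signal, and flag frames whose short-term RMS stands out. This is a minimal sketch under assumptions that are not from the paper: the autocorrelation method with the Levinson-Durbin recursion, non-overlapping frames, and a hypothetical threshold of `factor` times the median frame RMS. All function names are illustrative, and the input is assumed to be non-zero.

```python
import math

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of the sequence x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def lpc_coefficients(x, order):
    """Levinson-Durbin recursion; returns a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]

def whiten(x, coeffs):
    """Prediction residual e[n] = x[n] - sum_k a[k] * x[n-k], taking x[n] = 0 for n < 0."""
    out = []
    for n in range(len(x)):
        pred = sum(c * x[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(x[n] - pred)
    return out

def frame_rms(x, frame_len):
    """RMS of consecutive non-overlapping frames (a trailing partial frame is dropped)."""
    return [math.sqrt(sum(v * v for v in x[s:s + frame_len]) / frame_len)
            for s in range(0, len(x) - frame_len + 1, frame_len)]

def detect_bursts(rms, factor=3.0):
    """Indices of frames whose RMS exceeds `factor` times the median frame RMS."""
    median = sorted(rms)[len(rms) // 2]
    return [i for i, v in enumerate(rms) if v > factor * median]
```

The idea is that a strongly correlated background is largely predictable and so mostly vanishes from the residual, while a short burst is not, so the burst can dominate the frame RMS of the whitened signal even when it is buried in the raw signal.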
In this article, new feature extraction methods that use wavelet decomposition and reduced-order linear predictive coding (LPC) coefficients are proposed for speech recognition. The coefficients are derived from speech frames decomposed using the discrete wavelet transform. LPC coefficients derived from the subband decomposition of a speech frame (abbreviated as WLPC) provide a better representation than modeling the frame directly. The WLPC coefficients are further normalized in the cepstrum domain to obtain a new set of features, denoted wavelet subband cepstral mean normalized features. The proposed approaches provide effective (better recognition rate), efficient (reduced feature vector dimension), and noise-robust features. The performance of these techniques has been evaluated on the TI-46 isolated word database and a self-created Marathi digits database in a white noise environment using the continuous density hidden Markov model. The experimental results also show the superiority of the proposed techniques over conventional methods such as linear predictive cepstral coefficients, Mel-frequency cepstral coefficients, spectral subtraction, and cepstral mean normalization in the presence of additive white Gaussian noise.
We described the low-complexity PARCOR coder designed for entropy coding of prediction residual signals. The quantization of PARCOR coefficients is based on a criterion that minimizes the entropy of the prediction residual signals. The devised NLBS algorithm is included in the G.711.0 standard because this tool shows efficient performance in terms of bit reduction and complexity. It will be widely used in the near future because G.711.0 with the described low-complexity tool can compress the data rate of G.711, the prevailing speech coding technology.
An electrocardiogram (ECG) reconstruction method based on a linear prediction technique is proposed in this paper. The method can reconstruct rather long missing parts of ECG signals; each missing data segment may cover 1 to 8 beats. The data used in the experiments are from the MIT-BIH normal sinus rhythm database. The experimental results show that the method performs very well: the reconstructed signals are visually very close to the ground truths. The numerical evaluation also shows that the proposed method yields good results when deriving heart rate variability (HRV) measures, giving time-domain HRV measures that are very close to the ground truths. Its performance is also better than that of the method commonly used by experts, in which the abnormal beats are removed before the HRV measures are calculated.
ISBN: 9783642274428 (print)
Speech compression, enhancement and recognition in noisy, reverberant conditions is a challenging task. This paper presents a new approach to this problem, developed in the framework of probabilistic random modeling. Speech coding techniques are commonly used in low-bit-rate analysis and synthesis. Coding algorithms seek to minimize the bit rate in the digital representation of a signal without an objectionable loss of signal quality in the process. Speech enhancement aims to improve speech quality by using various algorithms. This paper deals with a multistage vector quantization technique used for coding narrowband speech signals. The parameters used for coding the speech signals are the line spectral frequencies, so as to ensure filter stability after quantization. The new approach incorporates information about the statistical random nature of the uncompressed speech signal using the LBG algorithm. The codebooks used for quantization are generated with the Linde, Buzo and Gray (LBG) algorithm. The speech model is characterized by LPC coefficients and parameterized by the coefficients of the reverberation filters. The results of the multistage vector quantizer are compared with the unconstrained vector quantization technique. Quantization performance is measured in terms of spectral distortion in dB, computational complexity in KFlops, and memory requirements in floats. The results show that multistage vector quantization has better spectral distortion performance and lower computational complexity and memory requirements than unconstrained vector quantization. With its parameters estimated from the data, the proposed approach yields significantly better performance in both signal-to-noise ratio and subjective filter methods.
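The multistage structure described above can be sketched as follows: each stage quantizes the residual left by the previous stage, and the decoder sums the selected codevectors. This is a minimal sketch assuming squared Euclidean distance and codebooks that are already trained; the LBG training step is not shown, and the names `nearest`, `msvq_encode` and `msvq_decode` are illustrative rather than taken from the paper.

```python
def nearest(codebook, vector):
    """Index of the codevector closest to `vector` in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vector)))

def msvq_encode(stages, vector):
    """Quantize `vector` stage by stage; each stage codes the previous stage's residual."""
    indices = []
    residual = list(vector)
    for codebook in stages:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices

def msvq_decode(stages, indices):
    """Reconstruct the vector as the sum of the selected codevectors."""
    out = [0.0] * len(stages[0][0])
    for codebook, i in zip(stages, indices):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out
```

With two stages of N1 and N2 codevectors, the encoder stores and searches N1 + N2 vectors while still reaching N1 * N2 distinct reconstructions, which is where the savings in memory and computation relative to a single unconstrained codebook of the same bit rate come from.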
ISBN: 9781467350518 (print)
Vector precoding enables non-cooperative signal acquisition in the multi-user broadcast channel. Its performance advantage over the more straightforward linear precoding algorithms is a consequence of an added perturbation vector, which enhances the properties of the precoded signal. Nevertheless, computing the perturbation signal entails a search for the closest point in an infinite lattice, a problem known to be non-deterministic polynomial-time hard (NP-hard). This contribution presents a novel tree-search scheme that achieves an error-rate performance close to the optimum given by the sphere encoder, but with a significantly simpler tree-search structure that only considers the most promising nodes for expansion. To better showcase the low complexity and simple datapath of the proposed tree-search technique, it has been implemented in hardware on a 65 nm ASIC target device.
ISBN: 9781467330336; 9781467323024 (print)
Accent is a major cause of variability in automatic speaker-independent speech recognition systems. Under certain circumstances, it leads to unsatisfactory system performance. To circumvent this deficiency, an accent analyzer in a preceding stage could be a smart solution. This paper proposes a relatively new hybrid approach that optimizes the extraction of accent from speech utterances over other facets, using linear predictive coefficients (LPC) derived from the discrete wavelet transform (DWT). The constructed features were used to model an accent recognizer based on K-nearest neighbors. Experimental results showed that the hybrid dyadic-X DWT-LPC features were highly correlated with the Malay, Chinese and Indian accents of Malaysian English speakers, with an increase in classification rate of 9.28% over the conventional LPC method.
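The classification step can be illustrated with a plain K-nearest-neighbors vote. This is a minimal sketch assuming squared Euclidean distance and majority voting; the function name `knn_classify` and its parameters are hypothetical, and the feature vectors would in practice be the DWT-LPC features, whose extraction is not shown.

```python
from collections import Counter

def knn_classify(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(features, query)), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

An odd `k` reduces the chance of tied votes when there are two classes. With three accent classes ties remain possible, and this sketch simply returns the tied label that `Counter` lists first.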
The linear predictive coding (LPC) residual is an important component of analysis-by-synthesis speech coders. Using spectral envelope vector quantisation (SEVQ) for LPC residual signals, the MPEG-4 harmonic vector excitation coding (HVXC) speech coder is able to change the tone and speed when decoding bitstreams. A modified version of the simplified hearing-based SEVQ (SSEVQ) approach proposed by Wang and Yang is presented. In addition, an energy-first search scheme for SEVQ (EFSEVQ) is proposed to reduce the computational complexity of SEVQ for the MPEG-4 HVXC speech coder. The proposed EFSEVQ scheme also agrees with human hearing properties. Simulation results reveal that the proposed EFSEVQ search scheme not only reduces the computational complexity but also preserves the quality of the encoded speech.