The authors propose an adaptively weighted Itakura distortion measure and study its effect on the performance of a conventional dynamic time-warping (DTW)-based speech recognizer in a series of speaker-independent, isolated-digit recognition experiments. At low SNRs, the proposed weighted Itakura distortion achieves an equivalent SNR improvement of about 5-7 dB.
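As a rough illustration of where a frame-level distortion measure enters a conventional DTW recognizer, the sketch below implements textbook DTW with a pluggable local distance. The function name `dtw_distance`, the unconstrained step pattern, and the absolute-difference distance in the usage line are assumptions for illustration; the abstract does not specify the recognizer's path constraints or the form of the weighted Itakura measure.

```python
def dtw_distance(ref, test, local_dist):
    """Accumulated cost of the best warping path between two frame sequences.

    local_dist(a, b) is the frame-level distortion. A recognizer like the one
    in the abstract would plug its (weighted) Itakura distortion in here.
    """
    inf = float("inf")
    n, m = len(ref), len(test)
    # cost[i][j] = best accumulated cost aligning ref[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_dist(ref[i - 1], test[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # ref frame advances alone
                                 cost[i][j - 1],      # test frame advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def abs_diff(a, b):
    """Placeholder local distance on scalar frames (illustration only)."""
    return abs(a - b)
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3], abs_diff)` returns `0.0`, because the repeated frame is absorbed by the warping path.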
In this correspondence, we present two findings regarding an algorithm proposed by Milenkovic for determining the source and transfer function of the human vocal system. First, we show that more reliable estimates of the glottal endpoints can be obtained by modifying the procedure used to update the initial endpoint estimates. Second, we show that the algorithm is useful only for analyzing sounds of low to moderate pitch, because of its dependence on an initial transfer function estimate obtained by linear prediction.
Genomic signal processing is a new area of research that applies advanced digital signal processing methodologies to genetic data analysis. It has many promising applications in bioinformatics and next-generation healthcare systems, in particular in microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods (linear predictive coding, wavelet decomposition, and fractal dimension) are evaluated for clustering performance on several microarray datasets. The results show that the fractal approach provides the best clustering accuracy among the digital signal processing methods and well-known statistical methods compared.
Authors:
Z. He, H. Li
DSP Division, Radio Engineering Department, South-East University, Nanjing, China
A multilayer neural network used for nonlinear predictive image coding is described. Two coding schemes, nonadaptive and adaptive, are shown. Because it matches the local properties of the image, nonlinear predictive coding performs better than linear predictive coding. A series of computer experiments shows that the method not only generalizes but also reduces noise. Compared with differential pulse code modulation (DPCM), it greatly reduces the number of bits to be transmitted.
ISBN: (print) 9810475241
We present a new architecture called the Modular Neural Predictive Coding architecture (Modular NPC). This architecture is used for speech discriminant feature extraction (DFE). We present an application of the Modular NPC architecture to a phoneme recognition task. The phonemes, extracted from the DARPA-TIMIT speech database, are vowels, /b/-/d/-/g/, and /p/-/t/-/k/. Comparisons with other coding methods (LPC, MFCC, PLP) are presented.
We consider an algorithm for reduction of broadband noise in speech based on signal subspaces. The algorithm is formulated by means of the quotient singular value decomposition (QSVD). With this formulation, a prewhitening operation becomes an integral part of the algorithm. We demonstrate that this is essential in connection with updating issues in real-time recursive applications. We also illustrate by examples that we are able to achieve a satisfactory quality of the reconstructed signal.
It has been recently demonstrated that the principles of vector quantization for LPC speech can be simply extended to encompass matrices of LPC vectors with significant savings in bit rate. Unfortunately, however, such locally optimal matrix quantizers have prohibitively high complexity and memory requirements when implemented in a speech vocoder at bit rates giving acceptable quality speech. One approach to solving the problem is to separately code gain and shape in the matrix quantizer. This paper generalizes the principles of shape-gain vector quantizer design for LPC speech to matrix quantization and investigates the properties of the resulting quantizers. In particular, we present a design which combines shape matrices consisting of N shape vectors with K-dimensional gain vectors, where N and K are small integers and, in practice, K ≥ N. Experimental results show that with K, N ≥ 3, significant reductions in bit rate over locally optimal vector quantizers are obtained for comparable performance. Simulations indicate that a shape-gain matrix quantizer, using a 10 bit shape codebook and an 8 bit gain codebook with K = N = 3 operating at 6 bits/frame for the LPC model, gives speech quality comparable to a locally optimal vector quantizer at 9 bits/frame. The matrix quantizer has somewhat greater than 5.7 times the memory requirement of the above vector quantizer, but less than 2.1 times the complexity. Subjective tests show that the speech from this matrix quantizer is intelligible to native speakers of English.
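The 6 bits/frame figure can be checked with a short calculation. The sketch below assumes that the 8 bit codebook is the gain codebook and that one shape index and one gain index are transmitted per block of N = 3 frames; the function names are hypothetical.

```python
def bits_per_frame(shape_bits, gain_bits, frames_per_block):
    """Average LPC-model bit rate when one shape index and one gain index
    are sent per block of frames (an assumed transmission format)."""
    return (shape_bits + gain_bits) / frames_per_block


def codebook_entries(index_bits):
    """Number of codewords addressable by an index of the given width."""
    return 2 ** index_bits


# Figures quoted in the abstract: 10 bit shape codebook, 8 bit codebook, N = 3.
rate = bits_per_frame(shape_bits=10, gain_bits=8, frames_per_block=3)
```

Under these assumptions `rate` is `6.0` bits/frame, with `codebook_entries(10)` = 1024 shape matrices and `codebook_entries(8)` = 256 gain vectors, against the 9 bits/frame of the vector quantizer used for comparison.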
To reduce the false alarm rate and missed detection rate of a Loose Parts Monitoring System (LPMS) for nuclear power plants, a new hybrid method is proposed that combines linear predictive coding (LPC) and a Support Vector Machine (SVM) to discriminate the loose part signal. The alarm process is divided into two stages. The first stage detects the weak burst signal in order to reduce the missed detection rate: the signal is whitened to improve the SNR, and the weak burst is then detected by checking the short-term root mean square (RMS) of the whitened signal. The second stage identifies the detected burst signal in order to reduce the false alarm rate: taking the signal's LPC coefficients as its features, the SVM determines whether the signal was generated by the impact of a loose part. The experiment shows that whitening the signal in the first stage makes it possible to detect a loose part burst signal even at very low SNR and thus significantly reduces the rate of missed detection. In the second alarm stage, the loose part burst signal can be distinguished from pulse disturbance by using the SVM. Even when the SNR is -15 dB, the system still achieves a 100% recognition rate.
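The first alarm stage (whitening followed by a short-term RMS check) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the whitening filter is taken to be an LPC prediction-error filter estimated from a burst-free reference segment with the autocorrelation method, and the synthetic AR(1) background, burst shape, model order, and window length are all invented for the demo.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Biased-free-of-normalization autocorrelation r[0..max_lag] of x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Prediction-error filter a[0..order] (a[0] = 1) from autocorrelation r."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        refl = -acc / err                      # reflection coefficient
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + refl * prev[m - k]
        a[m] = refl
        err *= 1.0 - refl * refl
    return a


def whiten(x, a):
    """Apply the FIR prediction-error filter: e[n] = sum_k a[k] * x[n-k]."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


def short_term_rms(x, win):
    """RMS over consecutive non-overlapping windows of length win."""
    return [math.sqrt(sum(v * v for v in x[s:s + win]) / win)
            for s in range(0, len(x) - win + 1, win)]


def demo(seed=0):
    """Synthetic test: weak alternating burst buried in strongly colored noise."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(1000):
        prev = 0.95 * prev + rng.gauss(0.0, 1.0)   # AR(1) background (assumed)
        x.append(prev)
    for n in range(600, 640):                      # burst occupies window 15
        x[n] += 2.0 if n % 2 == 0 else -2.0
    a = levinson_durbin(autocorrelation(x[:500], 2), 2)  # burst-free reference
    return a, short_term_rms(whiten(x, a), 40)
```

In the demo the burst amplitude is below the standard deviation of the background, yet after whitening the window containing it should stand out in the short-term RMS. A practical detector would compare each window against a threshold rather than take the maximum.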
Low implementation complexity, low delay and close-to-optimal performance over a wide variety of channels are some of the advantages of spatially-coupled low-density parity-check (LDPC) codes. However, the error performance of the sliding window decoding scheme that is used to decode these codes is considerably degraded over channels with memory, such as the correlated erasure channel. Employing a block interleaver to counter this situation is not always a viable option, since it introduces a large amount of delay and cancels out the low-delay property of the sliding window decoder. Another way to reduce the effects of erasure bursts is to construct a more robust code ensemble by introducing additional code design rules. However, this approach results in additional constraints on the already complicated code construction process. The authors propose a novel communication system that combats the effects of the erasure bursts through the use of a convolutional interleaver. The proposed system combines the inherent convolutional nature of the spatially-coupled LDPC codes with that of a convolutional interleaver to achieve very low overall delay. The performance of the proposed approach is analysed using the density evolution technique and the performance improvement is demonstrated as a function of the interleaving delay via computer simulations.
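For readers unfamiliar with convolutional interleaving, the following is a minimal sketch of the classical multi-branch delay-line structure, not the specific system proposed in the paper. The class name, the parameters `branches` and `step`, and the `None` fill symbol are assumptions for illustration.

```python
from collections import deque


class ConvolutionalInterleaver:
    """Commutated bank of delay lines; branch i delays by i * step visits.

    With reverse=True the delays are reversed, which gives the matching
    deinterleaver.
    """

    def __init__(self, branches, step, reverse=False, fill=None):
        delays = [i * step for i in range(branches)]
        if reverse:
            delays.reverse()
        # Each delay line is pre-loaded with `fill` symbols.
        self.lines = [deque([fill] * d) for d in delays]
        self.branch = 0

    def push(self, symbol):
        """Feed one symbol, get one symbol out, advance the commutator."""
        line = self.lines[self.branch]
        self.branch = (self.branch + 1) % len(self.lines)
        line.append(symbol)
        return line.popleft()


def round_trip(data, branches, step):
    """Pass data through an interleaver and its deinterleaver back to back."""
    inter = ConvolutionalInterleaver(branches, step)
    deinter = ConvolutionalInterleaver(branches, step, reverse=True)
    return [deinter.push(inter.push(s)) for s in data]
```

With `branches = 3` and `step = 2`, `round_trip(list(range(20)), 3, 2)` returns twelve `None` fill symbols followed by `0` through `7`: the end-to-end delay is `branches * (branches - 1) * step` symbols. A block interleaver with comparable burst-spreading ability typically incurs a larger delay, which is the trade-off the abstract refers to.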
In this paper, we propose a Bayesian minimum mean squared error approach for the joint estimation of the short-term predictor parameters of speech and noise from the noisy observation. We use trained codebooks of speech and noise linear predictive coefficients to model the a priori information required by the Bayesian scheme. In contrast to current Bayesian estimation approaches that consider the excitation variances as part of the a priori information, in the proposed method they are computed online for each short-time segment, based on the observation at hand. Consequently, the method performs well in nonstationary noise conditions. The resulting estimates of the speech and noise spectra can be used in a Wiener filter or any state-of-the-art speech enhancement system. We develop both memoryless (using information from the current frame alone) and memory-based (using information from the current and previous frames) estimators. Estimation of functions of the short-term predictor parameters is also addressed, in particular one that leads to the minimum mean squared error estimate of the clean speech signal. Experiments indicate that the scheme proposed in this paper performs significantly better than competing methods.
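To make the last step concrete, the sketch below shows how estimated short-term predictor parameters and excitation variances could be turned into a Wiener gain. It covers only the standard AR-spectrum and Wiener-gain formulas, not the Bayesian codebook estimator itself; the function names and the coefficient convention A(z) = 1 + a1 z^-1 + ... are assumptions.

```python
import cmath
import math


def ar_power_spectrum(a, excitation_var, n_bins):
    """AR model spectrum P(w) = var / |A(e^{jw})|^2 on n_bins points in [0, pi].

    `a` holds the prediction-error filter coefficients with a[0] = 1
    (assumed convention). Requires n_bins >= 2.
    """
    spectrum = []
    for b in range(n_bins):
        omega = math.pi * b / (n_bins - 1)
        resp = sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(a))
        spectrum.append(excitation_var / abs(resp) ** 2)
    return spectrum


def wiener_gain(speech_psd, noise_psd):
    """Per-bin Wiener gain H = Ps / (Ps + Pn)."""
    return [s / (s + n) for s, n in zip(speech_psd, noise_psd)]


# Illustrative values only: a low-pass "speech" model against white noise.
speech = ar_power_spectrum([1.0, -0.9], excitation_var=1.0, n_bins=5)
noise = ar_power_spectrum([1.0], excitation_var=1.0, n_bins=5)
gain = wiener_gain(speech, noise)
```

In this toy case the gain is close to 1 at low frequencies, where the speech model dominates, and drops toward the high end of the band, where the white noise is comparable to or stronger than the speech model.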