We have investigated the QRS complex, extracted from electrocardiogram (ECG) data, using fuzzy adaptive resonance theory mapping (ARTMAP) to classify cardiac arrhythmias. Two different conditions have been analyzed: normal and abnormal premature ventricular contraction (PVC). Based on MIT/BIH database annotations, cardiac beats for normal and abnormal QRS complexes were extracted from this database, scaled, and Hamming windowed, after bandpass filtering, to yield a sequence of 100 samples for each QRS segment. From each of these sequences, two linear predictive coding (LPC) coefficients were generated using Burg's maximum entropy method. The two LPC coefficients, along with the mean-square value of the QRS complex segment, were utilized as features for each condition to train and test a fuzzy ARTMAP neural network for classification of normal and abnormal PVC conditions. The test results show that the fuzzy ARTMAP neural network can classify cardiac arrhythmias with greater than 99% specificity and 97% sensitivity.
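The three-feature extraction described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`hamming`, `burg`, `qrs_features`) are hypothetical, bandpass filtering and scaling are omitted, and computing the mean-square value on the windowed segment is an assumption, since the abstract does not say which version of the segment is used.

```python
import math

def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def burg(x, order):
    """Burg's method: return ([1, a1, ..., ap], reflection coefficients)."""
    f = list(x)  # forward prediction errors
    b = list(x)  # backward prediction errors
    a = [1.0]
    ks = []
    for _ in range(order):
        ff, bb = f[1:], b[:-1]  # align f(n) with b(n-1)
        den = sum(v * v for v in ff) + sum(v * v for v in bb)
        k = 0.0 if den == 0 else -2.0 * sum(p * q for p, q in zip(ff, bb)) / den
        ext = a + [0.0]
        # Levinson-style update of the prediction-error polynomial
        a = [ext[i] + k * ext[len(ext) - 1 - i] for i in range(len(ext))]
        f = [p + k * q for p, q in zip(ff, bb)]
        b = [q + k * p for p, q in zip(ff, bb)]
        ks.append(k)
    return a, ks

def qrs_features(segment):
    """Two Burg LPC coefficients plus the mean-square value (assumed layout)."""
    w = hamming(len(segment))
    xw = [s * wi for s, wi in zip(segment, w)]
    a, _ = burg(xw, 2)
    mean_square = sum(v * v for v in xw) / len(xw)
    return a[1], a[2], mean_square
```

Calling `qrs_features` on a 100-sample segment returns a three-element tuple that could serve as the input vector to a classifier. Because Burg's method keeps every reflection coefficient at or below one in magnitude, the resulting all-pole model does not become unstable.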
Recently, it has been shown that good quality speech at rates as low as 6 kbit/s can be achieved with CELP and its derivatives. However, in bringing down the bit rate even further, these coding schemes have resorted to allocating fewer bits to the quantisation of the LPC parameters. It is known that CELP and its derivatives are sensitive to LPC parameter quantisation errors, and recently various schemes have been proposed to overcome this degradation. In this paper we describe how speaker adaptive vector quantisation (SAVQ) can be applied to the quantisation of LPC parameters and assess its performance when incorporated into a low bit rate coding scheme.
A method is presented to determine an appropriate portion that could be subtracted from the noise-contaminated autocorrelation functions before carrying out linear prediction analysis in coloured noise. This method guarantees the stability of the resulting linear prediction filter. The authors demonstrate the improvement using an objective measure based on a synthetic vowel.
Speech coders employing forward adaptive predictive coding (APC) and operating at medium-to-low bit rates necessitate efficient encoding of the linear predictive coding (LPC) coefficients. Line spectrum pair (LSP) parameters are currently one of the most efficient choices of transmission parameters for the LPC coefficients. This paper briefly reviews LSP parameters and presents several low delay coding schemes for the parameters. The coders are simulated using data generated from both the autocorrelation and covariance LPC analysis methods. The performances of the coders are given for a variety of rates and LPC analysis conditions. The most efficient scheme developed herein uses a predictive form of trellis coded quantization (TCQ). Its performance is comparable or superior to that of other low delay LSP coding schemes in the literature. An enumeration scheme that reduces the rate of a given scalar quantization structure without decreasing coder performance is also presented.
Spectral moments (mean and coefficients of variation, skewness, and kurtosis) are assessed for 40 samples from 10 groups of acoustic transient signals differing in harmonic structure, duration, and degree of spectral overlap. Discriminant analysis involving moments based on linear predictive coding (LPC) resulted in a higher recognition rate for pulsed-tone sounds (87%) that were more like human speech than for pure-tone sounds (70%). By contrast, classification based on moments calculated from the discrete Fourier transform (DFT) yielded 85% recognition for both groups. Cluster analyses indicated that LPC-based moments were more characteristic of relationships among the 10 sound groups and especially the 2 tonal groups, though results were somewhat dependent on LPC model order.
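As a rough illustration of the four spectral moments named in this abstract, the sketch below treats a normalized power spectrum as a probability distribution over frequency. The function name `spectral_moments` is hypothetical, and the exact definitions are assumptions (in particular, kurtosis is reported in its non-excess form); the spectrum itself could come from either an LPC model or a DFT.

```python
import math

def spectral_moments(freqs, power):
    """Moments of a power spectrum treated as a distribution over frequency."""
    total = sum(power)
    p = [v / total for v in power]  # normalize so the weights sum to 1
    mean = sum(f * w for f, w in zip(freqs, p))
    var = sum(w * (f - mean) ** 2 for f, w in zip(freqs, p))
    std = math.sqrt(var)
    skew = sum(w * (f - mean) ** 3 for f, w in zip(freqs, p)) / std ** 3
    kurt = sum(w * (f - mean) ** 4 for f, w in zip(freqs, p)) / var ** 2
    return {
        "mean": mean,
        "cv": std / mean,  # coefficient of variation
        "skewness": skew,
        "kurtosis": kurt,  # non-excess form (a Gaussian would give 3)
    }
```

For a spectrum that is symmetric about its mean frequency the skewness is zero, which gives a quick sanity check on any implementation.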
In this paper, we propose a Bayesian minimum mean squared error approach for the joint estimation of the short-term predictor parameters of speech and noise, from the noisy observation. We use trained codebooks of speech and noise linear predictive coefficients to model the a priori information required by the Bayesian scheme. In contrast to current Bayesian estimation approaches that consider the excitation variances as part of the a priori information, in the proposed method they are computed online for each short-time segment, based on the observation at hand. Consequently, the method performs well in nonstationary noise conditions. The resulting estimates of the speech and noise spectra can be used in a Wiener filter or any state-of-the-art speech enhancement system. We develop both memoryless (using information from the current frame alone) and memory-based (using information from the current and previous frames) estimators. Estimation of functions of the short-term predictor parameters is also addressed, in particular one that leads to the minimum mean squared error estimate of the clean speech signal. Experiments indicate that the scheme proposed in this paper performs significantly better than competing methods.
It is a classical result of linear prediction theory that as long as the minimum prediction error variance is nonzero, the transfer function of the optimum linear prediction error filter for a stationary process is minimum phase, and therefore, its inverse is exponentially stable. Here, extensions of this result to the case of nonstationary processes are investigated. In that context, the filter becomes time varying, and the concept of "transfer function" ceases to make sense. Nevertheless, we prove that under mild conditions on the input process, the inverse system remains exponentially stable. We also consider filters obtained in a deterministic framework and show that if the time-varying coefficients of the predictor are computed by means of the recursive weighted least squares algorithm, then its inverse remains exponentially stable under a similar set of conditions.
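The classical stationary result mentioned at the start of this abstract can be illustrated with the Levinson-Durbin recursion. This is a minimal sketch of the textbook stationary case only, not of the time-varying extension the paper studies, and the function name `levinson_durbin` is hypothetical. When the autocorrelation sequence is positive definite, every reflection coefficient has magnitude below one, which is equivalent to the prediction error filter being minimum phase.

```python
def levinson_durbin(r, order):
    """Solve the normal equations for autocorrelation r[0..order].

    Returns (a, ks, err): the prediction-error polynomial [1, a1, ..., ap],
    the reflection coefficients, and the final prediction error variance.
    """
    a = [1.0]
    err = r[0]
    ks = []
    for m in range(1, order + 1):
        acc = sum(a[i] * r[m - i] for i in range(m))
        k = -acc / err
        ext = a + [0.0]
        a = [ext[i] + k * ext[m - i] for i in range(m + 1)]
        err *= 1.0 - k * k  # stays positive while |k| < 1
        ks.append(k)
    return a, ks, err

def is_minimum_phase(ks):
    """All reflection coefficients strictly inside (-1, 1)."""
    return all(abs(k) < 1.0 for k in ks)
```

For example, the autocorrelation `[1.0, 0.5, 0.25]` of a first-order autoregressive process yields the predictor polynomial `[1, -0.5, 0]` with a nonzero error variance, and `is_minimum_phase` returns `True`.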
In this article, new feature extraction methods, which utilize wavelet decomposition and reduced order linear predictive coding (LPC) coefficients, have been proposed for speech recognition. The coefficients have been derived from the speech frames decomposed using the discrete wavelet transform. LPC coefficients derived from subband decomposition (abbreviated as WLPC) of the speech frame provide a better representation than modeling the frame directly. The WLPC coefficients have been further normalized in the cepstral domain to obtain a new set of features denoted as wavelet subband cepstral mean normalized features. The proposed approaches provide effective (better recognition rate), efficient (reduced feature vector dimension), and noise robust features. The performance of these techniques has been evaluated on the TI-46 isolated word database and a self-created Marathi digits database in a white noise environment using the continuous density hidden Markov model. The experimental results also show the superiority of the proposed techniques over conventional methods such as linear predictive cepstral coefficients, Mel-frequency cepstral coefficients, spectral subtraction, and cepstral mean normalization in the presence of additive white Gaussian noise.
This paper introduces noncausal all-pole models that are capable of efficiently capturing both the magnitude and phase information of voiced speech. It is shown that noncausal all-pole filter models are better able to match both magnitude and phase information and are particularly appropriate for voiced speech due to the nature of the glottal excitation. By modeling speech in the frequency domain, the standard difficulties that occur when using noncausal all-pole filters are avoided. Several algorithms for determining the model parameters based on frequency-domain information and the masking effects of the ear are described. Our work suggests that high-quality voiced speech can be produced using a 14th-order noncausal all-pole model.
Staggered synthetic aperture radar (SAR) is an innovative SAR acquisition concept which exploits digital beam-forming (DBF) in elevation to form multiple receive beams and continuous variation of the pulse repetition interval (PRI) to achieve high-resolution imaging of a wide continuous swath. Staggered SAR requires an azimuth oversampling higher than that of a SAR with constant PRI, which results in an increased volume of data. In this article, we investigate the use of linear predictive coding, which exploits the correlation properties exhibited by the nonuniform azimuth raw data stream. In this scheme, the prediction of each sample is calculated onboard as a linear combination of a set of previous samples. The resulting prediction error is then quantized and downlinked (instead of the original value), which allows for a reduction of the signal entropy and, in turn, of the onboard data rate achievable for a given target performance. In addition, the a priori knowledge of the gap positions can be exploited to dynamically adapt the bit rate allocation and the prediction order to further improve the performance. Simulations of the proposed dynamic predictive block-adaptive quantization (DP-BAQ) are carried out considering a Tandem-L-like staggered SAR system for different orders of prediction and target scenarios, demonstrating that a significant data reduction can be achieved with a modest increase of the system complexity.
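The general idea of quantizing and transmitting a prediction error instead of the sample itself can be sketched as a generic closed-loop predictive coder. This is not DP-BAQ: the fixed predictor coefficients, the uniform quantizer without clipping, the real-valued uniformly sampled input, and the names `dpcm_encode` and `dpcm_decode` are all assumptions made for illustration.

```python
def dpcm_encode(samples, coeffs, step):
    """Closed-loop predictive coding with a uniform quantizer.

    coeffs: non-empty list of predictor weights, most recent sample first.
    Returns (indices, recon): quantizer indices and the encoder-side
    reconstruction, which the decoder reproduces from the indices alone.
    """
    history = [0.0] * len(coeffs)  # past reconstructed samples
    indices, recon = [], []
    for x in samples:
        pred = sum(c * h for c, h in zip(coeffs, history))
        q = round((x - pred) / step)  # quantize the prediction error
        r = pred + q * step           # value the decoder will also obtain
        indices.append(q)
        recon.append(r)
        history = [r] + history[:-1]
    return indices, recon

def dpcm_decode(indices, coeffs, step):
    """Rebuild the signal from quantized prediction errors."""
    history = [0.0] * len(coeffs)
    out = []
    for q in indices:
        pred = sum(c * h for c, h in zip(coeffs, history))
        r = pred + q * step
        out.append(r)
        history = [r] + history[:-1]
    return out
```

Predicting from reconstructed rather than original samples keeps the encoder and decoder in step, so the reconstruction error of each sample stays within half a quantization step and does not accumulate over time.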