ISBN (print): 9781424451043
This paper describes a polynomial kernel subspace approach to Isolated Word Recognition (IWR) systems. Linear predictive coding (LPC) coefficients derived from wavelet sub-bands of speech frames are used as features. The approach maps the speech features (input space) into a feature space via a non-linear mapping onto the principal components, a technique called Kernel Linear Discriminant Analysis (KLDA). The non-linear mapping between the input space and the feature space is performed implicitly using the kernel trick, and it increases the discrimination ability of a pattern classifier. The wavelet sub-band based LPC features (WLPC) are low dimensional, which reduces the memory requirement, and KLDA provides fast classification and recognition. Experimental results on an isolated word database show that the proposed technique is computationally efficient and performs well with less training data.
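The kernel trick mentioned above can be illustrated with a minimal sketch. It assumes a degree-2 polynomial kernel with offset `c` and a two-dimensional input; the abstract does not give the kernel degree, the offset, or the feature dimension, so these values and the function names `poly_kernel`, `explicit_map_deg2`, and `gram_matrix` are illustrative only. The point is that evaluating the kernel in the input space gives the same inner product as an explicit non-linear feature map, without ever building that map.

```python
import math

def poly_kernel(x, y, c=1.0, degree=2):
    """Polynomial kernel k(x, y) = (x . y + c) ** degree."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (dot + c) ** degree

def explicit_map_deg2(x, c=1.0):
    """Explicit degree-2 feature map for a 2-D input (illustration only).

    Its inner product reproduces poly_kernel(x, y, c, degree=2).
    """
    x1, x2 = x
    return [
        x1 * x1,
        x2 * x2,
        math.sqrt(2.0) * x1 * x2,
        math.sqrt(2.0 * c) * x1,
        math.sqrt(2.0 * c) * x2,
        c,
    ]

def gram_matrix(samples, c=1.0, degree=2):
    """Kernel (Gram) matrix, the only quantity a kernel method needs."""
    return [[poly_kernel(a, b, c, degree) for b in samples] for a in samples]

if __name__ == "__main__":
    x, y = [1.0, 2.0], [3.0, -1.0]
    implicit = poly_kernel(x, y)
    explicit = sum(p * q for p, q in zip(explicit_map_deg2(x), explicit_map_deg2(y)))
    print(implicit, explicit)
```

A kernel discriminant method such as KLDA would work from `gram_matrix` alone; the explicit map is shown only to make the equivalence visible.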
ISBN (print): 9781424423538
In low bit-rate coders, the near-sample and far-sample redundancies of the speech signal are usually removed by a cascade of a short-term and a long-term linear predictor. These two predictors are usually estimated sequentially, which is suboptimal. In this paper we propose an analysis model that finds the two predictors jointly by adding a regularization term to the minimization process, imposing sparsity constraints on a high-order predictor. The result is a linear predictor that can be easily factorized into the short-term and long-term predictors. This estimation method is then incorporated into an Algebraic Code Excited Linear Prediction scheme and is shown to outperform traditional cascade methods and other joint optimization methods, offering lower distortion and higher perceptual speech quality.
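The reason a sparse high-order predictor factorizes can be sketched as follows. The notation is assumed for illustration and is not taken from the paper: a short-term predictor of order $N_s$ with coefficients $a_k$, a single-tap long-term predictor with gain $g$ and pitch lag $T_p$, and an $\ell_2$ data term with an $\ell_1$ penalty weighted by $\gamma$. The abstract does not state which norms or how many pitch taps are used.

```latex
% Cascade of short-term and long-term prediction error filters (assumed forms)
A(z) = 1 - \sum_{k=1}^{N_s} a_k z^{-k}, \qquad P(z) = 1 - g\, z^{-T_p}

% Their product is a filter of order N_s + T_p with only a few nonzero coefficients
A(z)\,P(z) = 1 - \sum_{k=1}^{N_s} a_k z^{-k} - g\, z^{-T_p}
             + g \sum_{k=1}^{N_s} a_k z^{-(k + T_p)}

% One possible regularized estimate of the high-order coefficient vector \alpha
\hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}}
  \; \lVert \mathbf{x} - \mathbf{X}\boldsymbol{\alpha} \rVert_2^2
  + \gamma \lVert \boldsymbol{\alpha} \rVert_1
```

Here $\mathbf{x}$ is the vector of speech samples in the analysis frame and $\mathbf{X}$ holds the corresponding past samples. Because the product $A(z)P(z)$ has nonzero coefficients only near lag 0 and near lag $T_p$, a sparsity-promoting penalty favours solutions with this structure.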
Author: Merouane, Bouzid (USTHB, Elect Fac, Speech Commun & Signal Proc Lab, Algiers 16111, Algeria)
ISBN (print): 9781424444564
In this paper, an optimized trellis coded vector quantization (OTCVQ) system designed for efficient and robust coding of line spectral frequency (LSF) parameters is presented. The aim of this system, initially called the "LSF-OTCVQ Encoder", is to achieve low bit rate transparent quantization of the FS1016 LSF parameters. Once the effectiveness of the LSF-OTCVQ encoder had been demonstrated for ideal transmission over a noiseless channel, we turned to improving its robustness for real transmission over a noisy channel. To implicitly protect the transmission indices of the LSF-OTCVQ encoder incorporated in the FS1016, we used joint source-channel coding carried out by channel optimized vector quantization.
ISBN (print): 9781424443451
In recent studies the Unscented Kalman Filter (UKF) has been applied to several nonlinear systems, including speech processing problems such as the estimation of formant trajectories, state and parameter Kalman estimation for speech enhancement, and the estimation of Line Spectral Frequency (LSF) trajectories. In this paper we apply the UKF to the estimation of LSF trajectories for both synthetic and real noisy speech. The Expectation Maximization (EM) approach is used to iteratively estimate the LSF parameters. Furthermore, the square-root implementation of the UKF is used because it provides numerical stability and guarantees positive semi-definiteness of the state covariance.
Despite the great interest in long-term recordings of electromyographic (EMG) signals, which find applications in, for example, telemedicine, only a few studies have dealt with the compression of these signals. We propose a lossy coding technique for surface EMG signals. The technique is based on the linear predictive coding paradigm widely used for speech compression. The algorithm was tested on both simulated and experimental signals. Mean frequency, median frequency, variance, skewness and kurtosis of the EMG signals were preserved with an error of less than 3% with respect to the original values for both synthetic and experimental signals, while the bit rate was reduced from 24 kbit/s (12 kbit/s after downsampling) to 352 bit/s, a compression factor of 97.1%. It was concluded that the linear predictive coding paradigm can be used effectively for high-rate compression of surface EMG signals when only the power spectrum of the signal needs to be preserved. This has applications in ergonomics and occupational medicine.
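The quoted compression factor can be checked with simple arithmetic. The sketch below assumes the factor is defined as the relative bit-rate reduction, 1 minus the ratio of coded to original rate; the abstract does not spell out the definition, and the function name is illustrative. Under that assumption, 97.1% corresponds to the 12 kbit/s downsampled rate, not the 24 kbit/s raw rate.

```python
def bitrate_reduction_percent(original_bps, coded_bps):
    """Relative bit-rate reduction in percent (assumed definition)."""
    return 100.0 * (1.0 - coded_bps / original_bps)

if __name__ == "__main__":
    # Rates taken from the abstract: 24 kbit/s raw, 12 kbit/s downsampled, 352 bit/s coded.
    print(round(bitrate_reduction_percent(12000, 352), 1))
    print(round(bitrate_reduction_percent(24000, 352), 1))
```

Relative to the raw 24 kbit/s signal, the same definition gives a larger reduction, so the reference rate matters when comparing against other coders.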
ISBN (print): 9781424445424
To investigate the neural efficiency theory of intelligence, electroencephalograms (EEG) were recorded while 15 intellectually gifted children and 15 average children performed a 2-back working memory task. The amplitudes of the P2, N2, and LPC components were analyzed. The results showed that intellectually gifted children performed more accurately and had larger LPC mean amplitudes than their intellectually average peers under the matching condition, suggesting that intellectually gifted individuals can use their brains and allocate cognitive resources more efficiently.
It is important to obtain effective feature values of a data stream and to forecast them when the system is overloaded, because data streams are often bursty and their characteristics vary over time. In this paper, we introduce linear predictive coding (LPC) to obtain feature values using fewer coefficients. A generalized autoregressive conditional heteroscedastic generalized regression neural network (GARCH-GRNN) model is used to forecast the feature values of the data streams that are shed, and we perform similarity search using these forecast values. A load shedding framework based on LPC and GARCH-GRNN (LS-LG) for similarity search on data streams is constructed to minimize mining loss. Experimental results indicate that LS-LG is effective in improving query quality when the system is overloaded.
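As a rough illustration of how LPC summarizes a window of samples with a few coefficients, the sketch below uses the textbook autocorrelation method with the Levinson-Durbin recursion. This is an assumption: the abstract does not say which LPC estimation method, model order, or window length is used, and the function names are illustrative.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a finite window."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations sum_k a[k] * r[|i - k|] = r[i].

    Returns (a, err) where a[k - 1] weights x[n - k] in the prediction
    x_hat[n] = sum_k a[k - 1] * x[n - k], and err is the residual energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err

def lpc_features(window, order):
    """LPC coefficients of one window, usable as a compact feature vector."""
    coeffs, _ = levinson_durbin(autocorr(window, order), order)
    return coeffs

if __name__ == "__main__":
    # A geometrically decaying window is close to a first-order process.
    window = [0.9 ** n for n in range(200)]
    print(lpc_features(window, 1))
```

In a load-shedding setting, each window of the stream would be reduced to `order` numbers this way, and forecasting or similarity search would then operate on those coefficients in place of the raw samples.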
Speech analysis and synthesis is an important technology in audio processing. In this paper, a modified speech analysis and synthesis system is proposed. It adopts a richer excitation signal that includes a smooth part and a compensation impulse. The smooth part is the envelope curve of unvoiced sounds, and the compensation impulse is represented by the peaks within the pitch period. Compared with a single impulse repeated at the pitch period and with pure white noise, these signals reflect the tone changes and characteristics within a period of the original speech more accurately. They also make it possible for the modified system to synthesize high-quality speech using a lower-order filter. The experimental results indicate that the modified system improves both speech quality and intelligibility without much increase in code rate. For most speech, it recovers the signal well.
An efficient Immittance Spectral Frequency (ISF) parameter quantization algorithm based on the Gaussian mixture model (GMM) is proposed. The basic idea is to use the GMM to assign the ISF parameters to M Gaussian clusters. The ISF parameters are quantized by the Gaussian lattice vector quantizer corresponding to each cluster, and the quantized value with the minimum spectral distortion among the M candidates is finally selected. For the design of the Gaussian lattice vector quantizer, an optimal bit allocation algorithm based on rate-distortion theory is proposed. The results show that the ISF parameters can be transparently quantized at 42 bit/frame, which saves 3 bits and reduces storage by 58% compared with the Split Multi-Stage Vector Quantization (S-MSVQ) algorithm of AMR-WB (G.722.2).