The transmission of speech in mobile or packet networks requires the use of a speech codec. To improve the quality of speech in a noisy environment, a noise reduction algorithm is used. This noise reduction can be done either as pre-processing before speech encoding or in the network, by decoding the bitstream, performing the speech enhancement in the time and/or frequency domain, and re-encoding the speech. Both methods are computationally expensive. This paper discusses a new approach that reduces environmental background noise by modifying the codec parameters.
In this paper, we propose using mel-frequency cepstral coefficients (MFCCs) and a simple but efficient classification method, vector quantization (VQ), to perform speaker-dependent emotion recognition. Many other features (energy, pitch, zero crossing, phonetic rate, LPC...) and their derivatives are also tested and combined with the MFCCs in order to find the best combination. Other models, GMM and HMM (discrete and continuous hidden Markov models), are studied as well, in the hope that continuous distributions and the temporal evolution of this feature set will improve the quality of emotion recognition. The accuracy in recognizing five different emotions exceeds 80% using only MFCCs with the VQ model. This simple but efficient approach gives a result that is much better than those obtained on the same database in human evaluations, in which listeners judged sentences without being allowed to go back to them or to compare them with one another (Inger Samso Engberg and Anya Varnich Hansen, 2001).
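As a rough illustration of VQ-based classification, the sketch below trains one small codebook per emotion with plain k-means and labels an utterance by the codebook that gives the lowest average distortion. The abstract does not describe the training procedure, so the k-means initialisation, the Euclidean distance, the function names, and the 2-D toy vectors standing in for MFCC frames are all assumptions made for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def train_codebook(vectors, k, iters=20):
    """Plain k-means codebook, initialised from evenly spaced training vectors.

    Assumes len(vectors) >= k.
    """
    step = max(1, len(vectors) // k)
    codebook = [list(vectors[i * step]) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: dist(v, codebook[j]))
            clusters[nearest].append(v)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    return sum(min(dist(f, c) for c in codebook) for f in frames) / len(frames)


def classify(frames, codebooks):
    """Return the label whose codebook quantizes the frames with least distortion."""
    return min(codebooks, key=lambda label: distortion(frames, codebooks[label]))


# Toy 2-D "feature frames" standing in for real MFCC vectors (hypothetical data).
training = {
    "neutral": [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9)],
    "angry": [(5.0, 5.0), (5.2, 4.9), (6.0, 6.1), (6.1, 5.9)],
}
codebooks = {label: train_codebook(v, k=2) for label, v in training.items()}
label_a = classify([(5.1, 5.0), (6.0, 6.0)], codebooks)
label_b = classify([(0.1, 0.0), (1.0, 0.9)], codebooks)
```

Because each class is reduced to a handful of codewords, classification needs only nearest-neighbour searches per frame, which is what keeps this kind of model simple.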
Two extension tools for enhancing the compression performance of prediction-based lossless audio coding are proposed. One is progressive-order prediction of the starting samples at the random access points, where the information of previous samples is not available. The first sample is coded as is, the second is predicted by first-order prediction, the third is predicted by second-order prediction, and so on. This can be efficiently carried out with PAR-COR (PARtial autoCORrelation) coefficients. The second tool is interchannel joint coding. Both predictive coefficients and prediction error signals are efficiently coded by interchannel differential or three-tap adaptive prediction. These new prediction tools lead to a steady reduction in bit rate when random access is activated and the interchannel correlation is strong.
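The progressive-order idea can be sketched as follows. Because PARCOR coefficients are order-recursive, the predictor of every order from 0 to P can be derived from the same coefficient set with the step-up recursion, so sample n after a random access point is predicted with order min(n, P). This is a floating-point illustration only: the sign convention, the rounding, and the function names are assumptions, and a real lossless codec would use quantized integer arithmetic.

```python
def step_up(parcor):
    """Predictor coefficients for orders 0..P from PARCOR values k_1..k_P.

    Convention assumed here: prediction = sum(c[i] * x[n - 1 - i]).
    """
    orders = [[]]  # order 0: no prediction
    for m, k in enumerate(parcor, start=1):
        prev = orders[-1]
        cur = [prev[i] - k * prev[m - 2 - i] for i in range(m - 1)] + [k]
        orders.append(cur)
    return orders


def _predict(history, coeffs):
    """Rounded prediction of the next sample from the samples decoded so far."""
    n = len(history)
    return round(sum(c * history[n - 1 - i] for i, c in enumerate(coeffs)))


def progressive_residuals(x, parcor):
    """Encode: sample 0 as is, sample 1 with order 1, sample 2 with order 2, ..."""
    orders = step_up(parcor)
    p = len(parcor)
    return [x[n] - _predict(x[:n], orders[min(n, p)]) for n in range(len(x))]


def reconstruct(residuals, parcor):
    """Decode by running the same predictors on already-reconstructed samples."""
    orders = step_up(parcor)
    p = len(parcor)
    x = []
    for n, e in enumerate(residuals):
        x.append(e + _predict(x, orders[min(n, p)]))
    return x


samples = [100, 98, 95, 91, 86, 80]   # hypothetical integer audio samples
parcor = [0.9, -0.5]                  # hypothetical PARCOR coefficients
residuals = progressive_residuals(samples, parcor)
decoded = reconstruct(residuals, parcor)
```

The encoder and decoder evaluate identical predictions from identical past samples, so the round trip is exact even though the predictor itself uses floating point.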
Lossless coding is to become the latest extension of the MPEG-4 audio standard. In response to a call for proposals, many companies have submitted lossless audio codecs for evaluation. The codec of the Technical University of Berlin was chosen as the reference model for MPEG-4 audio lossless coding (ALS), attaining working draft status in July 2003. The encoder is based on linear prediction, which enables high compression even with moderate complexity, while the corresponding decoder is straightforward. The paper describes the basic elements of the codec, points out envisaged applications, and gives an outline of the standardization process.
This paper presents the basic technologies involved in designing a recognition system for isolated Arabic words based on statistical modeling with continuous hidden Markov models (HMMs). A comparative study examines the analysis methods used in the parameterization phase and the effect of the number of Gaussians and states in the training phase. The best recognition rate is obtained with the first-order differential parameters (MFCC).
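First-order differential (delta) parameters are commonly computed with a short regression over neighbouring frames. The abstract does not say which formula was used, so the sketch below assumes the widely used regression form with a window of N frames on each side and edge frames repeated at the boundaries; the function name and toy data are hypothetical.

```python
def delta(frames, N=2):
    """First-order differential parameters for a list of feature frames.

    d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2),
    with the first and last frames repeated at the edges.
    """
    T = len(frames)
    dims = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, N + 1))
    out = []
    for t in range(T):
        row = []
        for d in range(dims):
            num = sum(
                n * (frames[min(T - 1, t + n)][d] - frames[max(0, t - n)][d])
                for n in range(1, N + 1)
            )
            row.append(num / denom)
        out.append(row)
    return out


# One-dimensional toy "cepstral" track that rises by 1.0 per frame.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
ramp_delta = delta(ramp)
flat_delta = delta([[2.5, -1.0]] * 4)
```

For an interior frame of a linear ramp the result equals the slope, and for a constant track it is zero, which is the behaviour expected of a first derivative estimate.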
The tracking method based on local position color information (LPC) is a subregion model matching approach. First, a new color representation method is proposed to eliminate illumination effects efficiently. The target region is then divided into subregions of similar color, and a histogram is calculated over a certain color range in each subregion, so that the target can be represented by a position-weighted histogram. Finally, the minimum cross entropy method is used as the similarity function between the target model and the target candidates. In the experiments, the target tracking algorithm achieves better results.
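The matching step can be illustrated with a small sketch: build a position-weighted histogram for a region, then pick the candidate whose histogram has the smallest cross entropy with respect to the target model. The abstract gives no formulas, so several things here are assumptions: the weight profile (falling off quadratically with distance from the region center), the use of Kullback-Leibler divergence as the cross-entropy measure, and all function names. The color representation and the subregion split described above are not modeled.

```python
from math import log


def weighted_histogram(pixels, center, radius, bins=4, levels=256):
    """Normalised histogram of (x, y, value) pixels, weighted by position.

    Pixels near `center` count more; pixels at or beyond `radius` count zero.
    """
    hist = [0.0] * bins
    cx, cy = center
    for x, y, value in pixels:
        r2 = ((x - cx) ** 2 + (y - cy) ** 2) / radius ** 2
        hist[value * bins // levels] += max(0.0, 1.0 - r2)
    total = sum(hist)
    if total == 0.0:
        raise ValueError("no pixel falls inside the weighting radius")
    return [h / total for h in hist]


def cross_entropy_distance(p, q, eps=1e-9):
    """Kullback-Leibler divergence D(p || q), smoothed to avoid log(0)."""
    return sum(pi * log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def best_candidate(model, candidates):
    """Index of the candidate histogram closest to the target model."""
    return min(
        range(len(candidates)),
        key=lambda i: cross_entropy_distance(model, candidates[i]),
    )


model = [0.7, 0.2, 0.1, 0.0]                  # hypothetical target model
candidates = [[0.1, 0.1, 0.4, 0.4], [0.6, 0.3, 0.1, 0.0]]
winner = best_candidate(model, candidates)
toy_hist = weighted_histogram([(0, 0, 10), (1, 0, 200)], center=(0, 0), radius=2)
```

In a tracker, `candidates` would be the histograms of regions around the predicted target position in the next frame, and the winning index would give the new target location.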
Radiotherapy treatments are becoming increasingly accurate, using techniques like IMRT. Their irradiation fields and dose depositions are small and complex, and only a few dosimeters are available for real-time and in vivo control for photon as well as electron beams. In this context, a new scintillating fiber dosimeter (SFD) has been developed by the "Laboratoire de Physique Corpusculaire de Caen", LPC Caen (France), in collaboration with one of the French regional centers for cancer treatment, "Centre regional de lutte contre cancer ***" CRLCC F. Baclesse in Caen, and the ELDIM Company in Herouville (France). This plastic dosimeter is water equivalent, and it is suitable for photon as well as electron beams without correction. It is a real-time dosimeter with an excellent signal-to-noise ratio and a spatial resolution of a few millimeters. Recently, a new light collection device has been developed to improve the spatial resolution to 1 mm without loss in the signal-to-noise ratio. The accuracy of this improved prototype has been tested by comparison with standard ionization chambers, and the difference between the two devices never exceeded one percent for photon or electron irradiation beams. A first set of commercial SFDs is nearing completion at ELDIM and will soon be clinically tested in several French centers for cancer treatment.
The performance of LPC-based algorithms deteriorates significantly in the presence of background noise. The present study proposes a new approach based on the orthogonal least squares (OLS) algorithm with structure selection to obtain unbiased LPC parameters from noisy speech samples. Instead of fitting a fixed-order model to all segments of speech, the algorithm selects the best possible model order for a given speech segment using an error reduction ratio (ERR) test. A noise model is appended to the conventional LPC model to make the LPC parameters unbiased. The proposed algorithm gives superior performance compared to the commonly used LPC-based algorithm under high levels of noise.
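For reference, the conventional fixed-order LPC baseline is usually computed with the autocorrelation method and the Levinson-Durbin recursion, sketched below. This is the standard baseline only, not the OLS/ERR method proposed in the abstract; the function names and the sign convention (prediction = sum of a[i] * x[n - 1 - i]) are choices made for this example.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a (windowed) speech segment."""
    return [
        sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; requires r[0] > 0.

    Returns (a, err): predictor coefficients a_1..a_order and the final
    prediction error energy.
    """
    a = []
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / err  # reflection (PARCOR) coefficient of order m
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of an ideal first-order process with correlation 0.5.
coeffs, energy = levinson_durbin([1.0, 0.5, 0.25], order=2)
lags = autocorr([1, 2, 3], max_lag=1)
```

In the toy case the second coefficient comes out as zero, which shows the situation the abstract targets: a fixed order of 2 is more than this segment needs, and an order-selection test would stop at order 1.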
Detection of visual evoked potentials (VEPs) elicited by repetitive stimuli is valuable in both laboratory research and clinical practice. Knowing the characteristics of VEPs is therefore of fundamental importance for the adequate design of a signal detector. Usually, the signal is modeled as a steady-state VEP (ssVEP) consisting of the fundamental frequency and the higher harmonics, while the information contained in its transients (tVEP) is ignored. We propose here to characterize both tVEP and ssVEP by a chirplet time-frequency representation of the VEP signal using a matching pursuit (MP) algorithm. Compared to time-frequency analysis with the short-time Fourier transform (STFT) and linear predictive coding (LPC) methods, MP with chirplets shows not only clear characteristics of the ssVEP but also a clear spindle-like time-frequency component of the tVEP, which is not obvious with the other two methods.
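Matching pursuit itself is a simple greedy loop, sketched below with a tiny dictionary of real Gaussian chirplets. The atom parameterisation (fixed phase, time in samples, frequency in cycles per sample), the dictionary grid, and the function names are assumptions made for this illustration; a practical chirplet decomposition would search a much larger, usually complex-valued dictionary.

```python
from math import cos, exp, pi, sqrt


def chirplet(n, center, scale, freq, chirp):
    """Unit-norm real Gaussian chirplet sampled at t = 0..n-1."""
    atom = []
    for t in range(n):
        d = t - center
        envelope = exp(-0.5 * (d / scale) ** 2)
        atom.append(envelope * cos(2 * pi * (freq * d + 0.5 * chirp * d * d)))
    norm = sqrt(sum(a * a for a in atom))
    return [a / norm for a in atom]


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy decomposition: repeatedly remove the best-correlated atom."""
    residual = list(signal)
    picks = []
    for _ in range(n_atoms):
        scores = [sum(r * g for r, g in zip(residual, atom)) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(scores[i]))
        coef = scores[best]
        residual = [r - coef * g for r, g in zip(residual, dictionary[best])]
        picks.append((best, coef))
    return picks, residual


# Small hypothetical dictionary: 3 frequencies x 2 chirp rates, same envelope.
dictionary = [
    chirplet(64, center=32, scale=8, freq=f, chirp=c)
    for f in (0.05, 0.1, 0.2)
    for c in (0.0, 0.002)
]
signal = [3.0 * g for g in dictionary[3]]      # scaled copy of one atom
picks, residual = matching_pursuit(signal, dictionary, n_atoms=1)
```

Each selected atom carries its own center time, frequency, and chirp rate, so the list of picks is already a sparse time-frequency description of the signal.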
In this paper, we study the issue of high sampling rate audio modeling for lossless audio coding. We propose a cascade LMS structure to successfully model all high sampling rate audio signals. This cascade structure predictor not only performs better than its counterpart, the FTR linear prediction coding (LPC) technique, in modeling general audio signals, but also displays faster convergence and a smaller mean square error (MSE) than the conventional LMS predictor and the cascade LMS predictor with low-order stages, while the complexity of the proposed predictor remains low. The simulation results show that the proposed structure achieves a better prediction gain than Monkey's Audio codec and the MPEG-4 ALS codec provided by the Technical University of Berlin (TUB) on a test set of real high sampling rate audio. Other adaptation algorithms can be used for the individual stages.
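A generic cascade of adaptive predictors can be sketched as below: each stage predicts its input from that input's own past, and its prediction error becomes the input of the next stage. The abstract does not give the stage orders, step sizes, or update rule, so the normalized LMS update, the orders (8, 4, 2), and the class and function names are assumptions. A lossless codec would additionally round the prediction so the decoder can mirror it exactly, which is omitted here.

```python
from math import pi, sin


class NLMSStage:
    """One adaptive linear predictor updated with normalized LMS."""

    def __init__(self, order, mu=0.5, eps=1e-6):
        self.w = [0.0] * order
        self.hist = [0.0] * order  # most recent input first
        self.mu = mu
        self.eps = eps

    def step(self, sample):
        """Predict `sample` from past inputs, adapt, and return the error."""
        pred = sum(w * h for w, h in zip(self.w, self.hist))
        err = sample - pred
        norm = sum(h * h for h in self.hist) + self.eps
        gain = self.mu * err / norm
        self.w = [w + gain * h for w, h in zip(self.w, self.hist)]
        self.hist = [sample] + self.hist[:-1]
        return err


def cascade_residual(signal, orders=(8, 4, 2)):
    """Pass the signal through the stages; each stage sees the previous error."""
    stages = [NLMSStage(p) for p in orders]
    out = []
    for x in signal:
        e = x
        for stage in stages:
            e = stage.step(e)
        out.append(e)
    return out


# Toy input: a pure tone with a period of 20 samples.
tone = [1000.0 * sin(2 * pi * n / 20) for n in range(400)]
residual = cascade_residual(tone)
tail_signal_energy = sum(v * v for v in tone[-100:])
tail_residual_energy = sum(v * v for v in residual[-100:])
```

Swapping `NLMSStage` for a class with the same `step` interface is all that is needed to try a different adaptation algorithm in any single stage.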