Author:
Härmä, A., Aalto Univ, Lab Acoust & Audio Signal Proc, Espoo 02015, Finland
In conventional one-step forward linear prediction, an estimate of the current sample value is formed as a linear combination of previous sample values. In this paper, a generalized form of this scheme is studied. Here, the prediction is based not simply on the previous sample values but on the signal history as seen through an arbitrary filterbank. The paper shows how the coefficients of the modified model can be obtained and how the inverse and synthesis filters can be implemented. Various properties of such systems are derived. As an example, a novel linear predictive system using an inherently logarithmic frequency representation is introduced.
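For reference, the conventional scheme that this abstract generalizes can be sketched as follows. This is a minimal sketch of standard autocorrelation-method linear prediction solved with the Levinson-Durbin recursion, not the filterbank-based method of the paper. The function names `autocorrelation`, `levinson_durbin`, and `predict` are illustrative, and the test signal is an assumption chosen only because one tap predicts it almost exactly.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of a finite sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    The predictor is x_hat[n] = sum_k a[k] * x[n - k]. Returns the
    coefficient list (a[1] first) and the final prediction error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def predict(x, a, n):
    """One-step forward prediction of x[n] from the previous len(a) samples."""
    return sum(a[k] * x[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)


# Illustrative signal (assumption): a decaying exponential.
signal = [0.9 ** n for n in range(200)]
coeffs, err = levinson_durbin(autocorrelation(signal, 1), 1)
```

In the generalized scheme described above, the delayed samples `x[n - k]` would be replaced by the outputs of an arbitrary filterbank; that substitution is not shown here.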
We describe an algorithm that reduces the time complexity of the line spectral frequencies (LSF) search on the unit circle for low bit rate speech codecs. The algorithm exploits the strong interframe correlation exhibited by LSFs. The fixed-point C code of ITU-T Recommendation G.723.1, which uses the "real root algorithm", was modified, and the results were verified on an ARM-7TDMI general-purpose RISC processor. The algorithm works for all test vectors provided by the ITU-T as well as for real speech. The average reduction in search computation time was found to be approximately 20%.
The direct use of vector quantization (VQ) to encode LPC parameters in a communication system suffers from the following two limitations: 1) complexity of implementation for large vector dimensions and codebook sizes and 2) sensitivity to errors in the received indices due to noise in the communication channel. In the past, these issues have been simultaneously addressed by designing channel matched multistage vector quantizers (CM-MSVQ). A sub-optimal sequential design procedure has been used to train the codebooks of the CM-MSVQ. In this paper, a novel channel-optimized multistage vector quantization (CO-MSVQ) codec is presented, in which the stage codebooks are jointly designed. The proposed codec uses a source and channel-dependent distortion measure to encode line spectral frequencies derived from segments of a speech signal. Extensive simulation results are provided to demonstrate the consistent reduction in both the mean and the variance of the spectral distortion obtained using the proposed codec relative to the conventional sequentially designed CM-MSVQ. Furthermore, the perceptual quality of the reconstructed speech using the proposed codec was found to be better than that obtained using the sequentially designed CM-MSVQ.
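The idea of a channel-dependent distortion measure can be illustrated with a minimal sketch. It assumes a single-stage quantizer and a binary symmetric channel with bit error probability `eps`; both are simplifying assumptions, and the code does not reproduce the multistage joint design of the paper. All names (`bsc_transition`, `covq_encode`, the toy codebook) are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def bsc_transition(i, j, bits, eps):
    """P(receive index j | send index i) over a binary symmetric channel."""
    flipped = bin(i ^ j).count("1")  # Hamming distance between the indices
    return (eps ** flipped) * ((1.0 - eps) ** (bits - flipped))


def covq_encode(x, codebook, bits, eps):
    """Pick the index that minimizes the expected distortion at the receiver.

    With eps == 0 this reduces to ordinary nearest-neighbour encoding.
    """
    def expected_distortion(i):
        return sum(
            bsc_transition(i, j, bits, eps) * squared_distance(x, codebook[j])
            for j in range(len(codebook))
        )

    return min(range(len(codebook)), key=expected_distortion)


# Toy 2-bit scalar codebook (assumption, for illustration only).
codebook = [(0.0,), (1.0,), (10.0,), (11.0,)]
clean_index = covq_encode((0.6,), codebook, bits=2, eps=0.0)
noisy_index = covq_encode((0.6,), codebook, bits=2, eps=0.1)
```

In this toy setup the noisy-channel encoder prefers index 0 over the nearest codeword at index 1, because a single bit error on index 1 can land on the distant codeword 11.0 more often than it can for index 0.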
ISBN: (print) 3540232400
The performance of Support Vector Machines (SVMs) is highly dependent on the choice of a kernel function suited to the problem at hand. In particular, the kernel implicitly performs feature selection, which is the most important stage in any texture classification algorithm. In this work, a new Gabor filter based kernel for texture classification with SVMs is proposed. The proposed kernel function is based on a Gabor filter decomposition, applies linear predictive coding (LPC) in each subband, and uses a filter selection method to choose the best filters. The proposed texture classification method is evaluated on several texture samples and compared with recently published methods. The comprehensive evaluation shows a significant improvement in classification error rate.
Geveze software is one of many implementations of text-to-speech synthesis for various languages. The program is based on vocal tract modeling and compresses speech by the LPC method. During synthesis, for each letter of a given word, the nearest combination of the letter sequences within the words used in training is searched and its parameters are used. As in other systems based on vocal tract modeling, a pulse train generates the excitation for voiced sounds, while a noise signal is used for unvoiced sounds. The resulting signal is then scaled by a gain specific to the sound at that instant and finally passed through an IIR filter whose characteristics are determined by the LPC coefficients, yielding the digitized speech waveform. During training, 10 LPC coefficients, 1 gain, and 1 period information bit are obtained for each 25 ms window, with windows separated by 10 ms. During synthesis, these values change every 10 ms to the values of the following window. The digital signal at the output of the IIR filter is converted to analog and must be passed through a low pass filter (LPF) to smooth the transitions between windows. After filtering, the analog signal is ready to be amplified. Our objective is to design this system, which already runs on a computer, as an integrated circuit and, if possible, to obtain a single-chip solution with optimum cost and performance.
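The synthesis chain described above (excitation, gain, all-pole IIR filter, parameters updated every 10 ms) can be sketched roughly as follows. This is an illustrative sketch, not the Geveze implementation. The 8 kHz sampling rate behind `FRAME_LEN`, the convention that a pitch period of 0 marks an unvoiced frame, and all function names are assumptions, and the pulse train is simply restarted at each frame boundary.

```python
import random

FRAME_LEN = 80  # assumption: 10 ms update interval at an assumed 8 kHz rate


def make_excitation(n_samples, pitch_period, rng):
    """Pulse train for voiced frames (pitch_period > 0), noise otherwise."""
    if pitch_period > 0:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def all_pole_filter(excitation, a, gain, history):
    """y[n] = gain * e[n] + sum_k a[k] * y[n-1-k]; history[0] holds y[n-1]."""
    order = len(a)
    history = list(history)
    out = []
    for e in excitation:
        y = gain * e + sum(a[k] * history[k] for k in range(order))
        out.append(y)
        history = ([y] + history)[:order]  # shift the new output in
    return out, history


def synthesize(frames, order=10, seed=0):
    """frames: iterable of (lpc_coeffs, gain, pitch_period), one per 10 ms.

    Each lpc_coeffs list must hold `order` values. The filter state is
    carried across frames so the waveform stays continuous when the
    coefficients change.
    """
    rng = random.Random(seed)
    history = [0.0] * order
    speech = []
    for a, gain, pitch_period in frames:
        e = make_excitation(FRAME_LEN, pitch_period, rng)
        y, history = all_pole_filter(e, a, gain, history)
        speech.extend(y)
    return speech
```

A call such as `synthesize([(coeffs, 0.5, 40), (coeffs, 0.2, 0)])` would produce one voiced and one unvoiced 10 ms frame; the digital-to-analog conversion and low pass filtering mentioned in the abstract fall outside this sketch.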
Our objective is to study collaborative situations in which introducing a new agent into the system increases the performance of the group. This work is part of the road traffic simulation model ARCHISIM, in which a model of driver behavior has already been developed and validated. Our idea is to re-use this structure to define a collaborative behavior. For this purpose, we add a coordination layer to the basic driver agent behavior. The use of a simple reactive coordination strategy called "situated coordination" allows the emergence of a coherent group of agents that coordinate their activities. Experiments carried out to measure the performance of the group each time a new agent is introduced show satisfactory results. We also demonstrate that the agents succeed in coordinating themselves even when the constraints imposed by the simulated environment are very strong.
This paper describes a new technique, called the empirical mode decomposition (EMD), recently pioneered by N. E. Huang et al., for adaptively representing nonstationary signals as sums of zero-mean AM-FM components [N. E. Huang, et al., 1998]. The components, called intrinsic mode functions (IMFs), allow the frequency composition of one-dimensional signals to be analyzed. Applied to a speech signal, the EMD allows us to study the different intrinsic oscillatory modes. In addition, an LPC analysis of each mode provides an estimate of the formants. The presented method is first applied to a sum of pure-frequency signals. Across the different modes, we can detect all the frequencies present in the signal.
Fault diagnosis and monitoring of machine operation in power plants play an important role in the safe operation and maintenance of those machines. In this paper we propose a fault diagnosis algorithm that uses LPC coefficients computed from sound acquired from the operating machines, with which diagnosis from a single LPC spectrum is possible.
In this paper, we propose a method for spoken digit recognition using dynamic programming (DP) matching combined with subspace decomposition, which linearly separates phonetic information from speech data based on principal component analysis (PCA). This method achieves more robust speech recognition with fewer reference speech patterns. Speech recognition that uses the spectral envelope obtained by linear predictive coding (LPC) cannot avoid recognition errors caused by speaker individuality, the dynamic variation of features, and so on. By using the subspace method, the proposed method eliminates these problems and gives good recognition results with fewer standard speech patterns. We use DP matching for recognition because it enables more efficient pattern matching by normalizing the length of vowels. Projection onto the phonetic subspace yields a feature vector that contains less speaker information. Simulation results show that the proposed method, which uses this projection, is superior to the conventional method that combines LPC spectral envelopes with DP matching.
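DP matching of a test utterance against reference patterns can be sketched as below. This is the textbook dynamic time warping recursion with a symmetric step pattern, offered as an illustration only; the paper's exact path constraints and its PCA-based subspace projection are not reproduced, and the names `dp_matching` and `recognize` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dp_matching(ref, test, dist=euclidean):
    """Accumulated distance of the best time alignment between two sequences.

    Each frame may be matched to one or more frames of the other sequence,
    which absorbs differences in duration (for example, stretched vowels).
    """
    inf = float("inf")
    n, m = len(ref), len(test)
    # cost[i][j]: best accumulated distance aligning ref[:i] with test[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(ref[i - 1], test[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # ref frame advances alone
                cost[i][j - 1],      # test frame advances alone
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(test, templates):
    """Return the label of the reference pattern closest to the test sequence."""
    return min(templates, key=lambda label: dp_matching(templates[label], test))
```

In the setting of the abstract, the feature vectors passed to `dp_matching` would be the projections onto the phonetic subspace rather than raw LPC spectral envelopes.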
We report the results of test measurements aimed at determining the performances of ⁶Li doped glass scintillators for the detection of ultra-cold neutrons. Three types of scintillators, GS1, GS10 and GS20, which differ by their ⁶Li concentrations, have been tested. The signal to background separation is fully acceptable. The relative detection efficiencies have been determined as a function of the neutron velocity. We find that GS10 has a higher efficiency than the others for the detection of neutrons with velocities below 6 m/s (i.e. energies smaller than 250 neV). Two pieces of scintillators have been irradiated with a high flux of cold neutrons to test the radiation hardness of the glasses. No reduction in the pulse height has been observed up to an absorbed dose of 10¹³ n/cm³.