ISBN: 9781424442966 (print)
In this paper, we describe a new approach to cope with packet loss in speech coders. The idea is to split the information present in each speech packet into two components, one to independently decode the given speech frame and one to enhance it by exploiting interframe dependencies. The scheme is based on sparse linear prediction and a redefinition of the analysis-by-synthesis process. We report Mean Opinion Scores for the proposed coder under different degrees of packet loss and show that it performs similarly to frame-dependent coders at low packet loss probability and similarly to frame-independent coders at high packet loss probability. We also present ideas on how to make the coder work synergistically with the channel loss estimate.
ISBN: 0780379799 (print)
This paper presents our work on continuous speech recognition in a telephone-number voice-dialing task realized by statistical modeling. The speech is parameterized using computationally inexpensive linear predictive coding (LPC) to determine the LPC coefficients, the LPC cepstral coefficients, and the reflection coefficients. In our tests, the recognizer based on Hidden Markov Models (HMMs) performs better with the LPC cepstral coefficients than with the reflection coefficients or the LPC coefficients.
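The three feature sets named in this abstract can all be derived from one autocorrelation analysis. The following is a minimal sketch, not the authors' implementation: it assumes the autocorrelation method with the Levinson-Durbin recursion and the sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p, and the function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, k, err): a[0..order] with a[0] = 1 (LPC polynomial),
    k[0..order-1] (reflection coefficients), and the final prediction
    error energy. Assumes r[0] > 0.
    """
    a = [1.0] + [0.0] * order
    k = [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k[i - 1] = ki
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + ki * a[i - j]
        new_a[i] = ki
        a = new_a
        err *= 1.0 - ki * ki
    return a, k, err


def lpc_to_cepstrum(a, n_ceps):
    """LPC cepstrum c[1..n_ceps] of the all-pole model 1/A(z)."""
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(m * c[m] * a[n - m] for m in range(1, n) if n - m <= p)
        a_n = a[n] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]


# Autocorrelation of an ideal first-order process with pole at 0.9.
a, k, err = levinson_durbin([1.0, 0.9, 0.81], order=2)
cepstrum = lpc_to_cepstrum(a, n_ceps=3)
```

For this toy input the second-order terms vanish, so the recursion recovers the single pole; real speech frames would normally be windowed first and analysed at a higher order.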
ISBN: 9781467380324 (print)
A large part of recent research on speech coding algorithms is motivated by the need for secure military communications that allow effective operation in a hostile environment. Since channel bandwidth is a sensitive issue in military applications, low-bit-rate speech compression methods are mostly used. Several speech processing applications, such as Mixed Excitation Linear Prediction (MELP), are characterized by very strict requirements on power consumption, size, and supply voltage. These requirements are difficult to fulfill given the complexity and number of functions to be implemented, together with the real-time requirement and the large dynamic range of the input signals. To meet these constraints, careful optimization is needed at all levels, from the algorithmic level, through system and circuit architecture, to layout and design of the cell library. The key points of this optimization include the choice of algorithms, their modification to reduce computational complexity, the choice of a fixed-point arithmetic unit, the minimization of the number of bits required at every node of the algorithm, and a careful match between algorithms and architecture. This paper concentrates on low-bit-rate speech coding technology, mainly MELP, and addresses the optimization of a MELP program on a Digital Signal Processor (DSP) platform. The algorithm was ported to a fixed-point DSP, the Blackfin 537, and stage-by-stage optimization was performed to meet the real-time requirements. The main functions involved were analysis, parameter encoding, parameter decoding, and synthesis. The fixed-point source code of the MELP front end was also thoroughly optimized at the C level. Memory optimization techniques such as data placement and caching were also used to reduce the processing time. The results we obtained show that real-time implementations of a speech vocoder based on the MELP standard for low bit rate commu
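The fixed-point arithmetic mentioned in this abstract can be illustrated with a small sketch. The abstract does not state the number format used on the Blackfin 537, so the 16-bit Q15 format below is an assumption, and the helper names are hypothetical. The sketch shows saturation and rounding, the two details that usually matter when floating-point code is ported to a fixed-point DSP.

```python
Q15_MAX = 32767    # largest 16-bit signed value, about +0.99997 in Q15
Q15_MIN = -32768   # smallest 16-bit signed value, exactly -1.0 in Q15


def saturate16(value):
    """Clamp an integer to the 16-bit signed range instead of wrapping."""
    return max(Q15_MIN, min(Q15_MAX, value))


def float_to_q15(x):
    """Quantize a float in [-1, 1) to Q15 (assumed format, with saturation)."""
    return saturate16(int(round(x * 32768.0)))


def q15_to_float(q):
    """Convert a Q15 integer back to a float."""
    return q / 32768.0


def q15_mul(a, b):
    """Multiply two Q15 values with rounding and saturation."""
    product = a * b                      # Q30 intermediate result
    return saturate16((product + (1 << 14)) >> 15)


half = float_to_q15(0.5)
quarter = q15_mul(half, half)            # 0.5 * 0.5 in Q15
```

The only product that overflows is (-1.0) * (-1.0), which saturates to the largest positive value rather than wrapping around to a negative one.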
ISBN: 9781467370165 (print)
The speech signal is a unique and special signal in communication systems, so it must be analyzed to extract its important parameters and compressed to make maximum use of the available bandwidth. Various speech analysis and synthesis techniques have been used effectively for this purpose. Among them, linear predictive coding (LPC) is the most powerful for representing the speech signal at reduced bit rates while preserving signal quality; it also provides accurate estimates of speech parameters and is computationally efficient. Voice-excited LPC is the technique proposed in this paper. It has been implemented using both male and female voices, and the trade-offs between bit rate, delay, power signal-to-noise ratio, and complexity are analyzed. The technique results in a low bit rate and a better signal-to-noise ratio.
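The analysis and synthesis steps behind an LPC coder can be sketched as a pair of inverse filters. This is a simplified illustration, not the paper's coder: it assumes the convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p, the function names are hypothetical, and the residual is passed to the decoder unquantized and unfiltered. Voice-excited schemes commonly transmit only a reduced version of this excitation.

```python
def lpc_residual(x, a):
    """Analysis filter A(z): e[n] = x[n] + sum_k a[k] * x[n - k]."""
    p = len(a) - 1
    e = []
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, p + 1):
            if n - k >= 0:
                acc += a[k] * x[n - k]
        e.append(acc)
    return e


def lpc_synthesize(e, a):
    """Synthesis filter 1/A(z): y[n] = e[n] - sum_k a[k] * y[n - k]."""
    p = len(a) - 1
    y = []
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, p + 1):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


# Round trip on a toy frame with illustrative (made-up) coefficients.
frame = [1.0, 0.5, -0.25, 0.75, 0.0]
coeffs = [1.0, -0.9, 0.2]
excitation = lpc_residual(frame, coeffs)
decoded = lpc_synthesize(excitation, coeffs)
```

With an unmodified excitation the decoder reproduces the frame up to rounding error. The bit-rate savings come from how coarsely the excitation and coefficients are represented, which this sketch does not model.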
ISBN: 9781479968961 (print)
In this paper, a biologically motivated approach to English alphabet speech recognition is implemented using a self-organized neural network. Designing an accurate and effective speech recognition system is a challenging task in the area of human-computer interfaces. Linear predictive coding (LPC) is used for feature extraction from the input audio signals. The back propagation (BP) network is a feed-forward neural network that propagates the error backward to update the weights of the hidden layers. The error, the difference between the actual output and the target output, is minimized using the gradient descent method. The performance of the system is evaluated on the basis of recognition rate. We have used the BP neural network architecture to recognize the time-varying input data. The proposed system provides more accurate results than existing systems for English alphabet speech recognition.
ISBN: 9781728172064 (print)
Processing human speech with digital technologies leads to several important fields of research. Speech-to-text and lip-syncing are prominent examples. Applications include audio-visualization of acoustic signals, real-time visual aids for people with disabilities, and text-free animation, to name a few. In this study, a language-independent lip-sync method based on extended linear predictive coding is proposed. The proposed method operates on the baseband electrical signal acquired by a standard single-channel off-the-shelf microphone and exploits the statistical characteristics of acoustic signals produced by human speech. In addition, the proposed method is implemented on an embedded system and tested, and its performance is evaluated. Results are given along with discussions and future directions.
Experiments in coding broadband audio using multipulse linear predictive coding (LPC) are described. It is possible to obtain stable LPC filters that model sinusoids closely and to include perceptual masking in these coders. The quantization of both the LPC and multipulse parameters is also examined, and it is found that multipulse can compensate for quantization error in LPC filters. With appropriate perceptual masking, these coders can provide high-quality audio output. At 128 kb/s, the coders achieved typical SNR values of 35-40 dB in simulations.
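The abstract does not say how the SNR figures were computed. A minimal sketch of the conventional definition, the ratio of reference-signal energy to coding-error energy in decibels, is given below; the function name is hypothetical and the whole-signal (non-segmental) form is an assumption.

```python
import math


def snr_db(reference, decoded):
    """Whole-signal SNR in dB between a reference and its decoded version."""
    signal = sum(s * s for s in reference)
    noise = sum((s - d) ** 2 for s, d in zip(reference, decoded))
    if noise == 0.0:
        return math.inf       # perfect reconstruction
    if signal == 0.0:
        return -math.inf      # silent reference with a nonzero error
    return 10.0 * math.log10(signal / noise)


# A constant 10% amplitude error gives an energy ratio of about 100.
example = snr_db([1.0, 1.0, 1.0, 1.0], [0.9, 0.9, 0.9, 0.9])
```

Under this definition, 35-40 dB corresponds to an error energy roughly 3,000 to 10,000 times smaller than the signal energy.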
ISBN: 0780364163 (print)
In this paper, an information-theoretic study of the properties of the speech spectrum process is performed. Various techniques for modeling the probability density function are applied to the spectrum source to compute rate-distortion functions. We estimate the difference in the rate required to achieve a given distortion for three scenarios: interframe gain exploitation, low-pass filtering of LPC vectors, and increased speech signal bandwidth. We obtain fairly consistent results across the different methods of calculating rate-distortion functions. The results show that, for close to transparent LPC quantization, we gain 4-6 bits per frame by exploiting first-order interframe correlation. The new idea of using low-pass filtered LPC vectors is shown to decrease the coding cost by 1-3 bits per frame, depending on the cutoff frequency.
ISBN: 9781479945849 (print)
The main purpose of this paper is to perform voice signal processing and synthesis for service robots using the TMS320C6713 DSP. In this study, the program for the DSP is written in C, and the voice signal is reconstructed for playback. The speech synthesis is implemented using the linear predictive coding (LPC) approach. Because LPC synthesis is a time-waveform coding technique, it can reduce the transmission rate of the time-domain signal while still preserving the voice messages. Very high naturalness and clarity are obtained for the synthesis of a group of words. We scored the synthesis results using subjective listening tests.
ISBN: 0780391543 (print)
This paper investigates a normalized cross-correlation-based doubletalk detector for acoustic echo cancellers. A novel low-complexity version is presented that is suitable for implementation on IP-enabled telephones employing LPC-based speech coders. In particular, the algorithm obtains a decorrelated input signal and decorrelation filter coefficients directly from the speech decoder. The proposed algorithm is implemented using ITU G.729, and simulation results are collected for a typical room. Calibration data are obtained and shown to be independent of the speech input signal. It is also shown that the proposed algorithm has the same performance as a full-complexity version employing the same decorrelation filter order.