ISBN: 9781467370165 (print)
The speech signal is a unique and special signal in communication systems, so it must be analyzed in order to extract its important parameters and to compress it for maximum utilization of the available bandwidth. Various speech analysis and synthesis techniques have been used effectively for this purpose. Among them, linear predictive coding (LPC) is the most powerful for representing the speech signal at reduced bit rates while preserving signal quality; it also provides accurate estimates of speech parameters and is computationally efficient. Voice-excited LPC is the technique proposed in this paper. It has been implemented using both male and female voices, and the trade-offs between bit rate, delay, power signal-to-noise ratio, and complexity are analyzed. The technique yields a low bit rate and a better signal-to-noise ratio.
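As a rough illustration of the LPC analysis step this abstract relies on, the sketch below estimates predictor coefficients for one frame with the autocorrelation method and the Levinson-Durbin recursion. It shows plain LPC analysis only, not the voice-excited variant proposed in the paper; the function names `autocorrelation` and `levinson_durbin` are hypothetical, and windowing and quantization are left out.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err) with a[0] == 1, so that the prediction is
    x_hat[n] = -sum(a[k] * x[n - k] for k in 1..order),
    and err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (for example, all zeros)
        acc = r[i] + sum(a[k] * r[i - k] for k in range(1, i))
        k_i = -acc / err  # i-th reflection coefficient
        prev = a[:]
        for k in range(1, i):
            a[k] = prev[k] + k_i * prev[i - k]
        a[i] = k_i
        err *= 1.0 - k_i * k_i
    return a, err
```

For a frame that decays as `0.9 ** n`, a second-order fit should return `a[1]` close to -0.9 and `a[2]` close to 0, since each sample is 0.9 times the previous one.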
ISBN: 9781479961085 (print)
Speech enhancement refers to improving the intelligibility and/or the quality of a degraded speech signal using signal processing techniques. To date, speech enhancement remains a very difficult problem because the nature and characteristics of the noise in speech signals vary with time and from application to application. Speech enhancement techniques cannot preserve the quality and the intelligibility of a speech signal simultaneously, so a trade-off is generally maintained between the two. Many speech communication applications require speech enhancement, for example VoIP, hands-free communication, hearing aids, answering machines, speech recognition, teleconferencing systems, and car and mobile phones. The main focus of this work is the development of a speech enhancement algorithm that maintains a proper trade-off between quality and intelligibility in the speech signal, which is made possible by using the time and spectral information in the signal. This work also addresses the problem of enhancing the compressed version of the speech signal to improve its intelligibility. Performance measures such as signal-to-noise ratio (SNR), mean opinion score (MOS), pitch, and formants are used to evaluate a speech enhancement algorithm, whose performance varies from application to application.
ISBN: 9781467373869 (print)
In this paper, lung sounds recorded by electronic auscultation were classified as healthy or pathological. Linear predictive coding coefficients and the mean and standard deviation of the Mel-frequency cepstral coefficients were used as features. The records consist of two data sets containing different respiratory cycles. The k-nearest neighbor classification algorithm was used, and the performance obtained with different data sets and different features is discussed in the conclusion.
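The classification pipeline described above can be sketched roughly as follows, assuming the per-frame coefficients (LPC or MFCC) have already been extracted by some other tool. The helper names `summarize` and `knn_predict`, the Euclidean distance, and `k=3` are illustrative assumptions; the abstract does not state the distance measure or the value of k.

```python
import math
import statistics
from collections import Counter


def summarize(frames):
    """Collapse per-frame coefficient vectors into [means..., std devs...]."""
    columns = list(zip(*frames))  # one column per coefficient index
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return means + stds


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors nearest to query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In use, each recording would be reduced to one vector with `summarize`, and a new recording would be labelled by calling `knn_predict` against the vectors of the labelled recordings.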
ISBN: 9781479979615 (print)
Tree-based context clustering reduces the size of the acoustic models of Hidden Markov Model (HMM) speech synthesis systems and eliminates problems arising from unseen sound units. Speech units in speech synthesis systems are often represented by LPC or MCEP features, whose characteristics promote speech reconstruction rather than discrimination among different sound units. In this paper, MFCC features, which have been used successfully in speech recognition, were selected as features for generating context clustering trees applied to LPC/MCEP-based speech synthesis. On average, the collective size of the acoustic models was 29% smaller than in typical cases, while spectral features generated by a speech synthesis system using each type of clustering tree did not deviate significantly from features extracted from actual spoken utterances. Applying MFCC-based clustering trees did not significantly affect the resulting pitch and duration models of the system. We concluded that MFCC-based clustering trees can reduce the overall size of the acoustic models while maintaining synthetic sound quality.
ISBN: 9788132221265; 9788132221258 (print)
This paper addresses the problem of improving the intelligibility of synthesized speech in a Tamil text-to-speech (TTS) synthesis system. Speech synthesis generates human speech artificially, and a TTS system automatically converts normal language text into speech. This paper deals with a corpus-driven Tamil TTS system based on the concatenative synthesis approach. Concatenative speech synthesis concatenates basic units to synthesize intelligible, natural-sounding speech. In this paper, syllables are the basic units of the speech synthesis database, and syllable pitch is modified by timescale modification. The speech units are annotated with the associated prosodic information, manually or automatically, based on an algorithm. The annotated speech corpus uses a clustering technique that provides a way to select the suitable unit for concatenation, depending on the minimum total joint cost of the speech unit. The input text file is analyzed first, syllabification is then performed based on linguistic rules, and the syllables are stored separately. Next, the speech files corresponding to the syllables are concatenated, and the silence present in the concatenated speech is removed. After that, discontinuities at syllable boundaries are minimized without degrading the quality. Smoothing at the concatenated syllable boundaries is performed by changing the syllable pitches through timescale modification.
ISBN: 9781479981175 (print)
This paper describes the development of a reliable gunshot detection system with an emphasis on low power consumption, for use in counter-poaching devices that primarily protect elephants in Africa. The intended system will work as a binary detector of gunfire without further classification of the firearm used. It is crucial that correct gunshot detections dominate over false alarms. The proposed recognition system is based on linear predictive coefficients, correlation against a template, and comparison of spectral energy in sub-bands.
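Of the three cues listed above, correlation against a template is the simplest to illustrate. The sketch below slides a stored template over a signal and reports the best normalized correlation score. It is only an assumed form of that step: the abstract does not say how the correlation is normalized or thresholded, and the names `normalized_correlation` and `best_match` are hypothetical.

```python
import math


def normalized_correlation(segment, template):
    """Mean-removed, energy-normalized correlation in [-1, 1]."""
    n = len(template)
    mean_s = sum(segment) / n
    mean_t = sum(template) / n
    num = sum((s - mean_s) * (t - mean_t) for s, t in zip(segment, template))
    den = math.sqrt(sum((s - mean_s) ** 2 for s in segment)
                    * sum((t - mean_t) ** 2 for t in template))
    return num / den if den > 0.0 else 0.0  # flat segments score 0


def best_match(signal, template):
    """Return (offset, score) of the window that best matches the template."""
    n = len(template)
    scores = [normalized_correlation(signal[i:i + n], template)
              for i in range(len(signal) - n + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

A detector built on this would compare the returned score with a threshold tuned on recorded data. Because of the normalization, a louder or quieter copy of the template scores the same as the template itself.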
ISBN: 9781467373494 (print)
Compressed sensing is a new paradigm that exploits the sparse nature of signals. It allows signals to be acquired at rates fundamentally below those of the uniform-rate digitization followed by compression that is usually used for storage and transmission. Linear predictive coding is the core of most speech coding algorithms. In this paper, a novel method is presented to exploit the sparse nature of the predictive parameters. Compressed sensing is applied to the prediction parameters, and parameter estimation is posed as a 0-norm minimization problem. Alternative representations of the linear predictive parameters, i.e. the reflection coefficients and the line spectral frequencies, are also considered, and the performance is analysed using the spectral distortion measure. The number of reflection coefficients and line spectral frequencies used for encoding can be reduced using compressed sensing.
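The reflection coefficients mentioned above are an equivalent parameterization of the linear predictive coefficients, and the two can be converted into each other with the standard step-up and step-down recursions. The sketch below shows only that conversion, not the compressed-sensing estimation itself; it assumes the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the function names are hypothetical.

```python
def reflection_to_lpc(ks):
    """Step-up recursion: reflection coefficients -> [1, a1, ..., ap]."""
    a = [1.0]
    for k in ks:
        prev = a + [0.0]          # pad to the new order
        i = len(prev) - 1
        a = [prev[j] + k * prev[i - j] for j in range(i + 1)]
    return a


def lpc_to_reflection(a):
    """Step-down recursion: [1, a1, ..., ap] -> reflection coefficients."""
    a = list(a)
    ks = []
    for i in range(len(a) - 1, 0, -1):
        k = a[i]                  # last coefficient at order i
        if abs(k) >= 1.0:
            raise ValueError("predictor is not minimum phase (|k| >= 1)")
        ks.append(k)
        a = [(a[j] - k * a[i - j]) / (1.0 - k * k) for j in range(i)]
    return ks[::-1]
```

For example, the reflection coefficients `[0.5, -0.3]` step up to the predictor `[1, 0.35, -0.3]`, and stepping that predictor down recovers the original pair. Reflection coefficients are often preferred for encoding because stability of the synthesis filter reduces to the check that every magnitude is below 1.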
ISBN: 9780992862633 (print)
Envelope models are common in speech and audio processing: for example, linear prediction is used for modeling the spectral envelope of speech, whereas audio coders use scale factor bands for perceptual masking models. In this work we introduce an envelope model called distribution quantizer (DQ), with the objective of combining the accuracy of linear prediction and the flexibility of scale factor bands. We evaluate the performance of envelope models with respect to their ability to reduce entropy as well as their correlation to the original signal magnitude. The experiments show that in terms of entropy, distribution quantization and linear prediction are comparable, whereas for correlation, distribution quantization is better. Furthermore, the coefficients of distribution quantization are independent and thus more flexible and easier to quantize than linear predictive coefficients.
This paper develops a technique for identifying dynamic loads acting on a structure based on the impulse response of the structure, also referred to as the system Markov parameters, and the structural response measured at optimally placed sensors. Inverse Markov parameters, in which the roles of input and output are reversed, are computed from the forward Markov parameters using a linear prediction algorithm. The applied loads are then reconstructed by convolving the inverse Markov parameters with the system response to the loads, measured at optimal locations on the structure. The structure essentially acts as its own load transducer. The computation of inverse Markov parameters, like most other inverse problems, is ill-conditioned, which makes their convolution with the measured response quite sensitive to errors in system modeling and response measurements. The computation of inverse Markov parameters (and thereby the quality of the load estimates) depends on the locations of the sensors on the structure. To ensure that this computation is well-conditioned, a solution approach based on the construction of D-optimal designs is presented to determine the optimal sensor locations such that precise load estimates are obtained.
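The idea of reversing input and output can be illustrated in the simplest possible setting: a single-input, single-output discrete system whose first Markov parameter h[0] is nonzero. The sketch below is an assumption-laden toy, not the paper's linear prediction algorithm or its multi-sensor formulation. It builds the inverse sequence g from h * g = unit impulse and then convolves g with the measured response to recover the load. The function names are hypothetical.

```python
def causal_convolve(h, x, n_out):
    """First n_out samples of the causal convolution of h and x."""
    out = []
    for n in range(n_out):
        acc = 0.0
        for k in range(n + 1):
            if k < len(h) and n - k < len(x):
                acc += h[k] * x[n - k]
        out.append(acc)
    return out


def inverse_markov(h, n_terms):
    """Sequence g with (h * g)[n] = 1 for n == 0 and 0 for 0 < n < n_terms."""
    if h[0] == 0:
        raise ValueError("h[0] must be nonzero for this simple recursion")
    g = [1.0 / h[0]]
    for n in range(1, n_terms):
        acc = sum(h[k] * g[n - k] for k in range(1, min(n, len(h) - 1) + 1))
        g.append(-acc / h[0])
    return g
```

With noise-free data, `causal_convolve(inverse_markov(h, N), y, N)` returns the first N samples of the load that produced `y`. The division by `h[0]` hints at the ill-conditioning noted in the abstract: a small leading Markov parameter amplifies any measurement error.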
Wireless transmission of a speech signal with a low bit error rate is one of the major challenges in communication systems, particularly in cellular environments. In this paper, a new technique for evaluating the performance of coded speech at low data rates using linear predictive coding (LPC) and quadrature phase shift keying (QPSK) modulation has been realized and simulated. A recorded test speech signal is compressed using LPC and transmitted using QPSK modulation. We compare the performance of a speech signal compressed with LPC with that of one compressed with LPC and modulated using QPSK. It is found that the speech signal with LPC and QPSK modulation outperforms the other technique. The performance has been evaluated using mean square error (MSE), peak signal-to-noise ratio (PSNR), compression ratio (CR), and bit error rate (BER). The proposed scheme will serve as a useful tool for simulating the transmission of a compressed speech signal.
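The transmission side of such a simulation can be sketched as below: bits are mapped to Gray-coded QPSK symbols, passed through an additive white Gaussian noise channel, and the bit error rate is measured. The channel model, the unit symbol energy, and the function names are assumptions made for illustration; the abstract does not specify them, and the LPC encoder that would supply the bits is not shown.

```python
import math
import random


def qpsk_modulate(bits):
    """Map consecutive bit pairs to unit-energy Gray-coded QPSK symbols."""
    s = 1.0 / math.sqrt(2.0)
    return [complex(s * (1 - 2 * bits[i]), s * (1 - 2 * bits[i + 1]))
            for i in range(0, len(bits) - 1, 2)]


def qpsk_demodulate(symbols):
    """Hard decision: the sign of each axis gives one bit."""
    bits = []
    for sym in symbols:
        bits.append(1 if sym.real < 0 else 0)
        bits.append(1 if sym.imag < 0 else 0)
    return bits


def simulate_ber(n_bits, noise_std, seed=0):
    """Bit error rate of QPSK over an assumed AWGN channel (n_bits even)."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(n_bits)]
    noisy = [sym + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
             for sym in qpsk_modulate(bits)]
    decoded = qpsk_demodulate(noisy)
    errors = sum(b != d for b, d in zip(bits, decoded))
    return errors / len(decoded)
```

Sweeping `noise_std` and plotting the result of `simulate_ber` gives the usual BER curve. In a full simulation the random bits would be replaced by the quantized LPC parameters of the speech frames.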