ISBN: 9781467373494 (print)
Isolated speech recognition is an important component of many applications, such as automated banking systems, catalogue dialing, automated data entry, and robotics. The choice of feature and classifier in a speech recognition system depends on complexity and recognition accuracy. Mel-frequency cepstral coefficients (MFCCs), line spectral frequencies (LSF), short-time energy (STE), and linear prediction coefficients (LPC) are the features used in existing speech recognition systems. In this paper, a sparse feature, obtained by optimizing the linear prediction coefficients (LPC) under a sparsity constraint, is used for classification. These sparse linear prediction coefficients (sparse LPC) offer a more effective way of representing voiced speech. An artificial neural network (ANN) is used as the classifier. Experimental results show that the proposed method is robust to noise and that it outperforms LPC-based and MFCC-based speech recognition systems.
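The abstract does not say which sparsity constraint is used. As a rough illustration only, the sketch below assumes one common formulation: choose the predictor that minimizes the 1-norm of the prediction residual, approximated here by iteratively reweighted least squares (IRLS). The function name `sparse_residual_lpc` and the parameters `n_iter` and `eps` are hypothetical, and this is not claimed to be the paper's method.

```python
import numpy as np

def sparse_residual_lpc(x, order, n_iter=20, eps=1e-6):
    """Linear predictor minimizing an approximate 1-norm of the residual.

    Assumed formulation (not taken from the abstract): minimize ||y - X a||_1,
    solved approximately with iteratively reweighted least squares.
    """
    x = np.asarray(x, dtype=float)
    # Regression form: x[n] is predicted from x[n-1], ..., x[n-order]
    X = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    y = x[order:]
    # Start from the ordinary least-squares (2-norm) predictor
    a = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ a
        # Row scaling by 1/sqrt(|r|) makes the squared error behave like |r|
        w = 1.0 / np.sqrt(np.abs(r) + eps)
        a = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]
    return a
```

The returned vector `a` holds the coefficients a[0], ..., a[order-1] applied to x[n-1], ..., x[n-order]. In a recognizer of the kind described, one such vector per frame could serve as the feature passed to the classifier.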
ISBN: 9781479952304 (print)
In this paper, we propose a Packet Loss Concealment (PLC) method based on Pitch Waveform Replication (PWR) and Linear Predictive Coding (LPC). The packet estimated using LPC is better than the one estimated using PWR near the boundary between the lost packet and the received one. On the other hand, the packet estimated using PWR is better than the one estimated using LPC far from that boundary. Therefore, we combine the two estimated packets, taking into account the merits and demerits of PWR and LPC. Experimental results show that the proposed method provides better Perceptual Evaluation of Speech Quality (PESQ) scores than conventional methods. The PESQ scores of the proposed method are especially high for male voices.
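The abstract does not specify how the two estimates are combined. A minimal sketch, assuming a simple linear cross-fade that trusts the LPC estimate near the boundary and the PWR estimate far from it, might look as follows. The function name `combine_estimates` and the linear weighting are illustrative assumptions.

```python
import numpy as np

def combine_estimates(lpc_est, pwr_est):
    """Cross-fade two estimates of a lost packet (assumed linear weighting)."""
    lpc_est = np.asarray(lpc_est, dtype=float)
    pwr_est = np.asarray(pwr_est, dtype=float)
    if lpc_est.shape != pwr_est.shape:
        raise ValueError("both estimates must have the same length")
    # Weight on the LPC estimate falls from 1 at the boundary to 0 at the far end
    w = np.linspace(1.0, 0.0, len(lpc_est))
    return w * lpc_est + (1.0 - w) * pwr_est
```

Other weighting curves (for example a raised cosine) would fit the same idea; the linear ramp is chosen here only because it is the simplest.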
ISBN: 0387446397 (print)
Human-computer interaction is concerned with the way users (humans) interact with computers. Some users can interact with the computer using the traditional methods, with a keyboard and mouse as the main input devices and the monitor as the main output device. For one reason or another, some users are unable to interact with machines using a mouse and keyboard, hence there is a need for special devices. When a computer is used for long periods, it is difficult to sit in a chair, keep the hands continuously on the keyboard or mouse, and keep watching the monitor. To relax the body and interact comfortably with the computer, we need a special device or method so that the computer will understand and accept commands without keyboard input or mouse clicks. A speech recognition system helps users who are unable to use traditional input and output (I/O) devices. For four decades, man has been dreaming of an "intelligent machine" which can master natural speech. In its simplest form, this machine should consist of two subsystems, namely Automatic Speech Recognition (ASR) and Speech Understanding (SU). The goal of ASR is to transcribe natural speech, while the goal of SU is to understand the meaning of the transcription. Recognising and understanding a spoken sentence is obviously a knowledge-intensive process, which must take into account all variable information about the speech communication process, from acoustics to semantics and pragmatics.
ISBN: 9781510819092 (print)
As the hidden Markov model (HMM) has a strong ability to model time sequences, the continuous Gaussian mixture HMM is used to establish a model base of the rolling bearing *** adaptive particle swarm optimization (APSO) with an extremum disturbed operator and dynamically changing inertia weights is introduced into the traditional training algorithm to solve the local extremum *** vibration signal is collected to extract 12th-order LPC coefficients as a feature vector through the process of adding *** the given feature vector, the HMM is built for bearing fault condition monitoring and fault ***, experiments under different fault conditions are carried out on the motor bearing *** experiment result shows that the method can train the HMM with a small number of samples, and that it is more effective and achieves higher classification accuracy in fault diagnosis than the traditional training algorithm.
This paper proposes a gesture recognition method which uses higher order local autocorrelation (HLAC) features extracted from PARCOR images. To extract the dominant information from a sequence of images, we apply a linear prediction coding technique to the sequences of pixel intensities, and PARCOR images are constructed from the PARCOR coefficients of these sequences. HLAC features are extracted from the PARCOR images, and the sequences of features are used as the input vectors of a Hidden Markov Model (HMM) based recognizer. Since HLAC features are inherently shift-invariant and computationally inexpensive, the proposed method is robust to changes in the person's position and makes real-time gesture recognition possible. Experimental results of gesture recognition are shown to evaluate the performance of the proposed method.
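As a sketch of how PARCOR images could be built, the code below runs the Levinson-Durbin recursion on the temporal intensity sequence of each pixel and stores the resulting reflection coefficients as image planes. The function names, the `(T, H, W)` frame layout, and the sign convention (first coefficient equal to r[1]/r[0]) are assumptions made for illustration. The abstract does not give these details.

```python
import numpy as np

def parcor(x, order):
    """Reflection (PARCOR) coefficients via the Levinson-Durbin recursion.

    Sign convention assumed here: k[0] = r[1] / r[0].
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)   # a[1..m] holds the current predictor
    k_out = np.zeros(order)
    err = r[0]                # prediction error energy
    for m in range(1, order + 1):
        if err <= 0.0:        # flat or perfectly predicted sequence: stop early
            break
        k = (r[m] - np.dot(a[1:m], r[m - 1 : 0 : -1])) / err
        a_new = a.copy()
        a_new[m] = k
        a_new[1:m] = a[1:m] - k * a[m - 1 : 0 : -1]
        a = a_new
        err *= 1.0 - k * k
        k_out[m - 1] = k
    return k_out

def parcor_images(frames, order):
    """Map frames of shape (T, H, W) to PARCOR images of shape (order, H, W)."""
    frames = np.asarray(frames, dtype=float)
    _, h, w = frames.shape
    out = np.zeros((order, h, w))
    for i in range(h):
        for j in range(w):
            out[:, i, j] = parcor(frames[:, i, j], order)
    return out
```

Each output plane `out[m]` is one PARCOR image. Features such as HLAC would then be computed on these planes in a later step that is not shown here.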
This thesis presents a new robust filtering technique that suppresses impulsive noise in speech signals. The method makes use of Projection Statistics based on medians to detect segments of speech with impulses. The autoregressive model employed to smooth out the speech signal is identified by means of a robust nonlinear estimator known as the Schweppe-type Huber GM-estimator. Simulation results are presented that demonstrate the effectiveness of the filter. Another contribution of the work is the development of a robust version of the Kalman filter based on the Huber M-estimator. The performance of this filter is evaluated for a simple autoregressive process.
A preprocessing scheme based on the linear prediction coefficient (LPC) residual is applied to higher-order statistics (HOSs) for automatic assessment of overall pathological voice quality. The normalized skewness and kurtosis are estimated from the LPC residual and show statistically meaningful distributions that characterize the pathological voice quality. Eighty-three voice samples of sustained phonation of the vowel /a/ are used in this study and are independently assessed by a speech and language therapist (SALT) according to the grade of severity of dysphonia on the GRBAS scale. These are used to train and test a classification and regression tree (CART). The best result is obtained using an optimal decision tree implemented with a combination of the normalized skewness and kurtosis, with an accuracy of 92.9%. It is concluded that the method can be used as an assessment tool, providing a valuable aid to the SALT during clinical evaluation of overall pathological voice quality. Copyright (C) 2009 J. Lee and M. Hahn. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
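The two statistics can be sketched as follows, assuming the LPC coefficients of a frame are already available. The residual is obtained by inverse filtering with the prediction-error filter, and the normalized moments are taken as the third and fourth central moments divided by the appropriate power of the variance. Whether the paper subtracts 3 from the kurtosis (excess kurtosis) is not stated. This sketch does not, and the function names are hypothetical.

```python
import numpy as np

def lpc_residual(x, a):
    """Residual e[n] = x[n] - sum_k a[k] * x[n-1-k], with zero initial history."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    error_filter = np.concatenate(([1.0], -a))   # [1, -a1, ..., -ap]
    return np.convolve(x, error_filter)[: len(x)]

def normalized_skew_kurt(e):
    """Normalized skewness m3 / m2**1.5 and (non-excess) kurtosis m4 / m2**2."""
    e = np.asarray(e, dtype=float)
    d = e - e.mean()
    m2 = np.mean(d ** 2)
    return np.mean(d ** 3) / m2 ** 1.5, np.mean(d ** 4) / m2 ** 2
```

The resulting pair of values per voice sample is the kind of two-dimensional feature a decision tree could be trained on.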
The application of spectral analysis of heart sound signals using linear prediction coding to determine the exact resonant frequencies corresponding to heart valve vibration is discussed. The resonant frequency of a cardiac valvular structure is an important parameter in determining the in-vivo constitutive properties and normal-pathogenic states of heart valves and ventricular myocardium. A peak augmentation technique for the spectral phonocardiogram is also demonstrated to obtain improved quality of resonance. The analysis is confined mainly to the second heart sound. As a result of such studies, important diagnostic information can be obtained, thereby enhancing the clinical significance of spectral phonocardiography.
ISBN: 9781424442959 (print)
In this paper, bootstrap resampling techniques are applied to assess speech quality and thereby evaluate the performance of different speech enhancement algorithms, under the assumption that the speech segments can be approximated by an autoregressive model. A bootstrap-based multiple hypotheses testing procedure is constructed to test a distance measure based on linear predictive coding, namely the log-likelihood ratio distance. It is shown that the multiple hypotheses test results correlate well with conventional numerical distance measures, which suggests that the proposed procedure is applicable to the assessment of speech quality as well as of speech enhancement algorithms.
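For reference, the log-likelihood ratio distance between a clean frame and a processed frame is commonly written as log((a_p R_c a_p^T) / (a_c R_c a_c^T)), where a_c and a_p are the prediction-error filter vectors of the clean and processed frames and R_c is the autocorrelation matrix of the clean frame. The sketch below computes it for a single frame under that definition. The bootstrap testing procedure of the paper is not reproduced, and the helper names are hypothetical.

```python
import numpy as np

def autocorr(x, order):
    """Autocorrelation of x at lags 0..order."""
    x = np.asarray(x, dtype=float)
    return np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])

def toeplitz_from(r, size):
    """Symmetric Toeplitz matrix whose (i, j) entry is r[|i - j|]."""
    return np.array([[r[abs(i - j)] for j in range(size)] for i in range(size)])

def error_filter(r):
    """Prediction-error filter [1, -a1, ..., -ap] from the normal equations."""
    order = len(r) - 1
    a = np.linalg.solve(toeplitz_from(r, order), r[1:])
    return np.concatenate(([1.0], -a))

def llr_distance(clean, processed, order=10):
    """Log-likelihood ratio distance for one pair of frames."""
    r_c = autocorr(clean, order)
    a_c = error_filter(r_c)
    a_p = error_filter(autocorr(processed, order))
    R_c = toeplitz_from(r_c, order + 1)
    return float(np.log((a_p @ R_c @ a_p) / (a_c @ R_c @ a_c)))
```

Because a_c minimizes the quadratic form in the denominator, the distance is zero when both frames are identical and non-negative otherwise.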
ISBN: 9781538624401 (digital)
ISBN: 9781538624418 (print)
Acoustic phonetics is the study of the physical properties of sounds and provides a means to distinguish one sound from another in quality and quantity. A study of the acoustic characteristics of Kannada begins with the phonemic analysis of the language. A phonetic analysis of Kannada vowels is presented in this paper. The analysis of the speech signal based on formant space provides a method of assessing the influence of each formant on a phoneme across gender and different age groups. PRAAT software is used to analyse the speech signals. In this work, speech signals of Kannada vowels were recorded from male and female speakers of different age groups, and the formant frequencies of the corresponding vowels were computed. The analysis is carried out separately for male and female speakers. The preliminary analysis of the vowel formants shows significant variations across gender and age groups. Similarly, linear predictive coding (LPC) analysis is performed with different filter orders to gain an in-depth understanding of the formants. The order of the LPC filter is then typically estimated using information about the formants obtained with the PRAAT tool.
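One standard way to read formant candidates from an LPC model is to take the angles of the complex roots of the prediction polynomial and convert them to frequencies. The sketch below assumes a covariance-style least-squares LPC fit and uses hypothetical function names. It is not PRAAT's algorithm, and practical formant trackers also filter the candidates by bandwidth.

```python
import numpy as np

def lpc_least_squares(x, order):
    """LPC coefficients a[k] such that x[n] is predicted by sum_k a[k] * x[n-1-k]."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    return np.linalg.lstsq(X, x[order:], rcond=None)[0]

def formant_candidates(a, fs):
    """Frequencies in Hz of the complex pole pairs of the LPC model, sorted."""
    # Roots of z^p - a1 z^(p-1) - ... - ap
    roots = np.roots(np.concatenate(([1.0], -np.asarray(a, dtype=float))))
    roots = roots[np.imag(roots) > 0]   # keep one root of each conjugate pair
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))
```

Raising `order` adds pole pairs and therefore more candidate frequencies, which is why the choice of filter order matters when the aim is to match a known number of formants.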