The paper examines the performance of several versions of the parallel processing method of pitch period estimation of speech, highlighting the limitations of each. An improved algorithm, based on the temporal investigation of the speech waveform, is described and evaluated. It is shown to offer improved performance in both long-term pitch period estimation accuracy and cycle-to-cycle accuracy. A new method of evaluating the performance of pitch detection algorithms, based on measuring their stability with respect to variation of the time origin of the input speech, is also described.
Pitch period is an important parameter in speech recognition and speech synthesis, and pitch period detection has been a focus of audio processing research. The traditional AMDF-based algorithm and its improved version, the LV-AMDF-based algorithm, are prone to doubling and halving errors in pitch detection. To address these problems, the characteristics and shortcomings of the AMDF and LV-AMDF functions in pitch detection are analyzed, and a parameter-compensation AMDF pitch detection algorithm is proposed to reduce the half-frequency and double-frequency errors that often appear in pitch detection and to improve detection accuracy. Experimental results show that its pitch detection accuracy is better than that of AMDF and LV-AMDF.
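As a point of reference for the abstract above, the following is a minimal sketch of the plain average magnitude difference function (AMDF) used as a pitch period estimator. It is not the parameter-compensation variant the paper proposes, and it is not LV-AMDF. The function names, the synthetic test frame, the 8 kHz sampling rate, and the lag search range are all assumptions chosen for illustration.

```python
import math


def amdf(frame, max_lag):
    """Plain AMDF for lags 0..max_lag, averaged over the overlapping samples."""
    n = len(frame)
    values = []
    for lag in range(max_lag + 1):
        count = n - lag  # number of sample pairs available at this lag
        total = sum(abs(frame[i] - frame[i + lag]) for i in range(count))
        values.append(total / count)
    return values


def estimate_pitch_period(frame, min_lag, max_lag):
    """Return the lag (in samples) of the deepest AMDF valley in [min_lag, max_lag]."""
    d = amdf(frame, max_lag)
    return min(range(min_lag, max_lag + 1), key=lambda lag: d[lag])


# Synthetic voiced-like frame (assumed values): 100 Hz fundamental plus a
# second harmonic, sampled at 8 kHz, so the true period is 80 samples.
fs = 8000
f0 = 100.0
frame = [
    math.sin(2 * math.pi * f0 * i / fs) + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / fs)
    for i in range(400)
]

period = estimate_pitch_period(frame, min_lag=20, max_lag=150)
pitch_hz = fs / period
```

The search range matters in this sketch: if `max_lag` were widened to include 160 samples, the valley at twice the true period would be about as deep as the one at 80 samples, which is the kind of doubling error the abstract describes.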
ISBN: 9781424442966 (print)
Determining multiple pitches in noisy and reverberant speech is an important and challenging task. We propose a robust multipitch tracking algorithm in the presence of both background noise and room reverberation. A new channel selection method is utilized in conjunction with an auditory front-end to extract periodicity features in the time-frequency space. These features are combined to formulate frame level conditional probabilities given each pitch state. A hidden Markov model is then applied to integrate these probabilities and search for the most likely pitch state sequences. The proposed approach can reliably detect up to two simultaneous pitch contours in noisy and reverberant conditions. Quantitative evaluations show that our system significantly outperforms existing ones, particularly in reverberant environments.
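The HMM step in the abstract above, integrating frame-level probabilities and searching for the most likely pitch state sequence, can be illustrated with a generic Viterbi search. This is a toy sketch under stated assumptions and not the authors' system: it uses three hypothetical states (0 = no pitch, 1 and 2 = two candidate pitch values), made-up observation probabilities, and a made-up transition matrix that favors staying in the same state. The paper's state space, which covers up to two simultaneous pitches, is not reproduced here.

```python
import math


def viterbi(log_obs, log_trans, log_init):
    """Most likely state sequence given per-frame log observation probabilities."""
    n_states = len(log_init)
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_obs)):
        new_score = []
        pointers = []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s] + log_obs[t][s])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical frame-level probabilities P(observation | state) for 5 frames.
# Frame 2 is corrupted so that state 2 looks slightly better than state 1.
obs = [
    [0.1, 0.7, 0.2],
    [0.1, 0.7, 0.2],
    [0.1, 0.4, 0.5],
    [0.1, 0.7, 0.2],
    [0.1, 0.7, 0.2],
]
# Hypothetical transitions: stay with probability 0.8, switch with 0.1 each.
trans = [[0.8 if i == j else 0.1 for j in range(3)] for i in range(3)]
init = [1.0 / 3.0] * 3

log_obs = [[math.log(p) for p in row] for row in obs]
log_trans = [[math.log(p) for p in row] for row in trans]
log_init = [math.log(p) for p in init]

path = viterbi(log_obs, log_trans, log_init)
framewise = [max(range(3), key=lambda s: row[s]) for row in obs]
```

In this toy setup, picking the best state independently per frame jumps to state 2 at the corrupted frame, while the Viterbi path stays in state 1 because the transition penalty outweighs the small observation advantage.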
Previous work demonstrated the 1/f nature of the speech residual and proposed a narrowband-to-wideband speech conversion scheme using this property [6]. This thesis proposes three major improvements to the processing scheme. The residual excited linear predictive (RELP) model is used in this thesis. In this method, the speech is pre-emphasized before linear predictive analysis is performed. The linear prediction coefficients are used to construct the inverse filter for residual extraction. In order to exploit the 1/f nature of speech for coding and quality enhancement, it is necessary to have a good estimate of the spectrum attenuation trend. Thus, pre-emphasis filters that can be tuned to different segments of speech are needed. In this thesis, nearly 1/f pre-emphasis filters are generated using the Parks-McClellan method. The proposed processing scheme provides better estimation of the scaling exponent for quality enhancement. The residual of speech is wavelet decomposed for analysis. The pattern of the wavelet coefficients consists of deterministic and random components. In this thesis, the deterministic component is coded separately from the random component, and reconstruction yields better results. The speech is segmented prior to the analysis. It is well known that, since speech is not stationary, an arbitrary segmentation may include both silence and speech, as well as voiced and unvoiced transitions, in a given frame. The pitch information can be used to delineate the boundaries between the voiced and unvoiced segments. In this thesis, a pitch detection algorithm that combines the Average Magnitude Difference Function and Sum of Cumulants is proposed. This algorithm outperforms the existing methods in the presence of additive Gaussian noise.
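The residual-extraction chain described above (pre-emphasis, linear predictive analysis, inverse filtering) can be sketched as follows. This is an illustrative sketch only: the first-order pre-emphasis filter with coefficient `alpha` is a common stand-in and is not the tuned, nearly 1/f Parks-McClellan design the thesis describes. The LPC step uses the autocorrelation method with the Levinson-Durbin recursion, which the abstract does not specify. All function names and the decaying test signal are assumptions.

```python
def pre_emphasize(x, alpha=0.95):
    """First-order pre-emphasis y[n] = x[n] - alpha * x[n-1] (stand-in filter)."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lpc_residual(x, a):
    """Inverse filter: e[n] = x[n] - sum_k a[k] * x[n-1-k], zero initial state."""
    res = []
    for n in range(len(x)):
        pred = sum(a[k] * x[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        res.append(x[n] - pred)
    return res


# Assumed test signal: impulse response of a one-pole filter with pole 0.9,
# so a first-order predictor should recover a coefficient close to 0.9.
signal = [0.9 ** n for n in range(200)]
coeffs = levinson_durbin(autocorr(signal, 1), order=1)
residual = lpc_residual(signal, coeffs)

# In a RELP-style chain the frame would be pre-emphasized before analysis.
emphasized = pre_emphasize(signal)
emph_coeffs = levinson_durbin(autocorr(emphasized, 2), order=2)
emph_residual = lpc_residual(emphasized, emph_coeffs)
```

For the test signal, the residual collapses to a single impulse at the first sample, which is the expected behavior of an inverse filter that matches the generating model.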
ISBN: 9781424438631; 9781424438624 (print)
Pitch period is an important parameter in speech recognition and speech synthesis, and pitch period detection has been a focus of audio processing research. The traditional AMDF-based algorithm and its improved version, the LV-AMDF-based algorithm, are prone to doubling and halving errors in pitch detection. To address these problems, the characteristics and shortcomings of the AMDF and LV-AMDF functions in pitch detection are analyzed, and a parameter-compensation AMDF pitch detection algorithm is proposed in this article to reduce the half-frequency and double-frequency errors that often appear in pitch detection and to improve detection accuracy. Experimental results show that its pitch detection accuracy is better than that of AMDF and LV-AMDF.
ISBN: 9781424442959 (print)
Determining multiple pitches in noisy and reverberant speech is an important and challenging task. We propose a robust multipitch tracking algorithm in the presence of both background noise and room reverberation. A new channel selection method is utilized in conjunction with an auditory front-end to extract periodicity features in the time-frequency space. These features are combined to formulate frame level conditional probabilities given each pitch state. A hidden Markov model is then applied to integrate these probabilities and search for the most likely pitch state sequences. The proposed approach can reliably detect up to two simultaneous pitch contours in noisy and reverberant conditions. Quantitative evaluations show that our system significantly outperforms existing ones, particularly in reverberant environments.