Search results - Inner Mongolia University Library

Wavelet analysis used in text-to-speech synthesis

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-ANALOG AND DIGITAL SIGNAL PROCESSING, 1998, vol. 45, no. 8, pp. 1125-1129

Authors: Kobayashi, M Sakamoto, M Saito, T Hashimoto, Y Nishimura, M Suzuki, K IBM Japan Ltd Tokyo Res Lab Yamato Kanagawa 2428502 Japan

This brief describes the use of wavelet analysis in the development of a Japanese text-to-speech (TTS) system for personal computers. The quality of synthesized speech is one of the most important features of any TTS system. Synthesis methods which are based on manipulation of the speech signal spectrum (e.g., linear predictive coding synthesis and formant synthesis) produce comprehensible but unnatural-sounding output. The lack of naturalness commonly associated with these methods results from the use of oversimplified speech models, small synthesis unit inventories, and poor handling of text parsing for prosody control. We developed four new technologies to overcome these difficulties and improve the quality of output from TTS systems: accurate pitch mark determination by wavelet analysis, speech waveform generation using a modified time-domain pitch-synchronous overlap-add method, speech synthesis unit selection using a context-dependent clustering method, and efficient prosody control using a 3-phrase parser. All four technologies will be described; however, those which rely on wavelet techniques will be emphasized.

Keywords: Wavelet analysis Speech synthesis Signal synthesis Control system synthesis Speech analysis Microcomputers Speech coding linear predictive coding Wavelet domain Synchronous generators

From LPC to normalised autocorrelation coefficients through a matrix

ELECTRONICS LETTERS, 1998, vol. 34, no. 4, pp. 333-334

Author: Sanches, I Univ Sao Paulo Escola Politecn Dept Eng Eletron Lab Proc Sinais & Sistemas BR-05508900 Sao Paulo SP Brazil

A matrix method for converting linear prediction coefficients (LPC), or autoregressive coefficients (ARC), to their corresponding normalised autocorrelation coefficients (NAC) is presented. The matrix is an alternative to the usual step-down procedure to be used in conjunction with the Levinson algorithm when conversion from LPC to NAC is necessary.
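As an illustration of how an LPC-to-NAC conversion can be posed as a matrix problem, the sketch below rearranges the Yule-Walker equations into a linear system in the unknown autocorrelation values. It is not taken from the paper, and the matrix used there may differ. The function name `lpc_to_nac` and the sign convention x[n] = a_1 x[n-1] + ... + a_p x[n-p] + e[n] are assumptions made for this example.

```python
import numpy as np

def lpc_to_nac(a):
    """Normalised autocorrelation r[0..p] of an AR(p) model (illustrative sketch).

    Assumed convention: x[n] = sum_i a[i-1] * x[n-i] + e[n], with r[0] = 1.
    The Yule-Walker equations r[k] = sum_i a_i * r[|k-i|], k = 1..p, are
    linear in r[1..p], so they can be rewritten as M @ r = b and solved.
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    M = np.eye(p)          # coefficient of r[k] on the left-hand side
    b = np.zeros(p)        # terms multiplying the known value r[0] = 1
    for k in range(1, p + 1):
        for i in range(1, p + 1):
            m = abs(k - i)
            if m == 0:
                b[k - 1] += a[i - 1]         # a_k * r[0] moves to the right-hand side
            else:
                M[k - 1, m - 1] -= a[i - 1]  # -a_i * r[|k-i|] stays on the left
    r = np.linalg.solve(M, b)
    return np.concatenate(([1.0], r))
```

For example, `lpc_to_nac([0.5, 0.2])` yields the values 1, 0.625 and 0.5125, which agree with solving the two Yule-Walker equations by hand.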

Keywords: LPC autoregressive coefficients correlation theory linear prediction coefficients linear predictive coding matrix algebra matrix method normalised autocorrelation coefficients

Scalar quantisation of LSF parameters using vector measurement

ELECTRONICS LETTERS, 1998, vol. 34, no. 10, pp. 961-963

Authors: Ng, HC Leung, SH Tsang, CW City Univ Hong Kong Dept Elect Engn Kowloon Hong Kong

A new form of line spectral frequency (LSF), bounded line spectral frequency, is presented. It is shown that the new representation is more efficient than the direct line spectral frequency and the differential line spectral frequency (DLSF). By using a vector measure, the scalar quantisation of tenth-order linear predictive coding (LPC) parameters can be coded at 28 bit/frame with a transparent quantisation quality.
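For readers unfamiliar with the representation being quantised, the sketch below computes ordinary (direct) line spectral frequencies from an LPC polynomial using the standard sum and difference polynomials. It does not implement the bounded LSF or the quantiser described in the abstract. The function name `lpc_to_lsf` and the input convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p are assumptions made for this example.

```python
import numpy as np

def lpc_to_lsf(a_poly, eps=1e-6):
    """Direct LSFs in radians from A(z) coefficients [1, a1, ..., ap] (sketch).

    P(z) = A(z) + z^-(p+1) A(1/z) and Q(z) = A(z) - z^-(p+1) A(1/z).
    For a minimum-phase A(z), the roots of P and Q lie on the unit circle;
    the LSFs are their angles strictly between 0 and pi.
    """
    a_ext = np.concatenate((np.asarray(a_poly, dtype=float), [0.0]))
    p_poly = a_ext + a_ext[::-1]   # symmetric (sum) polynomial
    q_poly = a_ext - a_ext[::-1]   # antisymmetric (difference) polynomial
    roots = np.concatenate((np.roots(p_poly), np.roots(q_poly)))
    angles = np.angle(roots)
    # Drop the trivial roots at z = 1 and z = -1 and the conjugate (negative) angles.
    keep = (angles > eps) & (angles < np.pi - eps)
    return np.sort(angles[keep])
```

A p-th order polynomial gives p LSFs in increasing order; with `[1, -0.9, 0.5]` the two values are arccos(0.7) and arccos(0.2).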

Keywords: linear predictive coding parameters Speech processing techniques scalar quantisation linear predictive coding transparent quantisation quality vector measurement LSF parameters quantisation (signal) speech coding Codes Speech and audio signal processing Signal processing and detection Information theory vectors bounded line spectral frequency tenth-order LPC parameters

Speech recognition method using quantised LSP parameters in CELP-type coders

ELECTRONICS LETTERS, 1998, vol. 34, no. 2, pp. 156-157

Authors: Choi, SH Kim, HK Lee, HS Gray, RM Korea Adv Inst Sci & Technol Dept Informat & Commun Engn Dongdaemun Gu Seoul 130012 South Korea Samsung Adv Inst Technol Informat Proc Sector Human & Comp Interact Lab Yongin 449712 Kyungki Do South Korea Stanford Univ Dept Elect Engn Informat Syst Lab Stanford CA 94305 USA

A low-complexity speech recognition method applicable to digital communication networks is proposed. A feature set suitable for speech recognition is obtained from quantised LSP parameters in CELP-type coders without reconstructing the speech signals. The authors present the effects of the speech coder on speaker-independent recognition performance and show that the recognition accuracy of the proposed method is better than that of the recogniser using reconstructed speech signals.

Keywords: Speech processing techniques Speech recognition low-complexity recognition method linear predictive coding CELP-type coders digital communication digital communication networks feature set speaker-independent recognition performance Codes speech coding line spectral pair Speech and audio signal processing speech recognition recognition accuracy quantised LSP parameters

A subspace approach to estimation of autoregressive parameters from noisy measurements

IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1998, vol. 46, no. 2, pp. 531-534

Author: Davila, CE So Methodist Univ Dept Elect Engn Dallas TX 75275 USA

This correspondence describes a method for estimating the parameters of an autoregressive (AR) process from a finite number of noisy measurements. The method uses a modified set of Yule-Walker (YW) equations that lead to a quadratic eigenvalue problem that, when solved, gives estimates of the AR parameters and the measurement noise variance.
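For context, the sketch below shows the ordinary Yule-Walker estimate for noise-free data, which is the baseline that the abstract's modified equations start from. It does not implement the modified equations or the quadratic eigenvalue problem. The function name `yule_walker` and the convention x[n] = a_1 x[n-1] + ... + a_p x[n-p] + e[n] are assumptions made for this example.

```python
import numpy as np

def yule_walker(x, order):
    """Standard Yule-Walker AR estimate from noise-free samples (sketch).

    Returns (a, excitation_variance) for the assumed model
    x[n] = sum_i a[i-1] * x[n-i] + e[n].
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased sample autocorrelation for lags 0..order.
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz system R a = [r1, ..., rp].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    excitation_variance = r[0] - np.dot(a, r[1:])
    return a, excitation_variance
```

With additive white measurement noise, the lag-zero autocorrelation is inflated by the noise variance while the other lags are not, so this plain estimate becomes biased; that is the situation the method in the abstract addresses.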

Keywords: Parameter estimation Biomedical measurements Equations Noise measurement Autocorrelation Random processes linear predictive coding White noise Additive noise Eigenvalues and eigenfunctions

Source-filter models for time-scale pitch-scale modification of speech

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 98)

Author: Acero, A Microsoft Res Redmond WA 98052 USA

ISBN: (print) 0780344286

This paper presents two time-scale pitch-scale modification techniques to be used in speech synthesis systems. They have been applied to Microsoft's Whistler system, which is based on concatenative synthesis. Both methods are based on a source-filter model, one of them using LPC parameters and the other one using cepstral parameters. The proposed methods achieve high quality prosody modification, retain the characteristics of the donor speaker, allow for spectral manipulation (to reduce spectral discontinuities at unit boundaries), yield compact acoustic inventories and improved voiced fricatives.

Keywords: Speech synthesis Filters linear predictive coding Pulse generation Smoothing methods Cepstral analysis Loudspeakers Man machine systems Synthesizers Speech processing

How steady are vowel steady-states?

CLINICAL LINGUISTICS & PHONETICS, 1998, vol. 12, no. 5, pp. 405-415

Authors: Blomgren, M Robb, M Univ Connecticut Dept Commun Sci Storrs CT 06269 USA

The duration of vowel steady-states (VSS) was examined acoustically in the speech production of 40 normal young adults. VSS was assessed according to formant frequency changes in sustained /i/ productions and consonant + /i/ + /d/ (/Cid/) productions. The duration of the VSS was measured for the first and second formants (F1 and F2) by incorporating a fixed rate-of-change criterion. Results indicated no significant differences in VSS duration according to gender or vowel context. VSS duration based on F1 was significantly longer than F2 VSS duration. The duration of VSS was also found to be correlated with the overall vowel duration in /Cid/ contexts. Discussion focuses on the analysis and application of VSS in acoustic studies of normal and disordered speech production.

Keywords: acoustics vowel steady-state formant frequency linear predictive coding

Spectral weighting of SBCOR for noise robust speech recognition

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: S. Kajita K. Takeda F. Itakura Graduate School of Engineering University of Nagoya Nagoya Japan

Subband-autocorrelation (SBCOR) analysis is a noise-robust acoustic analysis based on filter bank and autocorrelation analysis, and aims to extract the periodicities associated with the inverse of the center frequency in a subband. In this paper, it is shown analytically that SBCOR results in lateral inhibitive weighting (LIW) processing of the power spectrum, and experiments with a DTW word recognizer show that the LIW is significantly effective for noise-robust acoustic analysis. An interpretation of the LIW is also described. A technique for flattening the noise spectral envelope using an LPC inverse filter is applied to speech degraded with noise, and DTW word recognition is performed. The idea behind this inverse filtering technique is to weaken the strong periodic components included in the noise. The experimental results using a 32nd-order LPC inverse filter show that the recognition performance of SBCOR (or LIW) is improved for computer room noise.

Keywords: Noise robustness Speech recognition Acoustic noise linear predictive coding Filter bank Autocorrelation Frequency Speech enhancement Degradation Filtering

A 1.7 kb/s MELP coder with improved analysis and quantization

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: A. McCree J.C. De Martin DSPS Research and Development Texas Instruments Inc. Dallas TX USA Texas Instruments Inc Dallas TX US

This paper describes our new mixed excitation linear predictive (MELP) coder designed for very low bit rate applications. This new coder, through algorithmic improvements and enhanced quantization techniques, produces better speech quality at 1.7 kb/s than the new U.S. Federal Standard MELP coder at 2.4 kb/s. Key features of the coder are an improved pitch estimation algorithm and a line spectral frequencies (LSF) quantization scheme that requires only 21 bits per frame. With channel coding, this new MELP coder is capable of maintaining good speech quality even in severely degraded channels, at a total bit rate of only 3 kb/s.

Keywords: Quantization Filters Speech analysis Code standards Synthesizers Bit rate Channel coding Background noise Frequency estimation linear predictive coding

Formant frequency estimation using a Mel-scale LPC algorithm

International Symposium on Telecommunications

Authors: A.M. De Lima Araujo F. Violaro ETFPa and DECOM-FEEC State University of Campinas-UNICAMP Sao Paulo Brazil DECOM-FEEC State University of Campinas-UNICAMP Sao Paulo Brazil

This paper presents an algorithm for F1 and F2 formant estimation. The proposed algorithm combines a linear predictive analysis together with the Mel psychoacoustical perceptual scale. The algorithm was tested for the first 2 formants and produced good performance for male and female speakers, adults and children. In contrast to the classical LPC algorithm which requires variable-order prediction filters to take into account different formant patterns, the proposed algorithm is capable of extracting these formants with a fixed-order prediction filter.
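The classical LPC approach that the abstract contrasts against can be sketched as follows: fit a prediction filter to a frame, then read candidate formant frequencies from the angles of its complex poles. This is a minimal linear-frequency sketch and does not include the Mel-scale warping proposed in the paper. The function name `lpc_formants` and its parameters are assumptions made for this example.

```python
import numpy as np

def lpc_formants(frame, fs, order):
    """Candidate formant frequencies (Hz) from classical LPC pole angles (sketch)."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    n = len(x)
    # Autocorrelation method: solve the normal equations R a = [r1, ..., rp].
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    # Poles are the roots of z^p - a1 z^(p-1) - ... - ap.
    poles = np.roots(np.concatenate(([1.0], -a)))
    poles = poles[poles.imag > 0]      # keep one pole of each conjugate pair
    freqs = np.angle(poles) * fs / (2.0 * np.pi)
    return np.sort(freqs)
```

In practice the prediction order has to be matched to the speaker and sampling rate so that each formant gets a pole pair, which is the variable-order requirement the abstract mentions; a real system would also discard poles with very wide bandwidths.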

Keywords: Frequency estimation linear predictive coding Discrete Fourier transforms Psychology Filters Speech analysis Algorithm design and analysis Resonance Resonant frequency Psychoacoustic models
