Search Results - Inner Mongolia University Library

STATISTICAL TESTS AND DISTANCE MEASURES FOR LPC COEFFICIENTS

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1977, vol. 25, no. 6, pp. 554-559

Author: DESOUZA, P. Affiliation: UNIV OTAGO SCH MED DEPT PREVENT & SOCIAL MED DUNEDIN NEW ZEALAND

This paper considers the problem of comparing two sets of (LPC) coefficients or, more generally, that of comparing two short segments of speech via LPC techniques. It is shown that Itakura's prediction-residual ratio is intuitively unsatisfactory and theoretically misleading as a distance measure. Two slower, but more accurate statistical means of comparison are suggested, and these are supported by evidence from a simulation study.
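For orientation, the prediction-residual ratio that the abstract criticizes can be sketched as follows. This is a minimal illustration in its commonly cited log form, assuming the autocorrelation method of LPC analysis; the function names, the two synthetic frames, and the model order of 4 are assumptions made here for illustration. The two statistical tests proposed in the paper are not described in the abstract and are not shown.

```python
import math

def autocorr(frame, order):
    """Biased autocorrelation r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion; returns inverse-filter coefficients [1, a1, ..., ap]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction-error energy
    return a

def residual_energy(a, r):
    """Quadratic form a^T R a, where R is the Toeplitz matrix built from r."""
    p = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(p) for j in range(p))

def residual_ratio_distance(a_other, a_own, r_own):
    """Log ratio of residual energies on one frame: another frame's filter vs. its own."""
    return math.log(residual_energy(a_other, r_own) / residual_energy(a_own, r_own))

# Illustrative synthetic frames (assumed data, not from the paper).
ORDER = 4
frame1 = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(200)]
frame2 = [math.sin(0.8 * n) + 0.5 * math.sin(2.0 * n) for n in range(200)]
r1, r2 = autocorr(frame1, ORDER), autocorr(frame2, ORDER)
a1, a2 = levinson(r1, ORDER), levinson(r2, ORDER)

d12 = residual_ratio_distance(a2, a1, r1)   # frame 2's filter applied to frame 1
d21 = residual_ratio_distance(a1, a2, r2)   # frame 1's filter applied to frame 2
print(d12, d21)
```

Because each frame's own filter minimizes its residual energy, the ratio is never below 1 and the log is never negative. The measure is generally not symmetric in its two arguments.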

Keywords: linear predictive coding; Statistical analysis; Speech analysis; Acoustic testing; Terminology; Speech coding; Speech recognition; Computer science; Autocorrelation

Speech Enhancement and Recognition of Compressed Speech Signal in Noisy Reverberant Conditions

1st International Conference on Information Systems Design and Intelligent Applications (INDIA 2012)

Authors: Suman, Maloji; Khan, Habibulla; Latha, M. Madhavi; Kumari, Devarakonda Aruna. Affiliations: CSI Guntur Andhra Pradesh India; KLU Guntur Andhra Pradesh India; JNTU Coll Engn Hyderabad Andhra Pradesh India

ISBN: (print) 9783642274428

Speech compression, enhancement, and recognition in noisy, reverberant conditions are challenging tasks. This paper presents a new approach to this problem, developed in the framework of probabilistic random modeling. Speech coding techniques are commonly used in low bit rate analysis and synthesis. Coding algorithms seek to minimize the bit rate in the digital representation of a signal without an objectionable loss of signal quality in the process. Speech enhancement aims to improve speech quality by using various algorithms. This paper deals with the multistage vector quantization technique used for coding narrowband speech signals. The parameters used for coding the speech signals are the line spectral frequencies, so as to ensure filter stability after quantization. The new approach incorporates information about the statistical random nature of the uncompressed speech signal using the LBG algorithm. The codebooks used for quantization are generated with the Linde, Buzo and Gray (LBG) algorithm. The speech model is characterized by LPC coefficients and parameterized by the coefficients of the reverberation filters. The results of the multistage vector quantizer are compared with those of the unconstrained vector quantization technique. Quantization performance is measured in terms of spectral distortion (in dB), computational complexity (in kflops), and memory requirements (in floats). The results indicate that multistage vector quantization has better spectral distortion performance, lower computational complexity, and lower memory requirements than unconstrained vector quantization. The proposed approach, which estimates the parameters from the data, yields significantly better performance in both signal-to-noise ratio and subjective filter methods.
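The multistage idea mentioned in the abstract can be sketched as follows: each stage quantizes the residual left by the previous stage, and the decoder sums the selected codevectors. This is a minimal sketch under stated assumptions. The function names and the tiny hand-written codebooks are hypothetical, and the LBG codebook training described in the paper is not shown.

```python
def nearest(codebook, vec):
    """Index of the codevector with the smallest squared Euclidean distance to vec."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))

def msvq_encode(codebooks, vec):
    """Quantize vec stage by stage; each stage codes the previous stage's residual."""
    residual = list(vec)
    indices = []
    for cb in codebooks:
        i = nearest(cb, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, cb[i])]
    return indices

def msvq_decode(codebooks, indices):
    """Reconstruct the vector as the sum of the selected codevectors."""
    out = [0.0] * len(codebooks[0][0])
    for cb, i in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, cb[i])]
    return out

# Hypothetical two-stage codebooks for 2-dimensional vectors (illustration only).
stage1 = [[0.0, 0.0], [1.0, 1.0]]
stage2 = [[-0.1, 0.1], [0.1, -0.1], [0.0, 0.0]]
codebooks = [stage1, stage2]

indices = msvq_encode(codebooks, [0.9, 1.1])
reconstructed = msvq_decode(codebooks, indices)
print(indices, reconstructed)
```

In this toy setup the two stages store 2 + 3 codevectors while addressing 2 x 3 combinations, which illustrates why a multistage structure can need less memory and fewer distance computations than a single codebook of equivalent size.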

Keywords: linear predictive coding; Multi stage vector quantization; Line Spectral Frequencies (LSF)

Source-filter models for time-scale pitch-scale modification of speech

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 98)

Author: Acero, A. Affiliation: Microsoft Res Redmond WA 98052 USA

ISBN: (print) 0780344286

This paper presents two time-scale pitch-scale modification techniques to be used in speech synthesis systems. They have been applied to Microsoft's Whistler system, which is based on concatenative synthesis. Both methods are based on a source-filter model, one of them using LPC parameters and the other one using cepstral parameters. The proposed methods achieve high quality prosody modification, retain the characteristics of the donor speaker, allow for spectral manipulation (to reduce spectral discontinuities at unit boundaries), yield compact acoustic inventories and improved voiced fricatives.

Keywords: Speech synthesis; Filters; linear predictive coding; Pulse generation; Smoothing methods; Cepstral analysis; Loudspeakers; Man machine systems; Synthesizers; Speech processing

EFFICIENT AND SCALABLE NEURAL RESIDUAL WAVEFORM CODING WITH COLLABORATIVE QUANTIZATION

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Zhen, Kai; Lee, Mi Suk; Sung, Jongmo; Beack, Seungkwon; Kim, Minje. Affiliations: Indiana Univ Luddy Sch Informat Comp & Engn Bloomington IN 47405 USA; Indiana Univ Cognit Sci Program Bloomington IN USA; Elect & Telecommun Res Inst Daejeon South Korea

ISBN: (print) 9781509066315

Scalability and efficiency are desired in neural speech codecs, which support a wide range of bitrates for applications on various devices. We propose a collaborative quantization (CQ) scheme to jointly learn the codebook of LPC coefficients and the corresponding residuals. CQ does not simply shoehorn LPC into a neural network, but bridges the computational capacity of advanced neural network models and traditional, yet efficient and domain-specific, digital signal processing methods in an integrated manner. We demonstrate that CQ achieves much higher quality than its predecessor at 9 kbps with even lower model complexity. We also show that CQ can scale up to 24 kbps, where it outperforms AMR-WB and Opus. As a neural waveform codec, CQ models have fewer than 1 million parameters, significantly fewer than many other generative models.

Keywords: Speech coding; linear predictive coding; deep neural network; residual learning; model complexity

Exploiting simultaneously masked linear prediction in a WI speech coder

7th IEEE Workshop on Speech Coding

Authors: Lukasiak, J; Burnett, IS. Affiliation: Univ Wollongong Whisper Labs TITR Wollongong NSW 2522 Australia

ISBN: (print) 0780364163

This paper uses a method of incorporating simultaneous masking into the calculation of a linear predictive filter (SMLPC) as the front end to a 2 kbps waveform interpolation (WI) speech coder. A modification to the masking threshold calculation used in SMLPC is proposed. This modification improves the performance of SMLPC in noise-like sections by placing greater emphasis on strongly voiced speech. MOS test results reveal that the modified SMLPC improved the perceptual quality of the WI coder. The improvement is significant for female speakers, whilst the quality for male speech is virtually unchanged. This result conflicts with previous results reported for SMLPC, where only male speech was improved. The change is attributed to the modification of the masking threshold and confirms that adapting the masking threshold according to the pitch of the speech will allow SMLPC to remove more perceptually important information from all input speech than standard LPC.

Keywords: Autocorrelation; Filtering; Frequency; Interpolation; Laboratories; linear predictive coding; Masking threshold; Nonlinear filters; Redundancy; Speech coding

Estimation of short term prediction parameters under lossy conditions

3rd IEEE International Symposium on Signal Processing and Information Technology

Authors: Black, D; Sandler, M. Affiliation: Univ London Dept EE DSP & Multimedia Grp London WC1E 7HU England

ISBN: (print) 0780382927

The inconsistencies inherent in packet-switched network delivery can be seriously detrimental to the quality of a real-time speech transmission. This paper emphasizes the importance of the short term prediction (STP) filter parameters, as these are perceptually important to intelligible speech. We introduce several novel schemes for the recovery of lost STP parameters represented as line spectral frequencies (LSFs), based on extrapolation and interpolation techniques. The unique inclusion of a number of past and/or future frames further commends this work. Methods which outperform traditional frame repetition and linear interpolation in terms of accuracy are presented and evaluated.
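The two traditional baselines named in the abstract, frame repetition and linear interpolation, can be sketched as follows. This is a minimal illustration with assumed function names and made-up LSF values in radians; the paper's own extrapolation and interpolation schemes are not described in the abstract and are not reproduced here.

```python
def repeat_last(prev_lsf):
    """Frame repetition: reuse the last correctly received LSF vector."""
    return list(prev_lsf)

def interpolate(prev_lsf, next_lsf, position=0.5):
    """Linear interpolation between the frames before and after the lost one.

    position is the fractional location of the lost frame between the two
    received frames (0.5 for a single lost frame in the middle).
    """
    return [(1.0 - position) * p + position * n
            for p, n in zip(prev_lsf, next_lsf)]

def is_ordered(lsf):
    """LSFs must be strictly increasing for the synthesis filter to be stable."""
    return all(x < y for x, y in zip(lsf, lsf[1:]))

# Illustrative LSF vectors (assumed values, not from the paper).
prev_frame = [0.2, 0.6, 1.4]
next_frame = [0.4, 1.0, 1.8]

concealed_by_repetition = repeat_last(prev_frame)
concealed_by_interpolation = interpolate(prev_frame, next_frame)
print(concealed_by_repetition, concealed_by_interpolation)
```

A weighted average of two strictly increasing LSF vectors with weights between 0 and 1 is itself strictly increasing, so linear interpolation preserves the ordering property. Interpolation needs the following frame, which adds delay that frame repetition avoids.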

Keywords: Digital signal processing; Extrapolation; Filters; Frequency; Interpolation; linear predictive coding; Packet switching; Phase change materials; Pulse modulation; Speech coding

Forward and backward LP ELS-based time-varying complex speech analysis based on output error method

3rd IEEE International Symposium on Signal Processing and Information Technology

Author: Funaki, K. Affiliation: Univ Ryukyus Comp & Networking Ctr Okinawa 9030213 Japan

ISBN: (print) 0780382927

We have previously proposed ELS-based time-varying complex AR (TV-CAR) speech analysis methods, based on forward LP as well as on forward and backward LP, in which the equation error is modeled by an AR model to whiten the error. The methods are based on an equation error method and can estimate an unbiased speech spectrum owing to the whitened equation error. These speech analysis methods may be suitable as a front end for robust speech recognition and for packet loss concealment in VoIP. This paper presents an output-error-based ELS TV-CAR speech analysis algorithm and compares its performance with that of the equation-error-based method.

Keywords: Equations; Filters; Least squares methods; linear predictive coding; Mean square error methods; Robustness; Signal analysis; Speech analysis; Speech processing; Speech synthesis

A new algorithm for calculating LSP parameters of speech signal

8th International Conference on Signal Processing

Authors: Li, Juanjuan; Yu, Yibiao; Rui, Xianyi. Affiliation: Soochow Univ Sch Elect & Informat Engn Suzhou 215021 Peoples R China

ISBN: (print) 9780780397361

The Line Spectrum Pairs (LSP) representation of linear predictive coding coefficients is widely used in speech coding, speech recognition, and other domains because of its desirable interpolation and quantization properties. Several methods proposed for calculating LSP parameters suffer from high computational complexity. This paper proposes an effective and efficient algorithm, APF, that uses the Aitken iterative method and polynomial synthetic division. LSP parameters are estimated by first obtaining a root of the N-th order nonlinear equation with the Aitken iterative method, then reducing the degree by polynomial synthetic division, and finally solving a quartic equation with Ferrari's solution. Theoretical analysis and experimental results show that the proposed algorithm has both high precision and low computational complexity.
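As background for what "calculating LSP parameters" involves, the sketch below finds line spectral frequencies by a plain grid search with bisection on the unit circle. It is not the APF algorithm of the paper (no Aitken iteration, synthetic division, or Ferrari solution). The function name, grid size, and example coefficients are assumptions, and the code assumes an even model order and a minimum-phase A(z) = 1 + a1 z^-1 + ... + ap z^-p.

```python
import math

def lpc_to_lsf(a, grid=512, iters=50):
    """Return sorted LSFs in radians for a = [1, a1, ..., ap] (p even assumed)."""
    p = len(a) - 1
    ext = list(a) + [0.0]                                   # a_{p+1} = 0
    sym = [ext[k] + ext[p + 1 - k] for k in range(p + 2)]   # P(z) = A(z) + z^-(p+1) A(1/z)
    asym = [ext[k] - ext[p + 1 - k] for k in range(p + 2)]  # Q(z) = A(z) - z^-(p+1) A(1/z)
    mid = (p + 1) / 2.0

    # On the unit circle, P and Q reduce to these real functions of w
    # (up to a common phase factor), so their roots are plain sign changes.
    def f_p(w):
        return sum(c * math.cos(w * (mid - k)) for k, c in enumerate(sym))

    def f_q(w):
        return sum(c * math.sin(w * (mid - k)) for k, c in enumerate(asym))

    def roots(f):
        eps = 1e-6                      # skip the trivial roots at w = 0 and w = pi
        ws = [eps + (math.pi - 2 * eps) * i / grid for i in range(grid + 1)]
        found = []
        for lo, hi in zip(ws, ws[1:]):
            if f(lo) * f(hi) < 0:       # sign change: refine by bisection
                for _ in range(iters):
                    m = 0.5 * (lo + hi)
                    if f(lo) * f(m) <= 0:
                        hi = m
                    else:
                        lo = m
                found.append(0.5 * (lo + hi))
        return found

    return sorted(roots(f_p) + roots(f_q))

# Second-order example: pole pair with radius 0.8 (illustrative values).
lsf = lpc_to_lsf([1.0, -0.9, 0.64])
print(lsf)
```

The grid must be fine enough that no two roots of the same polynomial fall in one grid cell; otherwise a pair of roots is missed. Avoiding such dense evaluation is one motivation for lower-complexity root-finding schemes like the one the abstract describes.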

Keywords: line spectrum pair; linear predictive coding; speech coding

The perceptual quality of MELP speech over error tolerant IP networks

33rd IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Gavula, Ben; Scheets, George; Teague, Keith; Weber, Justin. Affiliation: Oklahoma State Univ Sch Elect & Comp Engn Stillwater OK 74078 USA

ISBN: (print) 9781424414833

Modifications to IP-based packet network protocols are examined that would make the network tolerant of bit errors in packet payloads or headers. These modifications are tested with communication-quality MELP voice traffic. As measured by a PESQ score, improvements in the perceptual quality of the speech are noted that are maximized when error checking is disabled for the entire packet.

Keywords: vocoders; transport protocols; internet; error analysis; linear predictive coding

A NEW EXCITATION MODEL FOR LPC VOCODER AT 2.4 KB/S

INTERNATIONAL CONF ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING

Authors: ZHANG, XW; CHEN, XZ. Affiliation: Nanjing Institute of Communications Engineering Nanjing China

ISBN: (print) 0780305329

A novel excitation model called the multicategory vector excitation (MCVE) model for a linear predictive coding (LPC) vocoder at 2.4 kb/s is proposed. In this model, the speech signal is classified into four categories: unvoiced, voiced, onset, and offset. For every category of speech, an excitation codebook is available. Different excitation codebooks hold different characteristics. The analysis-by-synthesis procedure is used to select the excitation vectors. A computer simulation has been carried out, and the results show that the vocoder with the new excitation model is capable of synthesizing more intelligible and more natural speech at 2.4 kb/s.

Keywords: unvoiced excitation; voiced excitation; onset excitation; offset excitation; speech coding; excitation model; LPC vocoder; multicategory vector excitation; linear predictive coding; speech signal; excitation codebook; analysis-by-synthesis procedure; excitation vectors; 2.4 kbit/s; vocoders
