Search results - Inner Mongolia University Library

International Conference on Manufacturing Science and Technology (ICMST 2011)

Authors: Xiao, Qiang; Chen, Liang; Geng, Chao (PLA Univ Sci & Tech, Inst Commun Engn, Nanjing, Jiangsu, Peoples R China; PLA Ruian Unit 92985, Ruian, Peoples R China)

ISBN: (print) 9783037852958

This paper presents a low bit rate speech coder based on predictive lattice vector quantization (PLVQ) and time-scale modification (TSM). The coding model of the proposed vocoder is built on MELP, and the bit rate is reduced by taking advantage of the PLVQ and TSM techniques. PLVQ is used to encode the speech line spectrum pair (LSP) parameters; it has lower implementation complexity than multi-stage vector quantization (MSVQ) and requires no memory for codebook storage. With our speech database, PLVQ can save up to 4 bits/frame compared to unstructured-codebook MSVQ. TSM changes the speed of a speech signal while preserving its perceptual characteristics. By adding TSM as pre- and post-processing, speech coding at a bit rate of about 1.1 kbps can be achieved easily without modifying the vocoder structure.

Keywords: low bit rate speech coding; lattice vector quantization; time-scale modification; LSP parameters
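
As an illustration of the codebook-free property mentioned in the abstract above, the following Python sketch quantizes prediction residuals to a scaled integer lattice by plain rounding. It is not the paper's PLVQ design: the lattice (`step * Z^n`), the first-order predictor, and the names `nearest_zn_point`, `plvq_roundtrip`, `step`, and `rho` are all assumptions chosen for brevity.

```python
import numpy as np

def nearest_zn_point(x, step):
    """Nearest point of the scaled integer lattice step * Z^n.

    The lattice point is found by rounding, so no codebook is stored.
    """
    return step * np.round(np.asarray(x, dtype=float) / step)

def plvq_roundtrip(frames, step=0.02, rho=0.5):
    """Hypothetical first-order predictive lattice quantizer (encode + decode).

    frames: array of shape (num_frames, dim), e.g. LSP vectors.
    rho:    assumed prediction coefficient applied to the previous
            reconstructed frame.
    """
    frames = np.asarray(frames, dtype=float)
    prev = np.zeros(frames.shape[1])
    out = np.empty_like(frames)
    for t, x in enumerate(frames):
        pred = rho * prev                          # predict from the last reconstruction
        residual_q = nearest_zn_point(x - pred, step)  # quantize the residual on the lattice
        out[t] = pred + residual_q                 # decoder-side reconstruction
        prev = out[t]
    return out
```

Because the residual is rounded to the nearest lattice point, each reconstructed component differs from the input by at most `step / 2`.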

Split matrix quantization of LPC parameters

IEEE Transactions on Speech and Audio Processing, 1999, vol. 7, no. 2, pp. 113-125

Authors: Xydeas, CS; Papanastasiou, C (Univ Manchester, Manchester Sch Engn, Elect Engn Div, Speech Proc Res Lab, Manchester M13 9PL, Lancs, England)

This paper examines in detail the design issues and performance characteristics of linear predictive coding (LPC) split matrix quantization (SMQ). This efficient LPC quantization method, which was recently proposed by the authors [1], can be viewed as an extension of the conventional split vector quantization (SVQ) process. SMQ removes existing interframe/intraframe line spectral frequency (LSF) redundancy by applying VQ principles to the trajectories of LSF coefficients, which evolve smoothly with time. Using a 20 ms LPC analysis frame size, "transparent" quantization is achieved at 900 b/s, whereas "high quality" LSF quantization is easily obtained at 650 b/s. Furthermore, the SMQ methodology offers valuable flexibility in the way quantization of LPC coefficients is performed and leads to several schemes of varying computational complexity and storage characteristics.

Keywords: low bit rate speech coding; LPC quantization; spectral quantization
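
The rates quoted in the abstract above can be turned into an average bit budget per 20 ms analysis frame. The short Python sketch below does only that arithmetic; the helper name `bits_per_frame` is hypothetical, and the figure is an average, since a matrix quantizer codes several frames jointly.

```python
def bits_per_frame(bit_rate_bps, frame_ms):
    """Average number of bits available per analysis frame."""
    return bit_rate_bps * frame_ms / 1000.0

# Rates and frame size taken from the abstract.
transparent = bits_per_frame(900, 20)   # "transparent" quantization
high_quality = bits_per_frame(650, 20)  # "high quality" quantization
print(transparent, high_quality)
```

Under these figures, 900 b/s corresponds to 18 bits per frame on average and 650 b/s to 13 bits per frame.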

Differential coding of speech LSF parameters using hybrid vector quantization and bidirectional prediction

IEEE Transactions on Speech and Audio Processing, 2000, vol. 8, no. 2, pp. 208-211

Authors: da Silva, LM; Alcaim, A (Univ Brasilia, Dept Elect Engn, BR-22453900 Rio De Janeiro, Brazil)

This correspondence presents a new strategy to encode the LP short-time spectral envelope (STSE) of speech. A better reconstruction of the STSE is achieved by modifying the usual trade-off between the transmission rate of LP parameters and the performance of the quantization algorithm. Differential coding based on bidirectional prediction and hybrid vector quantization is used to compensate for the increase in transmission rate. Simulation results show the effectiveness of this coding strategy.

Keywords: low bit rate speech coding; linear predictive coding (LPC) quantization; spectral quantization

WARP-Q: Quality Prediction for Generative Neural Speech Codecs

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Jassim, Wissam A.; Skoglund, Jan; Chinen, Michael; Hines, Andrew (Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland; Google, Chrome Media, San Francisco, CA, USA)

ISBN: (print) 9781728176055

Good speech quality has been achieved using waveform matching and parametric reconstruction coders. Recently developed very low bit rate generative codecs can reconstruct high quality wideband speech with bit streams of less than 3 kb/s. These codecs use a DNN with parametric input to synthesise high quality speech outputs. Existing objective speech quality models (e.g., POLQA, ViSQOL) do not accurately predict the quality of coded speech from these generative models, underestimating quality due to signal differences not highlighted in subjective listening tests. We present WARP-Q, a full-reference objective speech quality metric that uses dynamic time warping cost for MFCC speech representations. It is robust to small perceptual signal changes. Evaluation using waveform matching, parametric, and generative neural vocoder based codecs, as well as channel and environmental noise, shows that WARP-Q has better correlation and codec quality ranking for novel codecs compared to traditional metrics, in addition to versatility for general quality assessment scenarios.

Keywords: dynamic time warping; low bit rate speech coding; LPCNet; WaveNet; speech quality
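
To make the "dynamic time warping cost" named in the abstract above concrete, here is a minimal Python sketch of the classic DTW recurrence between two feature sequences. It is not the WARP-Q implementation: the function name `dtw_cost`, the Euclidean frame distance, and the use of generic feature rows in place of real MFCCs are assumptions for illustration.

```python
import numpy as np

def dtw_cost(a, b):
    """Accumulated DTW alignment cost between two feature sequences.

    a, b: arrays of shape (num_frames, num_features), e.g. MFCC frames.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # Pairwise Euclidean distance between every frame of a and every frame of b.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three allowed predecessor paths.
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m]
```

A sequence and a time-stretched copy of it align at zero cost, which hints at why a DTW-based cost tolerates small timing differences between reference and coded speech.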

Improvement of 2.4kbps LPC10 algorithm based on LSF parameters

6th International Symposium on Signal Processing Systems (SSPS)

Authors: Yu, Xin; You, Xingyuan; Yang, Ming; Liu, Xiaoling (Wuhan Maritime Commun Res Inst, Wuhan, Peoples R China)

ISBN: (print) 9798400716171

Short-wave communication has unstable channel conditions, but it is still widely used in military and diplomatic fields due to its security, high destruction resistance, and full coverage. Therefore, to adapt to transmission over shortwave channels, further research is needed on low rate voice coding algorithms under low-reliability channel conditions. In this paper, we propose a coding algorithm that converts linear prediction coefficients into line spectral frequency (LSF) parameters. On the basis of the traditional 2.4 kbps LPC10 algorithm, we reduce the coding rate to 1.7 kbps and score the synthesized results with the PESQ algorithm. The results show that the improved algorithm yields a voice score of 2.1870, an increase of 0.9315 over the LPC10 algorithm, a significant improvement in voice quality.

Keywords: LPC 10; line spectrum frequency; PESQ; low bit rate speech coding; speech synthesis
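
The conversion from linear prediction coefficients to line spectral frequencies mentioned in the abstract above can be sketched with the textbook sum/difference polynomial construction. This is a generic illustration in Python, not the paper's implementation; the function name `lpc_to_lsf` and the root-finding approach via `numpy.roots` are assumptions.

```python
import numpy as np

def lpc_to_lsf(a):
    """Convert LPC coefficients [1, a1, ..., ap] to LSFs in radians.

    Forms P(z) = A(z) + z^-(p+1) A(1/z) and Q(z) = A(z) - z^-(p+1) A(1/z),
    then keeps the root angles that lie strictly between 0 and pi.
    """
    a = np.asarray(a, dtype=float)
    a_ext = np.concatenate([a, [0.0]])        # A(z) padded to order p + 1
    a_rev = np.concatenate([[0.0], a[::-1]])  # reversed (mirrored) coefficients
    p_poly = a_ext + a_rev                    # symmetric polynomial
    q_poly = a_ext - a_rev                    # antisymmetric polynomial
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    eps = 1e-6                                # drops the trivial roots at z = 1 and z = -1
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])
```

For a stable predictor of order `p`, this returns `p` increasing frequencies in which the roots of the two polynomials alternate.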

Phonological vocoding using artificial neural networks

40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015

Authors: Cernak, Milos; Potard, Blaise; Garner, Philip N. (Idiap Research Institute, Martigny, Switzerland)

ISBN: (print) 9781467369978

We investigate a vocoder based on artificial neural networks using a phonological speech representation. Speech decomposition is based on phonological encoders, realised as neural network classifiers, that are trained for a particular language. The speech reconstruction process uses a Deep Neural Network (DNN) to map phonological feature posteriors to speech parameters (line spectra and glottal signal parameters), followed by LPC resynthesis. This DNN is trained on a target voice without transcriptions, in a semi-supervised manner. Both encoder and decoder are based on neural networks, and thus the vocoding is achieved using a simple, fast forward pass. An experiment with French vocoding and a target male voice trained on a 21-hour audio book is presented. An application of the phonological vocoder to low bit rate speech coding is shown, where transmitted phonological posteriors are pruned and quantized. The vocoder with scalar quantization operates at 1 kbps, with potential for a lower bit rate. © 2015 IEEE.

Keywords: low bit rate speech coding; parametric vocoding; phonology

Phonological vocoding using artificial neural networks

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: M. Cernak; B. Potard; P. N. Garner (Idiap Res. Inst., Martigny, Switzerland)

ISBN: (print) 9781467369985

We investigate a vocoder based on artificial neural networks using a phonological speech representation. Speech decomposition is based on phonological encoders, realised as neural network classifiers, that are trained for a particular language. The speech reconstruction process uses a Deep Neural Network (DNN) to map phonological feature posteriors to speech parameters (line spectra and glottal signal parameters), followed by LPC resynthesis. This DNN is trained on a target voice without transcriptions, in a semi-supervised manner. Both encoder and decoder are based on neural networks, and thus the vocoding is achieved using a simple, fast forward pass. An experiment with French vocoding and a target male voice trained on a 21-hour audio book is presented. An application of the phonological vocoder to low bit rate speech coding is shown, where transmitted phonological posteriors are pruned and quantized. The vocoder with scalar quantization operates at 1 kbps, with potential for a lower bit rate.

Keywords: parametric vocoding; low bit rate speech coding; phonology; vocoding; vocoders; artificial neural networks; bit rate; scalar quantization; encoder; speaking; neural network; spectral lines; signal parameter; genetic transcription; speech decoders
