Search results - Inner Mongolia University Library

Companded quantization of speech MDCT coefficients

IEEE Transactions on Speech and Audio Processing, 2005, vol. 13, no. 2, pp. 163-173

Authors: Nordén, F; Hedelin, P (Aalborg Univ, Dept Commun Technol, DK-9220 Aalborg, Denmark; Chalmers Univ Technol, Informat Theory Lab, S-41296 Gothenburg, Sweden)

Here, we propose speech-coding procedures that achieve high subjective quality while avoiding speech-specific processing and interframe exploitation. Thus, the scheme is tractable for packet-based voice communication and is capable of coding generic audio. The architecture is based on a modified discrete cosine transform (MDCT) representation of the signal, and combines efficient vector quantization (VQ) techniques with psychoacoustic principles. Weighted quantization of MDCT coefficients is performed, using a codebook based on a statistical model of the multidimensional pdf. The weighting and the codebook are adapted for each frame to account for masking thresholds given by a psychoacoustic analysis. Actual quantization is performed using lattices, thereby achieving close to rate-independent complexity. The result is a coding scheme that is operational over a range of rates. Here, a particular instance at 16 kbit/s, using a sampling frequency of 8 kHz, is shown to perform better than an LD-CELP coder operating at the same rate, even though no interframe memory is exploited.

Keywords: audio coding; modified discrete cosine transform (MDCT); psychoacoustics; speech-coding; statistical modeling; vector quantization (VQ)
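The abstract above describes weighted, companded quantization on a lattice. The following is only a minimal sketch of that general idea, not the authors' method: it assumes a mu-law compressor as a stand-in for the paper's model-based compressor, a scaled integer lattice Z^n in place of whatever lattice the paper uses, and hypothetical names (`compress`, `expand`, `companded_quantize`, `step`, `mu`). The per-coefficient `weights` stand in for the masking-derived weighting.

```python
import math

def compress(x, mu=255.0):
    """mu-law compressor on [-1, 1] (assumed stand-in for a model-based compressor)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=255.0):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def companded_quantize(coeffs, weights, step=0.01, mu=255.0):
    """Weight each coefficient, compress it, round to the scaled integer
    lattice, then expand and undo the weighting."""
    out = []
    for c, w in zip(coeffs, weights):
        y = compress(max(-1.0, min(1.0, c * w)), mu)  # clip to the compressor's range
        index = round(y / step)                       # nearest point of step * Z
        out.append(expand(index * step, mu) / w)
    return out

# Usage: small coefficients are resolved more finely than large ones.
coeffs = [0.5, -0.01, 0.0, 0.2]
weights = [1.0, 2.0, 1.0, 0.5]
reconstructed = companded_quantize(coeffs, weights)
```

Because rounding to the integer lattice costs the same regardless of `step`, the search effort does not grow with the rate, which is the property the abstract attributes to lattice quantization.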

Channel and source considerations of a MELP derived vocoder operating at reduced bit rates

Digital Signal Processing, 2003, vol. 13, no. 4, pp. 623-635

Authors: Ilk, HG; Tugaç, S (Ankara Univ, Dept Elect Engn, TR-06100 Ankara, Turkey)

Robust low bit rate speech coders are essential in commercial and military communication systems. They operate at fixed bit rates, and those bit rates cannot be altered without major modifications to the vocoder design. In this paper we introduce a scaled speech coder, which operates on time-scale modified input speech. The proposed method offers any bit rate from 2400 b/s downwards without modifying the principal vocoder structure, which is the mixed excitation linear prediction (MELP) vocoder. We consider the application of transmitting MELP-encoded speech over noisy communication channels after time scale compression is applied. Computer simulation results, both source and channel, are presented in terms of objective speech quality metrics and informal subjective listening tests. A statistical tool called the bootstrap is also used to determine the accuracy of these test results. Design parameters such as codec complexity and delay are also investigated. (C) 2003 Elsevier Inc. All rights reserved.

Keywords: speech-coding; channel characterization and simulation; low bit rate vocoders; time scale modification of speech
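Two ideas in the abstract above lend themselves to a short sketch. The first is the rate arithmetic, under the assumption that a fixed-rate vocoder encodes time-compressed speech, so the rate per second of original speech scales with the compression ratio. The second is a percentile bootstrap interval for a mean score. The function names, the example scores, and the choice of the percentile method are all illustrative assumptions; the record does not say how the authors applied the bootstrap.

```python
import random

def effective_bit_rate(vocoder_rate_bps, compression_ratio):
    """Bits per second of ORIGINAL speech when the vocoder input has been
    time-compressed. compression_ratio = compressed duration / original
    duration, so 1.0 means no compression."""
    if not 0.0 < compression_ratio <= 1.0:
        raise ValueError("compression_ratio must be in (0, 1]")
    return vocoder_rate_bps * compression_ratio

def bootstrap_mean_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of scores."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(scores)
    means = sorted(
        sum(rng.choice(scores) for _ in range(n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Usage: a 2400 b/s vocoder fed speech compressed to half its duration.
rate = effective_bit_rate(2400, 0.5)  # 1200.0 b/s

# Hypothetical listening-test scores, invented purely for illustration.
scores = [3.1, 3.4, 2.9, 3.8, 3.3, 3.0, 3.6, 3.2]
low, high = bootstrap_mean_ci(scores)
```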


Wavelets based on splines: An application

4th Conference on Wavelet Applications in Signal and Image Processing

Authors: Srinivasan, P; Jamieson, LH (Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907)

ISBN: (print) 0819422134

In this paper, we describe the theory and implementation of a variable rate speech coder using the cubic spline wavelet decomposition. In the discrete time wavelet extrema representation, Cvetkovic et al. implement an iterative projection algorithm to reconstruct the wavelet decomposition from the extrema representation. Based on this model, we previously described a technique for speech coding using the extrema representation, which suggests that the non-decimated extrema representation allows us to exploit the pitch redundancy in speech. A drawback of that scheme is the audible perceptual distortion caused by the iterative algorithm, which fails to converge on some speech frames. This paper attempts to alleviate the problem by showing that, for a particular class of wavelets that implements the ladder of spaces consisting of the splines, the iterative algorithm can be replaced by an interpolation procedure. Conditions under which the interpolation reconstructs the transform exactly are identified. One of the advantages of the extrema representation is the 'denoising' effect. A least squares technique to reconstruct the signal is constructed. The effectiveness of the scheme in reproducing significant details of the speech signal is illustrated using an example.

Keywords: wavelets; splines; bi-orthogonal; speech-coding
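The abstract above replaces an iterative reconstruction with interpolation from extrema. As a rough illustration of that idea only, the sketch below keeps the local extrema of a sequence and fills the gaps with cubic Hermite pieces that have zero slope at each kept point. This is an assumed simplification: it operates on a plain sequence rather than on spline wavelet coefficients, it is not the interpolation procedure of the paper, and the function names are hypothetical.

```python
import math

def local_extrema(x):
    """Indices of strict local maxima and minima of x, plus both endpoints.
    Assumes len(x) >= 2."""
    idx = [0]
    for n in range(1, len(x) - 1):
        if (x[n] - x[n - 1]) * (x[n + 1] - x[n]) < 0:  # slope changes sign
            idx.append(n)
    idx.append(len(x) - 1)
    return idx

def interpolate_from_extrema(idx, values, length):
    """Rebuild a length-sample sequence from the kept points using cubic
    Hermite segments with zero slope at every kept point."""
    y = [0.0] * length
    for k in range(len(idx) - 1):
        i0, i1 = idx[k], idx[k + 1]
        v0, v1 = values[k], values[k + 1]
        for n in range(i0, i1 + 1):
            t = (n - i0) / (i1 - i0)
            y[n] = v0 + (v1 - v0) * (3 * t * t - 2 * t ** 3)
    return y

# Usage: two periods of a sine wave sampled 16 times per period.
x = [math.sin(2 * math.pi * n / 16) for n in range(33)]
idx = local_extrema(x)
y = interpolate_from_extrema(idx, [x[i] for i in idx], len(x))
```

The reconstruction matches the signal at the kept points and is only approximate in between, which is why the conditions for exact reconstruction mentioned in the abstract matter.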
