Search results - Inner Mongolia University Library

An efficient implementation of the forward and inverse MDCT in MPEG audio coding

IEEE SIGNAL PROCESSING LETTERS, 2001, vol. 8, no. 2, pp. 48-51

Authors: Britanak, V; Rao, KR. Slovak Acad Sci, Inst Control Theory & Robot, Bratislava 84237, Slovakia; Univ Texas, Dept Elect Engn, Arlington, TX 76019, USA

The modified discrete cosine transform (MDCT) is employed in subband/transform coding schemes as the analysis/synthesis filter bank based on time domain aliasing cancellation (TDAC). The most efficient implementation of the forward and inverse MDCT computation for layer III in MPEG-1 and MPEG-2 international audio coding standards is proposed. It is based on a new fast algorithm for the forward and inverse MDCT computation in the oddly stacked system. The complete signal flow graphs for the implementation of MDCT and inverse MDCT in layer III are also provided.

Keywords: audio coding; modified discrete cosine transform (MDCT); MPEG
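
As a rough illustration of the TDAC property mentioned in the abstract above, the following Python sketch evaluates the oddly stacked MDCT and its inverse directly from their cosine-sum definitions and overlap-adds sine-windowed blocks. It is a plain O(N^2) evaluation, not the fast algorithm or the signal flow graphs proposed in the paper; the function names, the sine window, and the 2/N scaling of the inverse are assumptions chosen for this example.

```python
import math

def sine_window(n_coeffs):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    length = 2 * n_coeffs
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block):
    """Direct MDCT: 2N input samples -> N coefficients."""
    n_coeffs = len(block) // 2
    shift = 0.5 + n_coeffs / 2
    return [
        sum(x * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_coeffs)
    ]

def imdct(coeffs):
    """Direct inverse MDCT: N coefficients -> 2N time-aliased samples."""
    n_coeffs = len(coeffs)
    shift = 0.5 + n_coeffs / 2
    return [
        (2.0 / n_coeffs)
        * sum(c * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
              for k, c in enumerate(coeffs))
        for n in range(2 * n_coeffs)
    ]

def tdac_roundtrip(signal, n_coeffs):
    """Window, transform, inverse-transform, and overlap-add with hop N."""
    if len(signal) % n_coeffs:
        raise ValueError("signal length must be a multiple of n_coeffs")
    window = sine_window(n_coeffs)
    # Pad one hop of zeros on each side so every sample lies in two blocks.
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * n_coeffs
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - n_coeffs, n_coeffs):
        block = [padded[start + n] * window[n] for n in range(2 * n_coeffs)]
        restored = imdct(mdct(block))
        for n, y in enumerate(restored):
            output[start + n] += y * window[n]  # aliasing cancels here
    return output[n_coeffs:-n_coeffs]
```

Calling `tdac_roundtrip(signal, 18)` uses 36-sample blocks, the long-block size of layer III; each block alone is time-aliased, and the original samples reappear only after adjacent blocks are overlap-added.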

Scalable embedded zero tree wavelet packet audio coding

3rd IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC 01)

Authors: Chang, PC; Lin, JH. Natl Cent Univ, Dept Elect Engn, Chungli 32045, Taiwan

ISBN: (print) 0780367200

Multimedia transmission over the Internet is becoming popular and increasingly important. In particular, scalable coding is desirable for heterogeneous networks with varying bandwidths. In this work, we propose a scalable embedded zero tree wavelet packet (Scalable EZWP) audio coding system, a scalable audio compression system that uses wavelet packet decomposition and embedded zero-tree coding. We focus on multi-layer low bitrate coding which delivers high perceptual quality. In the base layer, the overlapped audio segment is first transformed by wavelet packet. Then the local significant coefficients are extracted, quantized, and coded by variable length coding. In the enhancement layer and the full band layer, the residual signal, that is, the difference between the original and the output of the previous layer, is coded via EZW with a psychoacoustic model and arithmetic coding. The target bit rates for the three layers are 16, 32, and 64 Kbps, respectively. The performance of the proposed coding system is only slightly inferior to MPEG-1 layer 3 at 64 Kbps, while it provides bitrate scalability that is suitable for multimedia distribution over heterogeneous Internet networks.

Keywords: Arithmetic; audio coding; audio compression; Bandwidth; Bit rate; Internet; Nonhomogeneous media; Psychoacoustic models; Scalability; Wavelet packets

Modulation frequency and efficient audio coding

Conference on Advanced Signal Processing Algorithms, Architectures, and Implementations XI

Authors: Atlas, LE; Vinton, MS. Univ Washington, Dept Elect Engn, Seattle, WA 98195, USA

ISBN: (print) 0819441880

The concept of "modulation frequency" is shown to be a valuable insight into time-frequency transforms for audio coding. A two-dimensional transform, where the second dimension approximately decomposes the audio signal into modulation frequencies, is proposed. This transform, when applied to audio coding, provides high quality at low data rates and adapts gracefully to changes in available bandwidth. It is inherently scalable, meaning that channel conditions can be matched without the need for additional computation. Moreover, it is compact: in subjective tests our algorithm, coded at 32 kilobits/second/channel, outperformed MPEG-1 Layer 3 (MP3) coded at 56 kilobits/second/channel (both at 44.1 kHz). This potentially useful result motivates the need for further insight into the definition and analysis of modulation frequency. We thus define modulation frequency for a simple narrowband signal, propose a general bilinear framework for detection, and then propose a minimal set of conditions to extend this definition to broadband signals such as audio.

Keywords: time-frequency analysis; modulation frequency; audio coding; audio compression; scalable coding

Binaural cue coding - Part II: Schemes and applications

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, vol. 11, no. 6, pp. 520-531

Authors: Faller, C; Baumgarte, F. Agere Syst, Media Signal Proc Res Dept, Allentown, PA 18109, USA

Binaural Cue Coding (BCC) is a method for multichannel spatial rendering based on one down-mixed audio channel and side information. The companion paper (Part I) covers the psychoacoustic fundamentals of this method and outlines principles for the design of BCC schemes. The BCC analysis and synthesis methods of Part I are motivated and presented in the framework of stereophonic audio coding. This paper, Part II, generalizes the basic BCC schemes presented in Part I. It includes BCC for multichannel signals and employs an enhanced set of perceptual spatial cues for BCC synthesis. A scheme for multichannel audio coding is presented. Moreover, a modified scheme is derived that allows flexible rendering of the spatial image at the receiver, supporting dynamic control. All aspects of complete BCC encoder and decoder implementations are discussed, such as down-mixing of the input signals, low-complexity estimation of the spatial cues, and quantization and coding of the side information. Application examples are given, and the performance of the coder implementations is evaluated and discussed based on subjective listening test results.

Keywords: audio coding; auralization; binaural signal; HRTF; multichannel audio; spatial image; spatial rendering; stereo audio; surround sound

Cylindrical Antennas and Arrays [Book Review]

IEEE Circuits and Devices Magazine, 2004, vol. 20, no. 6, p. 52

Author: D. Torrungrueng, Asian University of Science and Technology

Audio coding using sorted sinusoidal parameters

IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: M. Raad; I.S. Burnett. School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Wollongong, NSW, Australia

ISBN: (print) 0780366859

This paper describes a new audio coding scheme based on sinusoidal coding of signals. Sinusoidal coding permits the representation of a given signal through the summation of sinusoids. The parameters of the sinusoids (the amplitudes, phases and frequencies) are transmitted to allow the signal reconstruction. In the proposed scheme, the sinusoidal parameters are sorted according to energy content and perceptual significance. The most significant parameters are transmitted first, allowing the use of only a small set of the parameters for signal reconstruction. The proposed scheme incurs a low delay and uses a 20 ms frame length. Results show that the coder operating at a mean rate of 39 kb/s performs favorably in comparison with the MPEG-4 coder at 42 kb/s.

Keywords: audio coding; Frequency; Bit rate; Signal reconstruction; Sorting; Fourier transforms; Signal synthesis; Australia; Delay; MPEG 4 Standard
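
The idea of transmitting the most significant sinusoids first can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: partials are ranked by energy alone (amplitude squared), the perceptual-significance weighting described in the abstract is left out, and the function names, the tuple layout, and the 8 kHz sample rate in the usage note are hypothetical rather than taken from the paper.

```python
import math

def sort_by_energy(partials):
    """Order (amplitude, frequency_hz, phase_rad) tuples by descending energy.

    Assumption: energy is taken as amplitude squared; the paper also uses
    perceptual significance, which is not modeled here.
    """
    return sorted(partials, key=lambda p: p[0] ** 2, reverse=True)

def synthesize(partials, sample_rate, num_samples):
    """Rebuild one frame as a sum of stationary cosines."""
    return [
        sum(a * math.cos(2.0 * math.pi * f * n / sample_rate + ph)
            for a, f, ph in partials)
        for n in range(num_samples)
    ]

def reconstruct_top_k(partials, k, sample_rate, num_samples):
    """Reconstruct a frame from only the k highest-energy partials."""
    return synthesize(sort_by_energy(partials)[:k], sample_rate, num_samples)

def error_energy(reference, approximation):
    """Sum of squared differences between two frames."""
    return sum((r - a) ** 2 for r, a in zip(reference, approximation))
```

For example, with a 20 ms frame at an assumed 8 kHz sample rate (160 samples), `reconstruct_top_k(partials, 1, 8000, 160)` uses only the strongest partial, and the reconstruction error shrinks as `k` grows, which is what lets a decoder stop early when fewer bits are available.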

Scalable audio coding based on hierarchical transform coding modules

ELECTRONICS AND COMMUNICATIONS IN JAPAN PART III-FUNDAMENTAL ELECTRONIC SCIENCE, 2001, vol. 84, no. 8, pp. 34-45

Authors: Jin, A; Moriya, T; Iwakami, N; Miki, S. NTT Cyber Space Labs, Musashino, Tokyo 1808585, Japan

We propose a method that hierarchically quantizes wideband Modified Discrete Cosine Transform (MDCT) coefficients by developing a module that has a transform coding method primarily for audio as the basic structural unit and freely using this module multiple times at the desired frequencies. The major feature of this method is to implement a simple structure having a high degree of freedom in scalable coding to hierarchically quantize MDCT coefficients over a wide band of frequencies by sharing the proposed module and using it multiple times. This paper presents examples using combinations of the module operating at a sampling frequency of 48 kHz and a bit rate of at least 8 kbit/s. In this example, a bit rate of at least 8 kbit/s and a reconstructed frequency band of at least 4 kHz can be selected as the objective. Subjective evaluation tests are performed to verify the effectiveness of the proposed method. (C) 2001 Scripta Technica

Keywords: audio coding; transform coding; TwinVQ coding; scalable coding; hierarchical quantization

The multimode transform predictive coding paradigm

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, vol. 11, no. 2, pp. 117-129

Author: Ramprashad, SA. Media Signal Proc Res, Agere Syst, Berkeley Hts, NJ 07922, USA

Presented is a new coding paradigm, Multimode Transform Predictive Coding (MTPC), which combines speech and audio coding principles in a single coding structure. The paradigm is an adaptive coding paradigm which automatically adjusts how different coding modules are used based on the input signal. This allows MTPC coders to robustly handle a wider range of signals than single configuration (mode) Transform Predictive Coding (TPC) designs. A wideband MTPC coder design targeting two-way communication applications and bitrates from 13 to 40 kbit/s is also presented. Subjective Absolute Category Rating test results on speech, speech in noise and music demonstrate that the performance at 16, 24 and 32 kbit/s meets or exceeds that of ITU-T Rec. G.722 at 48, 56 and 64 kbit/s respectively for many coding conditions. Subjective Reference-ABx (R-ABx) tests are also included to show the potential advantages of the multimode coder over a single mode TPC coder. Finally, possible improvements in the MTPC coder design for applications such as broadcasting, which are less sensitive to delay and encoder complexity, are discussed.

Keywords: adaptive coder; audio coding; open-loop; speech coding; subjective test; wideband

Binaural cue coding - Part I. Psychoacoustic fundamentals and design principles

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, vol. 11, no. 6, pp. 509-519

Authors: Baumgarte, F; Faller, C. Agere Syst, Media Signal Proc Res Dept, Allentown, PA 18109, USA

Binaural Cue Coding (BCC) is a method for multichannel spatial rendering based on one down-mixed audio channel and BCC side information. The BCC side information has a low data rate and it is derived from the multichannel encoder input signal. A natural application of BCC is multichannel audio data rate reduction since only a single down-mixed audio channel needs to be transmitted. An alternative BCC scheme for efficient joint transmission of independent source signals supports flexible spatial rendering at the decoder. This paper (Part I) discusses the most relevant binaural perception phenomena exploited by BCC. Based on that, it presents a psychoacoustically motivated approach for designing a BCC analyzer and synthesizer. This leads to a reference implementation for analysis and synthesis of stereophonic audio signals based on a Cochlear Filter Bank. BCC synthesizer implementations based on the FFT are presented as low-complexity alternatives. A subjective audio quality assessment of these implementations shows the robust performance of BCC for critical speech and audio material. Moreover, the results suggest that the performance given by the reference synthesizer is not significantly compromised when using a low-complexity FFT-based synthesizer. The companion paper (Part II) generalizes BCC analysis and synthesis for multichannel audio and proposes complete BCC schemes including quantization and coding. Part II also describes an alternative BCC scheme with flexible rendering capability at the decoder and proposes several applications for both BCC schemes.

Keywords: audio coding; auditory filter bank; auditory scene synthesis; binaural source localization; coding of binaural spatial cues; spatial rendering

A hierarchical lossless/lossy coding system for high quality audio up to 192 kHz sampling 24 bit format

IEEE International Conference on Consumer Electronics (ICCE)

Authors: A. Jin; T. Moriya; K. Ikeda; D.T. Yang. NTT Cyber Space Laboratories, NTT Corporation, Japan

We present a flexible audio coding system for use in the unbroken transmission of high-quality (192 kHz sampling rate, 24-bit digitization (max.)) stereo-audio data streams, either live or recorded. The system provides both lossless and variable-level lossy quality; quality is selectable to suit a full range of wide- and narrow-band IP networks. As input signals for transmission at the server PC, fewer than nine PCM sound files in different formats can be encoded simultaneously, and the efficiency of simultaneous compression is very high. The system is realized by software that runs on a typical PC. This lets anyone transmit high-quality sound anywhere within IP networks. An MPEG-4 audio codec (TwinVQ or AAC) provides the core of the lossy-coding module.

Keywords: Sampling methods; IP networks; audio coding; Streaming media; Narrowband; File servers; Network servers; Phase change materials; MPEG 4 Standard; Codecs
