Search Results - Inner Mongolia University Library

Audio coding using variable-depth multistage quantization

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, vol. 6, no. 2, pp. 186-189

Authors: Kossentini, F; Macon, M; Smith, MJT. Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T 1Z4, Canada

An algorithm for high-quality coding of 48 kHz sampled audio signals is presented. The algorithm employs a perceptual transform and a variable-depth multistage quantizer. The resulting audio reproduction quality is better than that of the Motion Pictures Expert Group (MPEG) layer I coder and roughly equivalent to that of the MPEG layer II coder.

Keywords: audio coding; Quantization; Frequency; Psychoacoustic models; Masking threshold; Motion pictures; Transform coding; Auditory system; Signal analysis; Smoothing methods

Fast algorithm for computing the forward and inverse MDCT in MPEG audio coding

SIGNAL PROCESSING, 2006, vol. 86, no. 5, pp. 1055-1060

Authors: Truong, TK; Chen, PD; Cheng, TC. I Shou Univ, Dept Informat Engn, Coll Elect & Informat Engn, Ta Hsu Hsiang 84008, Kaohsiung Cty, Taiwan; Univ So Calif, Dept Elect Engn Electrophys, Los Angeles, CA 90089, USA

The modified discrete cosine transform (MDCT) is always employed in transform-coding schemes as the analysis/synthesis filter bank. In this paper, an efficient algorithm for MDCT and inverse MDCT (IMDCT) computation for MPEG-1 audio layer III and MPEG-2 international audio-coding standards is proposed, using only the type-II DCT. Finally, the proposed algorithm is compared to the similar algorithms in this paper. (C) 2005 Elsevier B.V. All rights reserved.

Keywords: MDCT/IMDCT; audio coding; type-II DCT; MPEG

Compression of Higher-Order Ambisonic Signals Using Directional Audio Coding

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, vol. 32, pp. 651-665

Authors: Hold, Christoph; Pulkki, Ville; Politis, Archontis; Mccormack, Leo. Aalto Univ, Dept Informat & Commun Engn, Acoust Lab, Espoo 02150, Finland; Tampere Univ, Fac Informat Technol & Commun Sci, Tampere 33100, Finland

Delivering high-quality spatial audio in the Ambisonics format requires extensive data bandwidth, which may render it inaccessible for many low-bandwidth applications. Existing widely-available multi-channel audio compression codecs are not designed to consider the characteristic inter-channel relations inherent to the Ambisonics format, and thus may not leverage this knowledge to optimise the compression. Therefore, this article proposes a spatial audio compression algorithm, based on a novel reformulation of the Higher-Order Directional Audio Coding (HO-DirAC) method, which is specifically intended for compressing higher-order Ambisonic audio streams. The methodology builds upon the concept of a spherical filter bank acting in the spherical harmonic domain. This results in directionally constrained sound-field estimates and parameterization, which may be utilized to reconstruct the input Ambisonic signals with minimal perceived loss of quality. The results of a listening experiment indicate high perceptual quality when using six or more audio transport channels to deliver fifth-order (36 channels) Ambisonic sound scenes. The proposed formulation is also designed with low computational complexity in mind and may therefore be well suited for compressing Ambisonic sound scenes for a wide range of applications.

Keywords: Ambisonics; spatial audio; audio coding

Real-time MPEG-1 audio coding and decoding on a DSP chip

IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 1997, vol. 43, no. 1, pp. 40-47

Authors: Murphy, CD; Anandakumar, K. Moore School of Electrical Engineering, University of Pennsylvania, Philadelphia, PA, USA

The MPEG-1 audio standard (ISO/IEC 11172-3) establishes guidelines for the compression of high-quality digital audio signals [1]. The standard dictates the function of an encoder/decoder pair (codec), leaving form intentionally vague to allow for competing implementations. A typical approach to real-time operation is to design an application-specific integrated circuit (ASIC) dedicated to encoding, decoding [2], or both. We present an alternative codec that makes use of the general-purpose digital signal processing (DSP) chips that are now common in multimedia-capable workstations and personal computers. We discuss how selective optimization of codec structure allows robust performance using limited resources, highlight some of the problems inherent in translating the abstractions of the standard into assembly code, and point towards further investigations of real-time implementations of communications standards.

Keywords: audio coding; Decoding; Digital signal processing chips; Codecs; Transform coding; IEC standards; ISO standards; Code standards; Application specific integrated circuits; Communication standards

DRA Audio Coding Standard

Chinese Journal of Electronics, 2014, vol. 23, no. 3, pp. 521-526

Authors: MA Wenhua; XU Jing; MA Yuanzhe; YOU Yuli. Department of Computers, Cisco School of Informatics, Guangdong University of Foreign Studies; Department of Science and Technology, Guangdong Rising Assets Management Co.; Department of Biomedical Science and Technology, South China University of Technology; Guangdong Provincial Key Laboratory for Digital Audio Technology

The DRA (Dynamic Resolution Adaptation) audio coding standard was shown to deploy transient-localized MDCT to effectively suppress pre-echo artifacts and statistic allocation of codebooks to improve the compression efficiency of Huffman coding. Its quantizers and Huffman codebooks are designed in such a way that a signal path of 24 bits is provided throughout the codec so that high audio quality can be delivered if bit rate *** simple, it delivers state-of-the-art compression efficiency as shown by five rounds of ITU-R BS.1116 compliant subjective listening tests.

Keywords: audio coding; Standard; Listening test; Modified discrete cosine transform (MDCT); Huffman coding

Representations of the Complex-Valued Frequency-Domain LPC for Audio Coding

IEEE SIGNAL PROCESSING LETTERS, 2024, vol. 31, pp. 361-365

Authors: Jo, Byeongho; Beack, Seungkwon. ETRI, Daejeon 34129, South Korea

For decades, linear predictive (LP) analysis of the real-valued time-domain signals has been developed for speech coding, analysis, and synthesis. Recently, the complex-valued frequency-domain LP coding was developed to enhance the estimation performance of the temporal envelope. To apply it for audio coding, a suitable representation for the complex-valued frequency-domain LP coefficients (CLPC) is required before quantization, but there was no efficient way to represent it. To address the problem, we propose efficient CLPC representations that retain some useful properties of conventional LPC representations. Through quantitative and qualitative evaluations, we demonstrate that our proposed representations increase quantization efficiency and improve audio coding performance.

Keywords: Quantization (signal); audio coding; Time-domain analysis; Transient analysis; Training; Time-frequency analysis; Discrete Fourier transforms; Immittance spectral pair; linear prediction; line spectral pair; LPC representations

Union of MDCT Bases for Audio Coding

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, vol. 16, no. 8, pp. 1361-1372

Authors: Ravelli, Emmanuel; Richard, Gaël; Daudet, Laurent. Univ Paris 06, Inst Jean le Rond Alembert, LAM, F-75015 Paris, France; GET ENST, Télécom Paris, TSI Dept, F-75014 Paris, France

This paper investigates the use of sparse overcomplete decompositions for audio coding. Audio signals are decomposed over a redundant union of modified discrete cosine transform (MDCT) bases having eight different scales. This approach produces a sparser decomposition than the traditional MDCT-based orthogonal transform and allows better coding efficiency at low bitrates. Contrary to state-of-the-art low bitrate coders, which are based on pure parametric or hybrid representations, our approach is able to provide transparency. Moreover, we use a bitplane encoding approach, which provides a fine-grain scalable coder that can seamlessly operate from very low bitrates up to transparency. Objective evaluation, as well as listening tests, show that the performance of our coder is significantly better than a state-of-the-art transform coder at very low bitrates and has similar performance at high bitrates. We provide a link to test soundfiles and source code to allow better evaluation and reproducibility of the results.

Keywords: audio coding; matching pursuit; scalable coding; signal representations; sparse representations

A perceptual model for sinusoidal audio coding based on spectral integration

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2005, vol. 2005, no. 9, pp. 1292-1304

Authors: van de Par, S; Kohlrausch, A; Heusdens, R; Jensen, J; Jensen, SH. Philips Res Labs, Digital Signal Proc Grp, NL-5656 AA Eindhoven, Netherlands; Eindhoven Univ Technol, Dept Technol Management, NL-5600 MB Eindhoven, Netherlands; Delft Univ Technol, Dept Mediamat, NL-2600 GA Delft, Netherlands; Aalborg Univ, Inst Electron Syst, Dept Commun Technol, DK-9220 Aalborg, Denmark

Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.

Keywords: audio coding; psychoacoustical modelling; auditory masking; spectral masking; sinusoidal modelling; psychoacoustical matching pursuit

A novel audio coding scheme using warped linear prediction model and the discrete wavelet transform

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, vol. 14, no. 6, pp. 2039-2048

Authors: Deriche, Mohamed; Ning, Daryl. King Fahd Univ Petr & Minerals, Dept Elect Engn, Dhahran 31261, Saudi Arabia; Mathworks, Sydney, NSW, Australia

In this paper, we present a novel audio coder using the discrete wavelet transform (DWT) and warped linear prediction (WLP). In contrast to conventional LP, WLP allows for the control of frequency resolution to closely match the response of the human auditory system. The structure of the system is similar to the transform coded excitation techniques used in wideband speech coding, where LP has been replaced with WLP, and the residual is analyzed by a wavelet filterbank designed to approximate the critical bands. The inherent shaping of the WLP synthesis filter and a controlled bit allocation to the wavelet coefficients help minimise the perceptually significant noise due to the quantization error in the residual. For monophonic signals sampled at 44.1 kHz, the coder achieves near transparent to transparent quality for a variety of speech and music signals at an average bitrate of about 64 kb/s. Tests also show that the coder (in its initial implementation) delivers superior quality to the MPEG layer III and comparable quality to the MPEG2-AAC codec when operating at the same bitrate.

Keywords: audio coding; subband coding; transform coding; warped linear prediction

Analysis by synthesis spatial audio coding

IET SIGNAL PROCESSING, 2014, vol. 8, no. 1, pp. 30-38

Authors: Elfitri, Ikhwana; Shi, Xiyu; Kondoz, Ahmet. Univ Surrey, Lab Multimedia Commun Res, Guildford GU2 7XH, Surrey, England; Andalas Univ, Fac Engn, Dept Elect Engn, Padang, Indonesia; Mulsys Ltd, Guildford, Surrey, England

This study presents a novel spatial audio coding (SAC) technique, called analysis by synthesis SAC (AbS-SAC), with a capability of minimising signal distortion introduced during the encoding processes. The reverse one-to-two (R-OTT), a module applied in the MPEG Surround to down-mix two channels as a single channel, is first configured as a closed-loop system. This closed-loop module offers a capability to reduce the quantisation errors of the spatial parameters, leading to an improved quality of the synthesised audio signals. Moreover, a sub-optimal AbS optimisation, based on the closed-loop R-OTT module, is proposed. This algorithm addresses a problem of practicality in implementing an optimal AbS optimisation while it is still capable of improving further the quality of the reconstructed audio signals. In terms of algorithm complexity, the proposed sub-optimal algorithm provides scalability. The results of objective and subjective tests are presented. It is shown that significant improvement of the objective performance, when compared to the conventional open-loop approach, is achieved. On the other hand, subjective tests show that the proposed technique achieves higher subjective difference grade scores than the tested advanced audio coding multichannel.

Keywords: single-channel; Signal distortion; Subjective testing; Algorithmic complexity; audio coding; audio signals
