Search results - Inner Mongolia University Library

International Conference on Information, Communications and Signal Processing

Authors: K.P. Padhi, S. Kumar, S. George (STMicroelectronics Private Limited, Singapore)

Compression algorithms face a constant tradeoff: higher compression ratios come at the cost of quality. The number of bits assigned in the standard MPEG encoders is controlled by the signal-to-masking thresholds and the scalefactor calculations performed in the psychoacoustic model of the algorithm. The developed algorithm assigns fewer bits to audio samples without significant degradation in quality.

Keywords: Bit rate; Decoding; Frequency; Humans; Transform coding; Psychoacoustic models; Psychology; Audio coding; Auditory system; Microelectronics

Phase-based note onset detection for music signals

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: J.P. Bello, M. Sandler (Department of Electronic Engineering, Queen Mary University of London, London, UK)

Note onsets mark the beginning of attack transients, short areas of a note containing rapid changes of the signal spectral content. Detecting onsets is not trivial, especially when analysing complex mixtures. Applications for note onset detection systems include time stretching, audio coding and synthesis. An alternative to standard energy-based onset detection is proposed by using phase information. It is suggested that by observing the frame-by-frame distribution of differential angles, the precise moment when onsets occur can be detected with accuracy. Statistical measures are used to build the detection function. The system is tested and tuned on a database of complex recordings.
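To illustrate the idea in the abstract above, here is a minimal sketch of a phase-based detection function. It is not the authors' implementation. It assumes a Hann-windowed DFT, treats the wrapped second difference of each bin's phase across frames as the "differential angle", and uses the mean absolute value over the strongest bins as the statistical measure. The function names, frame length, hop size and magnitude threshold are assumptions chosen for the example.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]


def stft(signal, frame_len, hop):
    """Naive windowed DFT per frame (positive-frequency bins only)."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + t] * window[t] for t in range(frame_len)]
        frames.append([
            sum(x * cmath.exp(-2j * math.pi * k * t / frame_len)
                for t, x in enumerate(seg))
            for k in range(frame_len // 2)
        ])
    return frames


def wrap(angle):
    """Map an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def phase_deviation(signal, frame_len=64, hop=16, rel_threshold=0.2):
    """Mean absolute second phase difference over the strongest bins.

    A steady sinusoid advances its phase by a constant amount per frame,
    so the second difference is close to zero; an onset breaks that pattern.
    The magnitude threshold (an assumption of this sketch) ignores bins
    whose phase is dominated by leakage or noise.
    """
    spec = stft(signal, frame_len, hop)
    phase = [[cmath.phase(c) for c in frame] for frame in spec]
    df = [0.0, 0.0]  # the first two frames have no second difference
    for m in range(2, len(spec)):
        mags = [abs(c) for c in spec[m]]
        floor = rel_threshold * max(mags)
        devs = [
            abs(wrap(phase[m][k] - 2 * phase[m - 1][k] + phase[m - 2][k]))
            for k in range(len(mags)) if mags[k] > floor
        ]
        df.append(sum(devs) / len(devs) if devs else 0.0)
    return df
```

On a steady sinusoid the function stays close to zero, and it rises in the frames where a second tone enters, so an onset can be picked as a peak of the returned sequence.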

Keywords: Phase detection; Multiple signal classification; Signal detection; Steady-state; Audio coding; Statistical analysis; Music; Signal synthesis; System testing; Databases

Quantifying perceptual distortion in scalably compressed MPEG audio

Asilomar Conference on Signals, Systems & Computers

Author: C.D. Creusere (Klipsch School of Electrical & Computer Engineering, New Mexico State University, USA)

A scalably compressed bitstream is one which can be streamed and decoded at a wide variety of bitrates, and it is therefore compatible with communications channels of varying capacity. The audio coding portions of the MPEG 2 and 4 standards support fine-grained scalability through the use of bit slice arithmetic coding (BSAC). Human subjective analysis of BSAC, however, has shown that it performs poorly at low bitrates; seemingly random tonal patterns are superimposed on the actual audio. Here, we develop a new approach for objectively characterizing such distortion and validate it with human subjective trials. Unlike most other objective performance metrics, the proposed approach does not require sample-accurate sequence synchronization. As a comparison, we also apply the ITU-R BS.1387-1 objective testing recommendation to the same audio sequences and quantify how well it predicts the observed subjective quality.

Keywords: Transform coding; Bit rate; Humans; Streaming media; Decoding; Communication channels; Channel capacity; Audio coding; Scalability; Arithmetic

Speed-change resistant audio fingerprinting using auto-correlation

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: J. Haitsma, T. Kalker (Philips Research Laboratories, Eindhoven, Netherlands)

At ISMIR 2002 and CBMI 2001, the authors presented a new approach to audio fingerprinting (Haitsma, J. and Kalker, T., Proc. Int. Conf. on Music Information Retrieval, p.107-15, 2002; Haitsma et al., Proc. Int. Workshop on Content-Based Multimedia Indexing, p.117-25, 2001). The proposed scheme, which we refer to as the streaming audio fingerprinting (SAF) system, allows a very efficient database lookup and is also very robust against many different audio processing steps, including low bit rate audio coding, noise addition and amplitude compression. However it is not inherently robust against large linear speed changes (i.e. speed changes larger than 2%) where both the pitch and the tempo change. This is a potential problem, because some radio stations speed up by a few percent. We discuss a modification of the originally proposed fingerprinting algorithm to make it robust against large linear speed changes. The proposed modification has negligible effect on other aspects, such as robustness and reliability.

Keywords: Fingerprint recognition; Autocorrelation; Streaming media; Noise robustness; Music information retrieval; Multimedia databases; Indexing; Audio databases; Bit rate; Audio coding

A scalable digital audio encoder based on embedded zerotree wavelet

International Conference on Communication Technology (ICCT)

Authors: Jianxin Yan, Zaiwang Dong, Weibei Dou (Department of Electronic Engineering, Tsinghua University, Beijing, China)

ISBN: (print) 7563506861

In this paper, a scalable audio scheme is presented, based mainly on embedded zerotree wavelet (EZW) coding. First, 29 critical subbands are obtained by splitting the audio signal with a digital wavelet packet transform (DPWT). Zerotree coding is then applied to these subbands. Finally, entropy coding is applied to remove redundancy, and a specific frame structure is formed. The resulting encoder supports a scalable bit stream from 16 kbps to 64 kbps in 4 kbps steps for a single audio channel, with graceful degradation of subjective audio quality.
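The scalability described above relies on an embedded bit stream, one that can be cut short and still decoded. The sketch below shows only the successive-approximation (bit-plane) part of that idea on integer coefficients. The zerotree symbols, the wavelet packet transform and the entropy coder from the abstract are deliberately left out, and the function names are hypothetical.

```python
def encode_bitplanes(coeffs, num_planes):
    """Split integer coefficients into sign flags and magnitude bit-planes.

    Planes are ordered from most to least significant, so a prefix of the
    list is a coarser description of the same coefficients.
    """
    signs = [c < 0 for c in coeffs]
    planes = [
        [(abs(c) >> p) & 1 for c in coeffs]
        for p in range(num_planes - 1, -1, -1)
    ]
    return signs, planes


def decode_bitplanes(signs, planes, num_planes):
    """Rebuild coefficients from however many leading planes were received."""
    mags = [0] * len(signs)
    for i, plane in enumerate(planes):
        p = num_planes - 1 - i  # bit position carried by this plane
        for j, bit in enumerate(plane):
            mags[j] |= bit << p
    return [-m if s else m for m, s in zip(mags, signs)]


signs, planes = encode_bitplanes([13, -6, 3, 0], num_planes=4)
full = decode_bitplanes(signs, planes, 4)        # all four planes received
coarse = decode_bitplanes(signs, planes[:2], 4)  # stream cut after two planes
```

Dropping the two least significant planes leaves each magnitude within 3 of its true value, which is the kind of graceful loss a truncated embedded stream gives.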

Keywords: Psychoacoustic models; Audio coding; Frequency; Packaging; Wavelet transforms; Streaming media; Psychology; Signal processing; Entropy coding; MPEG 4 Standard

Flexible frequency decompositions for cosine-modulated filter banks

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: O.A. Niamut, R. Heusdens (Department of Mediamatics, Delft University of Technology, Delft, Netherlands)

We investigate the use of nonuniform cosine-modulated filter banks for audio coding. A rate-distortion framework is employed, similar to the work in Herley et al. (1994), to select the filter bank structure from a large library of possible frequency decompositions. A new flexible frequency decomposition algorithm is proposed that jointly optimizes the filter bank structure and the bit allocation over the subband channels. Experimental results for both synthetic and real audio signals are provided. The new algorithm shows significant improvements in comparison with fixed uniform frequency decompositions, but special care has to be taken to reduce the size of the decomposition overhead.

Keywords: Channel bank filters; Filter bank; Audio coding; Time frequency analysis; Wavelet packets; Libraries; Bit rate; Rate-distortion; Radio spectrum management; Electronic mail

Binaural cue coding: A novel and efficient representation of spatial audio

IEEE International Conference on Acoustics, Speech, and Signal Processing

Authors: Faller, C; Baumgarte, F (Agere Syst, Media Signal Proc Res, Murray Hill, NJ, USA)

ISBN: (print) 0780374029

We present a novel concept for representing multi-channel audio signals: Binaural Cue coding (BCC). BCC aims at separating the basic audio content and the information relevant for spatial perception. A multi-channel audio signal is represented as a mono signal and BCC parameters. We present two types of applications of BCC. Firstly, a number of separate sound source signals are reduced to a mono signal and BCC parameters. In this case, the decoder has control over the location of each source in auditory space. In other words, the decoder can render spatial images as if the separate source signals were given. Secondly, a multi-channel audio signal is reduced to a mono signal and BCC parameters. In this case the decoder generates a multi-channel signal with a spatial image similar to the spatial image of the input signal of the encoder. Results from a subjective test suggest that BCC, combined with existing mono audio coders, offers better quality than conventional stereo and multi-channel perceptual transform audio coders for a wide range of bitrates.
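The "mono signal plus parameters" representation described above can be sketched for the stereo case. The abstract does not list the BCC parameters, so this example assumes a single cue, an inter-channel level difference in dB per block of samples, and a decoder that scales the mono signal so that the output downmixes back to it. The function names and the block length are hypothetical, and a real coder would work per frequency band.

```python
import math


def bcc_encode(left, right, block=4):
    """Downmix to mono and keep one level difference (dB) per block."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    eps = 1e-12  # avoids log of zero on silent blocks
    level_db = []
    for s in range(0, len(mono), block):
        p_left = sum(x * x for x in left[s:s + block])
        p_right = sum(x * x for x in right[s:s + block])
        level_db.append(10 * math.log10((p_left + eps) / (p_right + eps)))
    return mono, level_db


def bcc_decode(mono, level_db, block=4):
    """Re-pan the mono signal using the per-block level differences."""
    left, right = [], []
    for i, x in enumerate(mono):
        ratio = 10 ** (level_db[i // block] / 20)  # amplitude ratio left/right
        g_right = 2 / (1 + ratio)                  # chosen so (L + R) / 2 == mono
        g_left = ratio * g_right
        left.append(g_left * x)
        right.append(g_right * x)
    return left, right
```

For a single source that is amplitude-panned between the channels, this reduced description is enough to recover both channels. Sources that differ in time or phase between the channels need further cues that this sketch does not model.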

Keywords: audio coding; hearing; decoding; binaural cue coding; spatial audio images; multichannel audio signals; audio content; spatial perception information; mono signal; decoder

Audio coding based on rate-distortion and perceptual optimization

Conference on Wavelet Applications VII

Authors: Erne, M; Moschytz, G (Swiss Fed Inst Technol, Inst Signal & Informat Proc, CH-8092 Zurich, Switzerland)

ISBN: (print) 0819436828

The time-frequency tiling, bit allocation and quantizer of most perceptual coding algorithms are either fixed or controlled by a perceptual model. The large variety of existing audio signals, each with different coding requirements owing to its temporal and spectral fine structure, suggests using a signal-adaptive algorithm. The framework described in this paper uses a signal-adaptive wavelet filterbank in which any node of the wavelet-packet tree can be switched individually. Each subband can therefore have an individual time segmentation, and the overall time-frequency tiling can be adapted to the signal using optimization techniques. A rate-distortion optimality criterion can be defined that minimizes the distortion for a given rate in every subband, based on a perceptual model. Because the rate and distortion measures are additive over disjoint covers of the input signal, an overall cost function that includes the cost of filterbank switching can be defined. Using dynamic programming techniques, the wavelet-packet tree can be pruned based on a top-down or bottom-up "split-merge" decision in every node of the wavelet tree. Additionally, we can profit from temporal masking, because each subband can have an individual segmentation in time without introducing time-domain artifacts such as pre-echo distortion.
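The bottom-up "split-merge" decision can be sketched as a recursive pruning over a tree of subbands. This is an illustration under stated assumptions, not the paper's algorithm: each node is assumed to carry a distortion and a rate already measured for coding it unsplit, the two are combined into a Lagrangian cost D + lambda * R, and a fixed `split_cost` stands in for the filterbank switching cost. The `Node` class and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    distortion: float  # distortion if this subband is coded unsplit
    rate: float        # rate if this subband is coded unsplit
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, lam: float, split_cost: float = 0.0) -> Tuple[float, List[str]]:
    """Return the best cost for this subtree and the leaves that achieve it.

    Because the cost is additive over disjoint subbands, the best choice
    for each child can be made independently before deciding the parent.
    """
    merged = node.distortion + lam * node.rate
    if not node.children:
        return merged, [node.name]
    split = split_cost
    leaves: List[str] = []
    for child in node.children:
        cost, kept = prune(child, lam, split_cost)
        split += cost
        leaves += kept
    if split < merged:
        return split, leaves   # keep the split
    return merged, [node.name]  # merge the children back into this node


tree = Node("root", 10.0, 2.0, [
    Node("low", 2.0, 2.0, [Node("low-low", 1.5, 1.5), Node("low-high", 1.0, 1.5)]),
    Node("high", 3.0, 1.0),
])
best_cost, best_leaves = prune(tree, lam=1.0)
```

In this toy tree the root is worth splitting, but the further split of the low band is not. Raising `split_cost` makes the pruning fall back to the unsplit root.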

Keywords: MPEG; audio coding; perceptual model; wavelet; signal-adaptivity

Audio coding using a psychoacoustic pre- and post-filter

IEEE International Conference on Acoustics, Speech, and Signal Processing

Authors: Edler, B; Schuller, G (Lucent Technol, Bell Labs, Multimedia Commun Res Lab, Murray Hill, NJ, USA)

ISBN: (print) 0780362934

A novel concept for perceptual audio coding is presented which is based on the combination of a pre- and post-filter, controlled by a psychoacoustic model, with a transform coding scheme. This paradigm allows modeling of the temporal and spectral shape of the masked threshold with a resolution independent of the used transform. By using frequency warping techniques the maximum possible detail for a given filter order can be made frequency-dependent and thus better adapted to the human auditory system. The filter coefficients are represented efficiently by LSF parameters which can be adaptively interpolated over time. First experiments with a system obtained by extending an existing transform codec showed that this approach can significantly improve the performance for speech signals, while the performance for other signals remained the same.

Keywords: Audio coding; Post-filter; Psychoacoustic models; Psychoacoustics; Filter coefficient; Spectral shape

Multiple description perceptual audio coding with correlating transforms

IEEE Transactions on Speech and Audio Processing, 2000, vol. 8, no. 2, pp. 140-145

Authors: Arean, R; Kovacevic, J; Goyal, VK (Bell Labs, Lucent Technol, Murray Hill, NJ 07974, USA)

In audio communication over a lossy packet network, concealment techniques are used to mitigate the effects of lost packets. This concealment is markedly improved if the compressed representation retains redundancy to aid in the estimation of lost information. A perceptual audio coder employing multiple description correlating transforms demonstrates this phenomenon.
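The role of retained redundancy can be shown with a small two-description example. The sketch assumes the simplest correlating transform, a 45-degree rotation of two zero-mean, uncorrelated coefficients with known variances, and a linear estimate of whichever description is lost. The abstract does not specify the transform used in the coder, so the transform, the function names and the variances below are assumptions for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def correlate(x1, x2):
    """Rotate two coefficients by 45 degrees; each output is one description."""
    return (x1 + x2) / SQRT2, (x1 - x2) / SQRT2


def reconstruct(y1, y2, var1, var2):
    """Invert the rotation, estimating a lost description (None) from the other.

    For zero-mean, uncorrelated inputs with variances var1 and var2, the two
    descriptions have correlation coefficient rho, so the best linear estimate
    of a lost description is rho times the received one.
    """
    if y1 is None and y2 is None:
        raise ValueError("both descriptions lost")
    rho = (var1 - var2) / (var1 + var2)
    if y1 is None:
        y1 = rho * y2
    elif y2 is None:
        y2 = rho * y1
    return (y1 + y2) / SQRT2, (y1 - y2) / SQRT2


def expected_loss_mse(var1, var2):
    """Expected squared error when exactly one description is lost."""
    return 2.0 * var1 * var2 / (var1 + var2)
```

With variances 4 and 1, losing either description costs an expected squared error of 1.6. Sending the two coefficients uncorrelated would cost 4 or 1 depending on which packet is lost, so the worst case is much better here.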

Keywords: audio coding; multiple descriptions; packetized audio; robust communication
