Search results - Inner Mongolia University Library

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Authors: Eichenseer, Andrea; Korse, Srikanth; Fuchs, Guillaume; Multrus, Markus. Affiliations: Fraunhofer IIS, Erlangen, Germany; International Audio Laboratories Erlangen, Erlangen, Germany

ISBN: (print) 9798350368741

The recently standardized 3GPP codec for Immersive Voice and Audio Services (IVAS) includes a parametric mode for efficiently coding multiple audio objects at low bit rates. In this mode, parametric side information is obtained from both the object metadata and the input audio objects. The side information comprises directional information, indices of two dominant objects, and the power ratio between these two dominant objects. It is transmitted to the decoder along with a stereo downmix. In IVAS, parametric object coding allows for transmitting three or four arbitrarily placed objects at bit rates of 24.4 or 32 kbit/s and faithfully reconstructing the spatial image of the original audio scene. Subjective listening tests confirm that IVAS provides a comparable immersive experience at lower bit rate and complexity compared to coding the audio objects independently using Enhanced Voice Services (EVS). © 2025 IEEE.

Keywords: Audio coding; communication codec; Immersive Voice and Audio Services (IVAS); object-based audio; parametric coding


Efficient parametric coding of transients

IEEE Transactions on Audio, Speech, and Language Processing, 2006, vol. 14, no. 4, pp. 1340-1351

Authors: Christensen, Mads Graesboll; van de Par, Steven. Affiliation: Philips Res Labs, Digital Signal Proc Grp, Eindhoven, Netherlands

In this paper, methods for improved parametric coding of transients are presented. We propose a signal model for coding of transients consisting of a sum of sinusoids each being amplitude-modulated by a different gamma envelope. These envelopes are characterized by an onset time, an attack and a decay parameter. An efficient method for estimating these parameters is presented. Further, methods are proposed that combine this transient model with a constant-amplitude sinusoidal model in order to achieve efficient coding of both stationary and transient signal parts. By rate-distortion optimization using a perceptual distortion measure, we combine variable rate bit allocation and segmentation in an optimal way. Formal, as well as informal, listening tests show that significant improvements can be achieved with the proposed model as compared to a state-of-the-art sinusoidal coder by the combination of optimal segmentation and amplitude modulated sinusoidal audio coding.

Keywords: audio coding; parametric coding; quantization; rate distortion theory; signal representations
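
The transient model in the abstract above (a sum of sinusoids, each amplitude-modulated by its own gamma envelope with an onset time, an attack and a decay parameter) can be illustrated with a minimal sketch. The envelope formula (t - onset)^attack * exp(-decay * (t - onset)) is an assumed, commonly used gamma shape, not necessarily the exact parameterization of the paper, and the function names and dictionary keys are hypothetical.

```python
import math


def gamma_envelope(t, onset, attack, decay):
    """Assumed gamma envelope: zero up to `onset`, then u**attack * exp(-decay * u)."""
    if t <= onset:
        return 0.0
    u = t - onset
    return (u ** attack) * math.exp(-decay * u)


def transient(n_samples, fs, components):
    """Sum of sinusoids, each modulated by its own gamma envelope.

    `components` is a list of dicts with the hypothetical keys
    amp, freq_hz, phase, onset, attack, decay.
    """
    out = []
    for n in range(n_samples):
        t = n / fs
        s = 0.0
        for c in components:
            env = gamma_envelope(t, c["onset"], c["attack"], c["decay"])
            s += c["amp"] * env * math.cos(2 * math.pi * c["freq_hz"] * t + c["phase"])
        out.append(s)
    return out


# One 1 kHz component that starts 5 ms into a 20 ms frame sampled at 8 kHz.
frame = transient(160, 8000.0, [
    {"amp": 1.0, "freq_hz": 1000.0, "phase": 0.0,
     "onset": 0.005, "attack": 2.0, "decay": 400.0},
])
```

With this assumed shape the envelope peaks at attack / decay seconds after the onset, so the attack and decay parameters jointly set how fast the transient rises and dies away.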


Amplitude modulated sinusoidal signal decomposition for audio coding

IEEE Signal Processing Letters, 2006, vol. 13, no. 7, pp. 389-392

Authors: Christensen, Mads Graesboll; Jakobsson, Andreas; Andersen, Soren Vang; Jensen, Soren Holdt. Affiliations: Aalborg Univ, Dept Commun Technol, DK-9220 Aalborg, Denmark; Karlstad Univ, Dept Elect Engn, SE-65188 Karlstad, Sweden

In this letter, we present a decomposition for sinusoidal coding of audio, based on an amplitude modulation of sinusoids via a linear combination of arbitrary basis vectors. The proposed method, which incorporates a perceptual distortion measure, is based on a relaxation of a nonlinear least-squares minimization. Rate-distortion curves and listening tests show that, compared to a constant-amplitude sinusoidal coder, the proposed decomposition offers perceptually significant improvements in critical transient signals.

Keywords: audio coding; parametric coding; signal representations; sinusoidal


WAVENET BASED LOW RATE SPEECH CODING

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Kleijn, W. Bastiaan; Lim, Felicia S. C.; Luebs, Alejandro; Skoglund, Jan; Stimberg, Florian; Wang, Quan; Walters, Thomas C. Affiliations: Google Inc, San Francisco, CA 94107, USA; DeepMind, London, England; Victoria Univ Wellington, Wellington, New Zealand

ISBN: (print) 9781538646588

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative model and show that approximating the signal waveform incurs a large rate penalty. Our experiments confirm the high performance of the WaveNet based coder and show that the speech produced by the system is able to additionally perform implicit bandwidth extension and does not significantly impair recognition of the original speaker for the human listener, even when that speaker has not been used during the training of the generative model.

Keywords: Speech coding; parametric coding; WaveNet; generative model


Speech coding at very low bit-rates for mobile communication

9th Asia-Pacific Conference on Communications held in conjunction with the 6th Malaysia International Conference on Communications (MICC 2003)

Authors: Gandhi, AG; Dhekane, SS. Affiliation: Maharashtra Inst Technol, Dept Elect & Telecommun, Pune, Maharashtra, India

ISBN: (print) 0780381149

In this paper, we discuss the basic speech coding techniques, viz. waveform coding, parametric coding, and the quantization schemes, and review the 'Enhanced Waveform Interpolative coding' technique in detail. The EWI coding technique for low bit rates, with several enhancements like Analysis by Synthesis (AbS) optimization of Slowly Evolving Waveform (SEW), Rapidly Evolving Waveform (REW) parametrization, REW quantization, etc., proves to be very efficient for mobile communications. Also briefed are an enhanced Post-Filtering and a novel Pitch Search technique for speech enhancement. The subjective test results have indicated that the quality of the 2.8 kb/s EWI exceeds that of the G.723.1 at 5.3 kb/s. Based on the results, we conclude that speech coding at low bit rates, especially the EWI coder, has enormous vistas in future 4G mobile systems, Internet telephony, LEO systems, etc.

Keywords: parametric coding; waveform interpolative coding; speech enhancement; quantization; post-filtering; pitch search...


Parametric stereo coder with only MDCT domain computations

9th IEEE International Symposium on Signal Processing and Information Technology

Authors: Suresh, K.; Sreenivas, T. V. Affiliations: Coll Engn, ECE Dept, Thiruvananthapuram, Kerala, India; Indian Inst Sci, Dept Elect Commun Engn, Bangalore, Karnataka, India

ISBN: (print) 9781424459490

A parametric stereo coder in the MDCT domain is introduced in this work. Psychoacoustic modeling, parameter estimation and stereo synthesis are implemented in the MDCT domain. The encoder requires only MDCT domain computations and hence results in lower computational complexity. In addition, a low-complexity bit allocation algorithm is used for adaptive quantization of the MDCT coefficients. Perceptual evaluation shows that the performance of the proposed audio coder is comparable with that of the MPEG-2 AAC coder, at lower complexity.

Keywords: audio coding; perceptual coding; parametric coding


LPCNET: IMPROVING NEURAL SPEECH SYNTHESIS THROUGH LINEAR PREDICTION

44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Valin, Jean-Marc; Skoglund, Jan. Affiliations: Mozilla, Mountain View, CA 94041, USA; Google LLC, San Francisco, CA, USA

ISBN: (print) 9781479981311

Neural speech synthesis models have recently demonstrated the ability to synthesize high quality speech for text-to-speech and compression applications. These new models often require powerful GPUs to achieve real-time operation, so being able to reduce their complexity would open the way for many new applications. We propose LPCNet, a WaveRNN variant that combines linear prediction with recurrent neural networks to significantly improve the efficiency of speech synthesis. We demonstrate that LPCNet can achieve significantly higher quality than WaveRNN for the same network size and that high quality LPCNet speech synthesis is achievable with a complexity under 3 GFLOPS. This makes it easier to deploy neural synthesis applications on lower-power devices, such as embedded systems and mobile phones.

Keywords: neural audio synthesis; parametric coding; WaveRNN
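
The linear-prediction half of the idea in the abstract above can be sketched in a few lines. This is a generic illustration of linear prediction, not LPCNet itself: each sample is predicted from a weighted sum of past samples, and only the prediction error (the excitation) is left for a separate model such as a recurrent network to handle. The function names and the fixed coefficient list are assumptions made for the example.

```python
def lp_residual(signal, coeffs):
    """Excitation e[n] = s[n] - sum_k a_k * s[n-k], with zero history before n = 0."""
    residual = []
    for n, s in enumerate(signal):
        pred = sum(a * signal[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(s - pred)
    return residual


def lp_synthesize(residual, coeffs):
    """Inverse step: s[n] = e[n] + sum_k a_k * s[n-k], using already rebuilt samples."""
    out = []
    for n, e in enumerate(residual):
        pred = sum(a * out[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(e + pred)
    return out


# A decaying exponential is almost fully predicted by a first-order predictor.
signal = [0.9 ** n for n in range(10)]
excitation = lp_residual(signal, [0.9])
rebuilt = lp_synthesize(excitation, [0.9])
```

In this toy case the excitation is close to zero after the first sample, which is the general motivation: the predictor removes the predictable part of the signal, so whatever models the excitation has less to do.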


LOW COMPLEXITY TONALITY CONTROL IN THE INTELLIGENT GAP FILLING TOOL

41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Schmidt, Konstantin; Neukam, Christian. Affiliation: Fraunhofer Inst Integrated Circuits IIS, Wolfsmantel 33, D-91058 Erlangen, Germany

ISBN: (print) 9781479999880

When coding audio signals at low bitrates with a transform coder, the most prominent artifacts are spectral holes resulting from spectral lines being quantized to zero. State-of-the-art codecs circumvent this by Noise Filling [1] and Bandwidth Extension (BWE) [2]. Both methods have in common that they do not code parts of the waveform itself but code a coarse description of the signal. At the decoder side, a synthetic signal is generated and adjusted according to the coded parameters. The presented system, called Intelligent Gap Filling (IGF), is a combination of both methods. Spectral holes are filled with random noise or with copied decoded signal components from lower frequency regions. In the latter case, a control mechanism is required to adjust the tonality of the copied signal components to reach good audio quality. This paper describes the way of controlling the tonality of IGF. The presented approach is of low complexity and allows for selective application without producing additional algorithmic delay. IGF is part of the 3GPP standard Enhanced Voice Services (EVS) as well as MPEG-H standardized by the Moving Picture Experts Group (MPEG).

Keywords: audio coding; bandwidth extension; tonality; whitening; parametric coding; EVS; MPEG-H
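
The hole-filling step described in the abstract above (zeroed spectral lines replaced either by components copied from lower frequencies or by random noise) can be pictured with a toy sketch. It is not the IGF algorithm: the energy adjustment from coded parameters and the tonality control that the paper is about are left out, and the function name, the fixed `shift` and the `noise_level` parameter are all assumptions.

```python
import random


def fill_spectral_holes(coeffs, start_bin, shift, noise_level=0.0, seed=0):
    """Fill zero-valued bins at or above `start_bin` (toy illustration only).

    A hole at bin k takes the decoded coefficient `shift` bins below it.
    If that source bin is out of range or also zero, uniform noise in
    [-noise_level, noise_level] is used instead.
    """
    rng = random.Random(seed)
    out = list(coeffs)
    for k in range(start_bin, len(out)):
        if out[k] != 0.0:
            continue  # bin survived quantization, keep it
        src = k - shift
        if src >= 0 and coeffs[src] != 0.0:
            out[k] = coeffs[src]  # copy from the lower-frequency region
        else:
            out[k] = rng.uniform(-noise_level, noise_level)
    return out


decoded = [4.0, -3.0, 2.0, 1.0, 0.0, 0.5, 0.0, 0.0]
filled = fill_spectral_holes(decoded, start_bin=4, shift=4)
# filled == [4.0, -3.0, 2.0, 1.0, 4.0, 0.5, 2.0, 1.0]
```

Copying preserves whatever tonal structure the source region has, which is why the abstract says a tonality control is needed when the copied components do not suit the target band.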


WAVENET BASED LOW RATE SPEECH CODING

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: W. Bastiaan Kleijn; Felicia S. C. Lim; Alejandro Luebs; Jan Skoglund; Florian Stimberg; Quan Wang; Thomas C. Walters. Affiliations: Google Inc., San Francisco, CA; DeepMind, London, UK

ISBN: (print) 9781538646595

Keywords: speech coding; parametric coding; WaveNet; generative model; speech encoder; bit stream; loudspeakers; signal waveform


Iterative Differential Evolution with Real Parameter Encoding

International Conference on Computation, Automation and Knowledge Management (ICCAKM)

Authors: Ashish Tripathi; Arun Kumar Singh; Amit Kumar Sirohi; Prem Chand Vashist. Affiliation: G. L. Bajaj Institute of Technology and Management, Greater Noida, India

ISBN: (digital) 9781728106663

ISBN: (print) 9781728106670

Evolutionary algorithms are a sub-discipline of artificial intelligence used to solve various real-world problems. These algorithms are based on the Darwinian principle of evolution, hence the name evolutionary algorithm. The Differential Evolution (DE) algorithm is an evolutionary algorithm used for optimization, mostly of real-valued functions. It starts from random solutions and creates new solutions from the previous or existing solutions. This population-based algorithm applies three operators, namely selection, crossover, and mutation. In this work, a new strategy has been developed to improve the performance of the basic DE algorithm. The resulting performance is also compared with that of other optimization algorithms, which shows that the modified DE performs better than the other existing algorithms.

Keywords: artificial intelligence; evolutionary computation; mathematics computing; evolutionary algorithm; parametric coding; algorithms; optimization algorithms; MT evolution
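
The three operators named in the abstract above (mutation, crossover, selection) can be seen in a minimal sketch of the basic DE scheme, here the classic DE/rand/1/bin variant. This is the textbook baseline only, not the modified strategy proposed in the paper, which the abstract does not describe. The parameter names and default values (`pop_size`, `F`, `CR`, `generations`) are assumptions chosen for the example.

```python
import random


def differential_evolution(f, bounds, pop_size=20, F=0.5, CR=0.9,
                           generations=300, seed=0):
    """Minimize f over box `bounds` with basic DE/rand/1/bin (requires pop_size >= 4)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                # Binomial crossover between the mutant and the current individual.
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clamp to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            # Selection: keep the trial only if it is no worse.
            trial_cost = f(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    return sum(v * v for v in x)


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

The greedy selection step means the best cost in the population never increases from one generation to the next, which is the property that modified DE variants typically try to speed up.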
