Search results - Inner Mongolia University Library

Design of orthogonal filter banks using a multi-objective genetic algorithm for a speech coding scheme

ALEXANDRIA ENGINEERING JOURNAL, 2022, Vol. 61, No. 10, pp. 7649-7657

Authors: Boukhobza, Abdelkader; Taleb, Nasreddine; Taleb-Ahmed, Abdelmalik; Bounoua, Abdennacer (Hassiba Benbouali Univ, Dept Elect, Chlef 02000, Algeria; Univ Djillali Liabes Sidi Bel Abbes, RCAM Lab, Sidi Bel Abbes 22000, Algeria; UPHF, IEMN, UMR CNRS 8520, Valenciennes, France)

In this work, we propose an optimization scheme based on a multi-objective Genetic Algorithm (GA) for the design of orthogonal filter banks for speech compression. A parameterization is adopted to ensure that the resulting filter banks satisfy perfect reconstruction and have at least two vanishing moments. We search for a parameter set that optimizes the coding gain and the frequency selectivity. As the objectives are conflicting, we seek the solution that realizes the best compromise between the objective criteria using the Non-dominated Sorting Genetic Algorithm (NSGAIII). Experimental results show that the optimized filter banks provide a significant gain in coding performance compared with the Daubechies orthogonal filter banks for test speech signals.

Keywords: Discrete wavelet transform; Orthogonal filter banks; Speech coding; Optimized wavelet; NSGAIII algorithm
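The two constraints named in the abstract above (perfect reconstruction and at least two vanishing moments) can be checked numerically for a candidate filter. The sketch below is only an illustration, not the paper's parameterization or its NSGAIII search: it tests the 4-tap Daubechies low-pass filter, the baseline family the abstract compares against. The function names are hypothetical, and the high-pass filter is assumed to be derived by the usual alternating flip.

```python
import math

def daubechies4():
    """Low-pass coefficients of the 4-tap Daubechies orthogonal filter."""
    s3 = math.sqrt(3.0)
    scale = 4.0 * math.sqrt(2.0)
    return [(1 + s3) / scale, (3 + s3) / scale, (3 - s3) / scale, (1 - s3) / scale]

def highpass_from_lowpass(h):
    """Alternating flip (assumed convention): g[n] = (-1)^n * h[L-1-n]."""
    length = len(h)
    return [((-1) ** n) * h[length - 1 - n] for n in range(length)]

def is_orthogonal(h, tol=1e-12):
    """Check sum_n h[n] * h[n + 2k] == delta[k] (double-shift orthonormality)."""
    length = len(h)
    for k in range(length // 2):
        inner = sum(h[n] * h[n + 2 * k] for n in range(length - 2 * k))
        target = 1.0 if k == 0 else 0.0
        if abs(inner - target) > tol:
            return False
    return True

def vanishing_moments(h, tol=1e-10):
    """Count p such that sum_n n^m * g[n] == 0 for every m < p."""
    g = highpass_from_lowpass(h)
    p = 0
    while p < len(g) and abs(sum((n ** p) * g[n] for n in range(len(g)))) < tol:
        p += 1
    return p

if __name__ == "__main__":
    h = daubechies4()
    print("orthogonal:", is_orthogonal(h))
    print("vanishing moments:", vanishing_moments(h))
```

A design loop of the kind described in the abstract would presumably evaluate coding gain and frequency selectivity only for filters that pass both checks; how the paper enforces them through its parameterization is not stated in this record.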

Scalable Speech Coding for IP Networks: Beyond iLBC

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, Vol. 21, No. 11, pp. 2337-2345

Authors: Seto, Koji; Ogunfunmi, Tokunbo (Santa Clara Univ, Dept Elect Engn, Santa Clara, CA 95053, USA)

High quality speech at low bit rates makes code excited linear prediction (CELP) the dominant choice for a narrowband coding technique despite its susceptibility to packet loss. One of the few techniques that received attention after the introduction of the CELP coding technique is the internet low bitrate codec (iLBC), because of its inherent high robustness to packet loss. The addition of rate flexibility and scalability makes the iLBC an attractive choice for voice communication over IP networks. In this paper, performance improvement schemes for multi-rate iLBC and its scalable structure are proposed, and the proposed codec, enhanced from the previous work, is re-designed based on subjective listening quality instead of objective quality. In particular, perceptual weighting and the modified discrete cosine transform (MDCT) with short overlap in the weighted signal domain are employed along with an improved packet loss concealment (PLC) algorithm. The subjective evaluation results show that the speech quality of the proposed codec is equivalent to that of the state-of-the-art codec G.718 under both clean and lossy channel conditions. This result is significant considering that development of the proposed codec is still at an early stage.

Keywords: Discrete cosine transform (DCT); internet low bitrate codec (iLBC); packet loss; scalable coding; speech coding; voice over Internet protocol (VoIP)

A NEW ARTIFICIAL SPEECH SIGNAL FOR OBJECTIVE QUALITY EVALUATION OF SPEECH CODING SYSTEMS

IEEE TRANSACTIONS ON COMMUNICATIONS, 1994, Vol. 42, No. 2-4, pp. 664-672

Authors: ITOH, K; KITAWAKI, N; IRII, H; NAGABUCHI, H (NIPPON TELEGRAPH & TEL PUBL CORP, TELECOMMUN NETWORKS LABS, TOKYO 190, JAPAN)

This paper describes a new artificial speech signal (ASVQ: Artificial speech by Vector Quantization technique) which reflects the average characteristics of the human voice. The ASVQ is intended for use as a test signal in the objective evaluation of speech coding system quality. To obtain the average characteristics, a very large speech data base is analyzed. The ASVQ generation method which reflects the extracted average characteristics of the human voice is formulated. This method applies vector quantizing analysis to the speech data base. The LPC speech synthesis circuit is used to reproduce the average characteristics. Finally, the new artificial speech signal is compared with a human voice and the estimation accuracy of the subjective quality of speech coding systems and nonlinear distortions is evaluated.

Keywords: Speech analysis; Human voice; Speech coding; Vector quantization; System testing; Data analysis; Character generation; Data mining; Linear predictive coding; Speech synthesis

Distortion measures for speech coding

IETE TECHNICAL REVIEW, 1998, Vol. 15, No. 4, pp. 251-258

Author: De, A (NORTEL, Bell No Res, Montreal, PQ H3E 1H6, Canada)

In this article, we review some of the existing subjective and objective measures used in the area of speech coding. The mean opinion score and the diagnostic acceptability measure are two of the most widely used subjective measures. The most popular class of time-domain measures is the signal-to-noise ratio (SNR) with its variants, such as the segmental SNR and the granular segmental SNR. Among the spectral distortion measures, the log likelihood ratio measure, the log area ratio measure, the log spectral distortion measure, the cepstral distance and the Itakura-Saito distortion measure are quite well known. Some of the more recently proposed objective measures place emphasis on perceptually significant aspects. Three such classes of psychoacoustically motivated measures are the information index, the Bark spectral distortion measure and the neural distance measures (e.g., the cochlear discrimination information and the cochlear hidden Markovian measures). The merit of considering important perceptual events is evident in the success of these measures.

Keywords: Speech coding
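To make the difference between the plain SNR and the segmental SNR mentioned in the abstract above concrete, here is a minimal sketch under stated assumptions. The function names and the 160-sample default frame (20 ms at an assumed 8 kHz sampling rate) are illustrative choices, not taken from the article, and frames with zero signal or zero noise energy are simply skipped, which is a simplification.

```python
import math

def snr_db(reference, coded):
    """Overall SNR in dB: total signal energy over total error energy."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, coded))
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / noise)

def segmental_snr_db(reference, coded, frame_len=160):
    """Mean of per-frame SNRs in dB over non-overlapping frames."""
    values = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        cod = coded[start:start + frame_len]
        signal = sum(x * x for x in ref)
        noise = sum((x - y) ** 2 for x, y in zip(ref, cod))
        if signal == 0.0 or noise == 0.0:
            continue  # simplification: ignore silent or error-free frames
        values.append(10.0 * math.log10(signal / noise))
    if not values:
        raise ValueError("no usable frames")
    return sum(values) / len(values)

if __name__ == "__main__":
    # A quiet frame followed by a loud frame, both with the same error level.
    reference = [1.0] * 4 + [10.0] * 4
    coded = [x + 0.1 for x in reference]
    print("SNR (dB):", snr_db(reference, coded))
    print("segmental SNR (dB):", segmental_snr_db(reference, coded, frame_len=4))
```

In this toy input the loud frame dominates the overall SNR, while the segmental figure weights the quiet frame equally, which is the usual motivation for the segmental variant.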

HADAMARD-TRANSFORMATION TECHNIQUE OF SPEECH CODING - SOME FURTHER RESULTS

PROCEEDINGS OF THE INSTITUTION OF ELECTRICAL ENGINEERS-LONDON, 1977, Vol. 124, No. 10, pp. 845-852

Authors: FRANGOULIS, E; TURNER, LF (UNIV LONDON, IMPERIAL COLL SCI & TECHNOL, DEPT ELECT ENGN, LONDON SW7 2BT, ENGLAND)

The results of an extensive investigation of the properties of 64-point Hadamard transformed speech are presented. Detailed information is given about the probability density functions of the Hadamard coefficients, the average power-density spectrum in the Hadamard domain and the logical-autocorrelation function. The results indicate that good-quality speech can be reconstructed from 6 to 8 dominant Hadamard coefficients, but that the use of fewer coefficients is unlikely to lead to the reconstruction of speech of acceptable quality. The results of a preliminary series of listening tests are presented and these confirm conclusions drawn from the statistical properties of the transformed speech. It is shown that the number of bits needed for coefficient labelling constitutes a significant proportion of the total number of bits needed to represent Hadamard transformed speech. A technique is presented for reducing by more than 50% the number of labelling bits needed, and it is explained how, by using this technique, it should be possible to obtain good quality speech when using a transmission bit rate of 8 kbit/s.

Keywords: digital communication systems; Hadamard transformations; Hadamard coefficients; transforms; Hadamard transformed speech; probability density functions; speech coding; voice communication; listening tests; Algebra; coefficient labelling; Codes; encoding
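The idea in the abstract above, keeping only a few dominant coefficients of a 64-point Hadamard transform and paying extra bits to label which ones were kept, can be sketched as follows. This is an illustration under assumptions, not the authors' coder: the function names are hypothetical, the transform is the unnormalized fast Walsh-Hadamard transform in natural order, and `labelling_bits` assumes the naive scheme of sending each retained index separately (the paper's reduced-labelling technique is not described in this record).

```python
import math

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform; len(x) must be a power of two."""
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def keep_dominant(coeffs, k):
    """Zero all but the k largest-magnitude coefficients; also return their indices."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:k])
    pruned = [c if i in kept else 0.0 for i, c in enumerate(coeffs)]
    return pruned, sorted(kept)

def reconstruct(coeffs):
    """Inverse transform: the Hadamard matrix satisfies H * H = n * I."""
    n = len(coeffs)
    return [v / n for v in fwht(coeffs)]

def labelling_bits(n, k):
    """Bits to identify k retained positions when each index is sent on its own."""
    return k * math.ceil(math.log2(n))

if __name__ == "__main__":
    block = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
    pruned, indices = keep_dominant(fwht(block), 8)
    approx = reconstruct(pruned)
    error = sum((a - b) ** 2 for a, b in zip(block, approx))
    print("kept indices:", indices)
    print("squared error:", error)
    print("naive labelling bits per block:", labelling_bits(64, 8))
```

With 64 coefficients, each retained index costs 6 bits under this naive scheme, which illustrates why the abstract treats labelling overhead as a significant share of the bit budget.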

Low Bit-Rate Speech Coding Through Quantization of Mel-Frequency Cepstral Coefficients

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, Vol. 20, No. 2, pp. 610-619

Authors: Boucheron, Laura E.; De Leon, Phillip L.; Sandoval, Steven (New Mexico State Univ, Klipsch Sch Elect & Comp Engn, Las Cruces, NM 88003, USA)

In this paper, we propose a low bit-rate speech codec based on vector quantization (VQ) of the mel-frequency cepstral coefficients (MFCCs). We begin by showing that if a high-resolution mel-frequency cepstrum (MFC) is computed, good-quality speech reconstruction is possible from the MFCCs despite the lack of phase information. By evaluating the contribution toward speech quality that individual MFCCs make and applying appropriate quantization, our results show that the MFCC-based codec exceeds the state-of-the-art MELPe codec across the entire range of 600-2400 bps, when evaluated with the perceptual evaluation of speech quality (PESQ) (ITU-T recommendation P.862). The main advantage of the proposed codec is in distributed speech recognition (DSR), since the MFCCs can be directly applied, thus eliminating additional decoding and feature extraction stages; furthermore, the proposed codec better preserves the fidelity of MFCCs and yields better word accuracy rates as compared to CELP and MELPe codecs.

Keywords: Speech analysis; Speech coding

An Efficient FPGA-Based Accelerator for Perceptual Weighting Filter in Speech Coding

IETE TECHNICAL REVIEW, 2024, Vol. 41, No. 4, pp. 441-453

Authors: Singh, Dilip; Chandel, Rajeevan (Natl Inst Technol, Dept Elect & Commun Engn, Hamirpur 177005, Himachal Pradesh, India)

In speech coding, denoising of the speech signal is essential as well as crucial. The filters for minimizing errors through denoising employ the autoregressive moving average (ARMA) approach, introducing higher computational complexity in speech coder design. This research work presents the design and implementation of an effective perceptual weighting filter (PWF) for speech coding. The high-level synthesis of the fixed-point PWF filter is optimized by multiple optimization techniques along with detailed design space exploration using the weighted sum (WS) method. To enhance the performance, an FPGA-based hardware accelerator is proposed using hardware/software (HW/SW) co-design in an embedded environment. Simulative analysis in Vivado HLS and final accelerator design in the Vitis IDE tool validate the proposed architecture by using real-time speech samples, demonstrating a 50% reduction in area and a 99% execution improvement. This makes it well-suited for use in modern speech codecs, enhancing the efficiency.

Keywords: FPGA; hardware acceleration; HLS synthesis; IIR filter; perceptual weighting filter; speech coding; Xilinx Zynq ZYBO

THE DESIGN OF A HYBRID ADAPTIVE QUANTIZER FOR SPEECH CODING APPLICATIONS

IEEE TRANSACTIONS ON COMMUNICATIONS, 1988, Vol. 36, No. 11, pp. 1193-1199

Authors: HALL, SC; BRADLOW, HS (UNIV CAPE TOWN, DEPT ELECT & ELECTR ENGN, CAPE TOWN, SOUTH AFRICA)

A new adaptive quantizer which uses a combination of instantaneous and syllabic adaptation is presented for use in speech codecs. It can be designed to adapt to changes in the mean, variance, and pdf shape of its input signal, and to quantize the signal using one or more bits/sample. It is therefore called the generalized hybrid adaptive quantizer (GHAQ). An efficient procedure for optimizing the GHAQ using a training sequence of signal samples is described, and the effects on the performance of the GHAQ of varying the memory length and the syllabic compandor time constant are investigated. It is found that an optimized version of the two-bit GHAQ offers improved signal-to-noise ratio over Jayant's adaptive quantizer with a one-word memory when it is used in a predictive speech codec with a zero-, first-, or second-order fixed predictor.

Keywords: Speech coding; Speech codecs; Shape; Delta modulation; Quantization; Cities and towns; Optimization methods; Signal design; Signal to noise ratio; Sampling methods
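The abstract above benchmarks the GHAQ against Jayant's adaptive quantizer with a one-word memory. As a rough illustration of that baseline only (not of the GHAQ), the sketch below adapts the step size of a two-bit mid-rise quantizer from the previous codeword alone, so the decoder can track the step without side information. The function names, the multipliers 0.8 and 1.6, the initial step and the step limits are illustrative assumptions, not values from the paper.

```python
def _next_step(step, level, multipliers, step_min, step_max):
    """One-word memory: the next step depends only on the latest magnitude level."""
    return min(step_max, max(step_min, step * multipliers[level]))

def jayant_encode(samples, step0=0.1, multipliers=(0.8, 1.6),
                  step_min=1e-4, step_max=10.0):
    """Two-bit mid-rise quantizer: one sign bit plus an inner/outer magnitude bit."""
    step = step0
    codes = []
    for x in samples:
        sign = 1 if x >= 0 else -1
        level = 0 if abs(x) < step else 1  # decision threshold at one step
        codes.append((sign, level))
        step = _next_step(step, level, multipliers, step_min, step_max)
    return codes

def jayant_decode(codes, step0=0.1, multipliers=(0.8, 1.6),
                  step_min=1e-4, step_max=10.0):
    """Rebuild samples; the step is re-derived from the received codewords."""
    step = step0
    out = []
    for sign, level in codes:
        out.append(sign * (level + 0.5) * step)  # outputs at 0.5 and 1.5 steps
        step = _next_step(step, level, multipliers, step_min, step_max)
    return out

if __name__ == "__main__":
    samples = [0.02, 0.05, 0.4, 0.9, 1.0, 0.3, -0.1, -0.02]
    codes = jayant_encode(samples)
    print(codes)
    print(jayant_decode(codes))
```

The step expands after an outer-level codeword and contracts after an inner-level one; the hybrid scheme in the abstract additionally uses syllabic adaptation, which this sketch does not attempt.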

LPC SPEECH CODING BASED ON VARIABLE-LENGTH SEGMENT QUANTIZATION

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1988, Vol. 36, No. 9, pp. 1437-1444

Authors: SHIRAKI, Y; HONDA, M (Speech Research Group of the Basic Research Laboratories, NTT, Musashino, Tokyo, Japan)

A low-bit-rate linear predictive coder (LPC) that is based on variable-length segment quantization is presented. In this vocoder, the speech spectral-parameter sequence is represented as the concatenation of variable-length spectral segments generated by linearly time-warping fixed-length code segments. Both the sequence of code segments and the segment lengths are efficiently determined using a dynamic programming procedure. This procedure minimizes the spectral distance measured between the original and the coded spectral sequence in a given interval. An iterative algorithm is developed for designing fixed-length code segments for the training spectral sequence. It updates the segment boundaries of the training spectral sequence using an a priori codebook and updates the codebook using these segment sequences. The convergence of this algorithm is discussed theoretically and experimentally. In experiments, the performance of variable-length segment quantization for voice coding is compared to that of fixed-length segment quantization and vector quantization.

Keywords: Linear predictive coding; Speech coding; Quantization; Algorithm design and analysis; Iterative algorithms; Vocoders; Speech processing; Signal processing algorithms; Bit rate; Convergence

Tree coding combined with TDHS for speech coding at 6.4 and 4.8 kbps

SPEECH COMMUNICATION, 1999, Vol. 29, No. 1, pp. 23-37

Authors: Lee, I; Gibson, JD (Chungbuk Natl Univ, Sch Elect & Elect Engn, Cheongju 361763, Chungbuk, South Korea; So Methodist Univ, Dept Elect Engn, Dallas, TX 75275, USA)

Tree coding is combined with time domain harmonic scaling (TDHS) for speech coding at 6.4 and 4.8 kbps. In order to improve the robustness to channel errors, new pitch predictor, short-term predictor adaptation and gain adaptation methods are proposed for the tree coder. New code trees with appropriate gain adaptation rules, a new backward adaptive pitch predictor and robust short-term predictor adaptation algorithms are evaluated for both ideal and noisy channels. Paired comparison listening tests show that the 6.4 kbps coder (2-to-1 TDHS, 2 bits/sample tree coding) has speech quality equivalent to 6 bit log-PCM at a sampling rate of 6400 samples/s. (C) 1999 Elsevier Science B.V. All rights reserved.

Keywords: Speech coding; Speech analysis
