Search Results - Inner Mongolia University Library

International Conference on Spoken Language, ICSLP

Authors: C.J. Long; S. Datta. Department of Electronic and Electrical Engineering, Loughborough University of Technology, Loughborough, UK

In an effort to provide a more efficient representation of the acoustical speech signal in the pre-classification stage of a speech recognition system, we consider the application of the Best-Basis Algorithm of R.R. Coifman and M.L. Wickerhauser (1992). This combines the advantages of using a smooth, compactly supported wavelet basis with an adaptive time-scale analysis, dependent on the problem at hand. We start by briefly reviewing areas within speech recognition where the wavelet transform has been applied with some success. Examples include pitch detection, formant tracking, and phoneme classification. Finally, our wavelet-based feature extraction system is described and its performance on a simple phonetic classification problem is given.

Keywords: Feature extraction; Multiresolution analysis; Continuous wavelet transforms; Discrete wavelet transforms; Speech recognition; Wavelet analysis; Wavelet transforms; Time frequency analysis; Signal analysis; Linear predictive coding

A MIXED EXCITATION LPC VOCODER MODEL FOR LOW BIT-RATE SPEECH CODING

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, vol. 3, no. 4, pp. 242-250

Authors: MCCREE, AV; BARNWELL, TP. GEORGIA INST TECHNOL, SCH ELECT ENGN, ATLANTA, GA 30332

Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech. These vocoders can produce intelligible speech at low data rates (800-2400 b/s), but they often sound synthetic and generate annoying artifacts such as buzzes, thumps, and tonal noises. These problems increase dramatically if acoustic background noise is present at the speech input. This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech. The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise. A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system. Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment. In fact, diagnostic acceptability measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder.

Keywords: Linear predictive coding; Vocoders; Bit rate; Speech coding; Parametric statistics; Humans; Acoustic noise; Background noise; Speech enhancement; Acoustic testing

EFFICIENT ADAPTIVE VECTOR QUANTIZATION OF LPC PARAMETERS

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, vol. 3, no. 4, pp. 314-317

Authors: FERRERBALLESTER, MA; FIGUEIRASVIDAL, AR. ETSI Telecomunicación, Universidad de Las Palmas de Gran Canaria, Las Palmas, Spain; GPSS-DSSR, ETSI Telecomunicación, Universidad Politécnica de Madrid, Madrid, Spain

This correspondence presents a new two-stage adaptive vector quantizer of LSF parameters in LPC speech coding. The first codebook is adapted by a partition-delete operation, whereas the code-vectors of the second codebook remain unchanged. The objective and subjective evaluations show that the proposed scheme offers transparent quantization with 22 b/frame.

Keywords: Vector quantization; Linear predictive coding; Speech coding; Frequency measurement; Speech analysis; Bit rate; Telecommunication standards; Weight measurement; Signal analysis; Proposals
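
The two-stage structure named in this abstract can be illustrated with a minimal Python sketch, assuming a plain residual arrangement: the first codebook quantizes the input vector and the second codebook quantizes what is left over. The function names and the toy codebooks are hypothetical. The partition-delete adaptation of the first codebook is not shown, because the abstract does not specify it.

```python
def nearest(codebook, v):
    """Index of the code-vector closest to v in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))


def two_stage_encode(v, cb1, cb2):
    """Return (i, j): stage-1 index, then stage-2 index for the residual."""
    i = nearest(cb1, v)
    residual = [a - b for a, b in zip(v, cb1[i])]
    j = nearest(cb2, residual)
    return i, j


def two_stage_decode(i, j, cb1, cb2):
    """Reconstruct the vector as stage-1 entry plus stage-2 entry."""
    return [a + b for a, b in zip(cb1[i], cb2[j])]


# Toy codebooks (hypothetical values, not taken from the paper).
cb1 = [[0.0, 0.0], [1.0, 1.0]]
cb2 = [[0.0, 0.0], [0.1, -0.1]]
indices = two_stage_encode([1.1, 0.9], cb1, cb2)
reconstruction = two_stage_decode(*indices, cb1, cb2)
```

Only the two indices are transmitted, so the bit cost per frame is the sum of the two codebook index lengths.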

PARALLEL, SELF-ORGANIZING, HIERARCHICAL NEURAL NETWORKS WITH CONTINUOUS INPUTS AND OUTPUTS

IEEE TRANSACTIONS ON NEURAL NETWORKS, 1995, vol. 6, no. 5, pp. 1037-1044

Authors: ERSOY, OK; DENG, SW. School of Electrical Engineering, Purdue University, Lafayette, IN, USA; Electrical Engineering Department, Hua-Fan College

Parallel, self-organizing, hierarchical neural networks (PSHNNs) are multistage networks in which stages operate in parallel rather than in series during testing. Each stage can be any particular type of network. Previous PSHNNs assume quantized, say, binary outputs. A new type of PSHNN is discussed such that the outputs are allowed to be continuous-valued. The performance of the resulting networks is tested in the problem of predicting speech signal samples from past samples. Three types of networks in which the stages are learned by the delta rule, sequential least-squares, and the backpropagation (BP) algorithm, respectively, are described. In all cases studied, the new networks achieve better performance than linear prediction. A revised BP algorithm is discussed for learning input nonlinearities. When the BP algorithm is to be used, better performance is achieved when a single BP network is replaced by a PSHNN of equal complexity in which each stage is a BP network of smaller complexity than the single BP network.

Keywords: Neural networks; Speech; Backpropagation algorithms; Linear predictive coding; Signal processing algorithms; Automatic testing; Mean square error methods; Laser sintering; Vectors; Autocorrelation
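
As a point of reference for the linear prediction baseline mentioned in this abstract, the following is a minimal Python sketch of a conventional predictor fitted with the autocorrelation method and the Levinson-Durbin recursion. It is not the PSHNN method. The function names and the convention that the prediction is the sum of `a[k-1] * x[n-k]` over `k` are assumptions made for illustration.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[0..order-1].

    The predicted sample is x_hat[n] = sum_k a[k-1] * x[n-k], k = 1..order.
    Returns (a, err), where err is the final prediction error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Reflection coefficient for stage i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err


def predict(x, a, n):
    """Predict x[n] from the len(a) samples that precede it."""
    return sum(ak * x[n - 1 - k] for k, ak in enumerate(a))
```

A typical use is to compute `r = autocorrelation(frame, order)` on a windowed speech frame, call `levinson_durbin(r, order)`, and compare `predict(frame, a, n)` with `frame[n]` to obtain the prediction error that a nonlinear predictor would have to beat.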

DESIGN OF A PITCH SYNCHRONOUS INNOVATION CELP CODER FOR MOBILE COMMUNICATIONS

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 1995, vol. 13, no. 1, pp. 31-41

Authors: MANO, K; MORIYA, T; MIKI, S; OHMURO, H; IKEDA, K; IKEDO, J. NIPPON TELEGRAPH & TEL PUBL CORP, YOKOSUKA ELECT COMMUN LABS, RADIO COMMUN SYST LABS, YOKOSUKA, KANAGAWA 23803, JAPAN

This paper describes the design of a speech coder called pitch synchronous innovation CELP (PSI-CELP) for low bit-rate mobile communications. PSI-CELP is based on CELP, but has more adaptive excitation structures. In voiced frames, instead of conventional random excitation vectors, PSI-CELP converts even the random excitation vectors to have pitch periodicity by repeating stored random vectors as well as by using an adaptive codebook. In silent, unvoiced, and transient frames, the coder stops using the adaptive codebook and switches to fixed random codebooks. The PSI-CELP coder also implements novel structures and techniques: an FIR-type perceptual weighting filter using unquantized LPC parameters, a random codebook with a conjugate structure trained to be robust against channel errors, codebook search with delayed decision, a gain quantization with sloped amplitude, and a moving average prediction coding of LSP parameters. Our speech coder is implemented by DSP chips. Its coded speech quality at 3.6 kb/s with 2.0 kb/s redundancy is comparable to that of the Japanese full-rate VSELP coder at 6.7 kb/s with 4.5 kb/s redundancy. The basic structure of this PSI-CELP coder has been chosen as the Japanese half-rate speech codec for digital cellular telecommunications.

Keywords: Technological innovation; Speech; Redundancy; Mobile communication; Switches; Filters; Linear predictive coding; Robustness; Delay; Quantization

Subspectral modeling in filter banks

IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1995, vol. 43, no. 12, pp. 3050-3053

Authors: Benyassine, A; Akansu, AN. Department of Electrical and Computer Engineering, Center for Communications and Signal Processing Research, New Jersey Institute of Technology, University Heights, Newark, NJ, USA

This correspondence deals with spectral modeling in filter banks. It is shown, both theoretically and experimentally, that subspectral modeling is superior to full-spectrum modeling if performed before the rate change. The price paid for this performance improvement is an increase in computation. A few different signal sources were considered in this study. It is shown that the performance of AR and ARMA techniques is comparable in subspectral modeling; the former is preferred because of its simplicity. As an application of this study, we implemented a CELP-based speech codec embedded in a filter bank structure. We found no performance improvement of the subband CELP technique over the fullband case. The theoretical reasoning behind the experimental results is also given in this correspondence.

Keywords: Channel bank filters; Linear predictive coding; Filter bank; Signal resolution; Spectral analysis; Predictive models; Speech coding; Transfer functions; Speech codecs; Bandwidth

REDUCTION OF BROAD-BAND NOISE IN SPEECH BY TRUNCATED QSVD

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, vol. 3, no. 6, pp. 439-448

Authors: JENSEN, SH; HANSEN, PC; HANSEN, SD; SORENSEN, JA. TECH UNIV DENMARK, INST ELECTR, DK-2800 LYNGBY, DENMARK

We consider an algorithm for reduction of broadband noise in speech based on signal subspaces. The algorithm is formulated by means of the quotient singular value decomposition (QSVD). With this formulation, a prewhitening operation becomes an integral part of the algorithm. We demonstrate that this is essential in connection with updating issues in real-time recursive applications. We also illustrate by examples that we are able to achieve a satisfactory quality of the reconstructed signal.

Keywords: Noise reduction; Speech enhancement; Linear predictive coding; Acoustic noise; Telephony; Microphones; Noise cancellation; Speech analysis; Speech synthesis; Matrix decomposition

1.6-KBIT/S LP VOCODER USING TIME ENVELOPE

ELECTRONICS LETTERS, 1995, vol. 31, no. 7, pp. 517-519

Authors: ATKINSON, IA; KONDOZ, AM; EVANS, BG. Centre for Satellite Engineering, Speech Coding Group, University of Surrey, Guildford, United Kingdom

The authors present a linear prediction (LP) based vocoder in which speech waveforms are considered as having a time envelope, the shape of which contains important perceptual information. By ensuring that the time envelope of the synthetic speech closely matches that of the original, natural sounding synthetic speech can be achieved at 1.6 kbit/s.

Keywords: Linear predictive coding; Vocoders

CODEBOOK GENERATION FOR VECTOR QUANTIZATION

ELECTRONICS LETTERS, 1995, vol. 31, no. 7, pp. 522-523

Authors: CHEN, CQ; KOH, SN; SIVAPRAKASAPILLAI, P. School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Republic of Singapore

A novel codebook generation scheme for vector quantisation is presented. The proposed scheme is of comparable computational complexity to the Linde-Buzo-Gray (LBG) algorithm, but its performance is shown to be superior.

Keywords: Vector quantization; Linear predictive coding
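
For reference, the Linde-Buzo-Gray (LBG) baseline that this letter compares against can be sketched in Python as below. This is the conventional splitting form of the algorithm under assumed settings, not the authors' new scheme, which the abstract does not describe. The function names, the additive perturbation `eps`, and the fixed iteration count are illustrative choices.

```python
def nearest(codebook, v):
    """Index of the code-vector closest to v in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(training, size, eps=0.01, iters=20):
    """Grow a codebook by splitting until it has `size` entries.

    `size` is assumed to be a power of two.
    """
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every code-vector into a slightly perturbed pair.
        codebook = [[c + s for c in cv] for cv in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Nearest-neighbour partition of the training set.
            cells = [[] for _ in codebook]
            for v in training:
                cells[nearest(codebook, v)].append(v)
            # Centroid update; an empty cell keeps its old code-vector.
            codebook = [centroid(cell) if cell else cv
                        for cell, cv in zip(cells, codebook)]
    return codebook
```

Each pass costs one nearest-neighbour search per training vector, which is the complexity figure a competing codebook design scheme would be measured against.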

THEORETICAL-ANALYSIS OF THE HIGH-RATE VECTOR QUANTIZATION OF LPC PARAMETERS

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, vol. 3, no. 5, pp. 367-381

Authors: GARDNER, WR; RAO, BD. University of California, San Diego, CA, USA; QUALCOMM Incorporated, San Diego, CA, USA; University of California San Diego, San Diego, CA, USA

This paper presents a theoretical analysis of high-rate vector quantization (VQ) systems that use suboptimal, mismatched distortion measures, and describes the application of the analysis to the problem of quantizing the linear predictive coding (LPC) parameters in speech coding systems. First, it is shown that in many high-rate VQ systems the quantization distortion approaches a simple quadratically weighted error measure, where the weighting matrix is a "sensitivity matrix" that is an extension of the concept of the scalar sensitivity. The approximate performance of VQ systems that train and quantize using mismatched distortion measures is derived, and is used to construct better distortion measures. Second, these results are used to determine the performance of LPC vector quantizers, as measured by the log spectral distortion (LSD) measure, which have been trained using other error measures, such as mean-squared error (MSE) or weighted mean-squared error (WMSE) measures of LPC parameters, reflection coefficients and transforms thereof, and line spectral pair (LSP) frequencies. Computationally efficient algorithms for computing the sensitivity matrices of these parameters are described. In particular, it is shown that the sensitivity matrix for the LSP frequencies is diagonal, implying that a WMSE measure of LSP frequencies converges to the LSD measure in high-rate VQ systems. Experimental results to support the theoretical performance estimates are provided.

Keywords: Vector quantization; Linear predictive coding; Distortion measurement; Frequency measurement; Speech analysis; Particle measurements; Speech coding; Weight measurement; Reflection; Estimation theory
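
The log spectral distortion (LSD) named in this abstract can be approximated numerically as in the Python sketch below. It assumes unit gain, the predictor convention A(z) = 1 - sum_k a[k-1] z^-k, and a uniform frequency grid on [0, pi). The function names and the grid size are illustrative and do not come from the paper, and the sensitivity-matrix analysis itself is not reproduced here.

```python
import cmath
import math


def lpc_log_spectrum(a, n_points=256):
    """Log power spectrum in dB of the all-pole filter 1/A(z)."""
    spectrum = []
    for i in range(n_points):
        w = math.pi * i / n_points
        # A(e^{jw}) = 1 - sum_k a[k-1] * e^{-jwk}
        A = 1 - sum(ak * cmath.exp(-1j * w * (k + 1)) for k, ak in enumerate(a))
        spectrum.append(-20 * math.log10(abs(A)))
    return spectrum


def log_spectral_distortion(a_ref, a_quant, n_points=256):
    """Root-mean-square difference in dB between two LPC log spectra."""
    s_ref = lpc_log_spectrum(a_ref, n_points)
    s_quant = lpc_log_spectrum(a_quant, n_points)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(s_ref, s_quant)) / n_points)
```

Calling `log_spectral_distortion(a, a_hat)` on an original and a quantized coefficient set gives the per-frame figure in dB; averaging it over many frames is the usual way a quantizer trained under a different error measure would be scored.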
