Search Results - Inner Mongolia University Library

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Juin-Hwey Chen, Dongmei Wang (Speech Coding Research Department, AT&T Bell Laboratories, Murray Hill, NJ, USA; Georgia Institute of Technology, Atlanta, GA, USA)

ISBN: (print) 0780331923

This paper presents a novel wideband speech coding algorithm called transform predictive coding (TPC). The main emphasis is on low complexity. TPC uses short-term and long-term prediction to remove the redundancy in speech. The prediction residual is quantized in the frequency domain based on a calculated noise masking threshold. In its simplest form, the TPC coder uses only open-loop quantization and therefore has a low complexity. A 16 kb/s full-duplex, open-loop TPC coder takes only 22% of the CPU load on a 150 MHz SGI Indy workstation and about 34% on a 90 MHz Pentium PC. The speech quality of TPC is almost transparent at 32 kb/s, very good at 24 kb/s, and acceptable at 16 kb/s. In the second half of the paper, we report our recent progress in using closed-loop quantization techniques to improve TPC output speech quality.

Keywords: predictive coding; Wideband; Speech coding; linear predictive coding; Quantization; Sampling methods; Narrowband; Workstations; Speech processing; Application software
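
The short-term prediction mentioned in the abstract above is typically obtained by linear predictive analysis. The record does not say how TPC computes its predictor, so the following is only a minimal sketch of the generic autocorrelation method (Levinson-Durbin recursion); the function name `levinson_durbin` and the toy autocorrelation values are assumptions made for illustration.

```python
def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations for predictor coefficients.

    r     -- autocorrelation values r[0..order] (assumed input, r[0] > 0)
    order -- prediction order
    Returns (a, err): x[n] is predicted as sum(a[k] * x[n-1-k]) and err is
    the remaining prediction error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Reflection coefficient for this stage
        acc = r[i + 1] - sum(a[k] * r[i - k] for k in range(i))
        k_i = acc / err
        # Update the lower-order coefficients, then store the new one
        prev = a[:i]
        for k in range(i):
            a[k] = prev[k] - k_i * prev[i - 1 - k]
        a[i] = k_i
        err *= 1.0 - k_i * k_i
    return a, err


# Toy autocorrelation of a first-order process: r[k] = 0.5 ** k
coeffs, residual_energy = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For this toy input the recursion recovers a single non-zero coefficient, since a first-order process needs only one predictor tap.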

ADAPTIVE POSTFILTER IN 16KBPS LD-CELP SPEECH CODER

1996 3rd International Conference on Signal Processing(ICSP’96)

Authors: Wang Bingxi, He Yinghua (Zhengzhou, Henan, China)

ISBN: (print) 0780329120

In September 1992, Recommendation G.728, a 16 kbps LD-CELP speech coder submitted by AT&T, was standardized by *** the process of ratification test [1], the coder's performance was equivalent to or better than that of 32 kbps ADPCM for all conditions *** paper, which is based on a G.728 encoding-decoding system simulated in software, studies and tests different parts of the algorithm, especially the postfilter.

Keywords: adaptive filters; speech coding; linear predictive coding; filtering theory; adaptive postfilter; LD-CELP speech coder; G.728 encoding-decoding system; CCITT; 16 kbit/s

Research on ASIC for multi-speaker isolated word recognition

2nd International Conference on ASIC

Authors: Xiong, B; Sun, YH (Institute of Microelectronics, Tsinghua University, Beijing, China)

ISBN: (print) 7543909405

An ASIC for multi-speaker speech recognition is designed in this paper. LPC-derived cepstral coefficients are chosen as the speech features. Templates are trained with the K-means clustering algorithm. A two-stage recognition system not only improves recognition accuracy but also reduces the delay. The first stage of the recognition system uses the speech spectrum difference (SSD) algorithm. The second stage uses DTW. The whole recognition system is designed as an ASIC at a high level in VHDL and simulated in Powerview.

Keywords: speech recognition; linear predictive coding; application specific integrated circuits; cepstral analysis; ASIC; multi-speaker isolated word recognition; LPC cepstral coefficients; K-means clustering algorithm; template training; speech spectrum difference algorithm; DTW; high level design; VHDL; Powerview simulation
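
The second recognition stage named in the abstract above is DTW (dynamic time warping). As a rough illustration of template matching with DTW, here is a minimal sketch; it uses the textbook unconstrained recursion with a squared Euclidean frame distance, which is an assumption and not the authors' hardware design. The names `dtw_distance` and `recognize` and the one-dimensional toy features are hypothetical.

```python
def dtw_distance(template, utterance):
    """Accumulated DTW cost between two sequences of feature vectors."""
    def frame_dist(u, v):
        # Squared Euclidean distance between two feature vectors
        return sum((x - y) ** 2 for x, y in zip(u, v))

    n, m = len(template), len(utterance)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(template[i - 1], utterance[j - 1])
            # Allow a match, an insertion, or a deletion at each step
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(templates, utterance):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(templates[label], utterance))


# Toy one-dimensional "features"; a real system would use cepstral vectors
templates = {"yes": [[0.0], [1.0], [2.0]], "no": [[2.0], [1.0], [0.0]]}
spoken = [[0.0], [1.0], [1.0], [2.0]]
best_label = recognize(templates, spoken)
```

The toy utterance is a time-stretched copy of the "yes" template, so warping absorbs the repeated frame at zero cost.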

A 2.4kbps MBE-LPC speech codec algorithm suitable for VLSI implementation

2nd International Conference on ASIC

Authors: Yang, HM; Chen, HY; Sun, YH (The Institute of Microelectronics, Tsinghua University, Beijing, China)

ISBN: (print) 7543909405

A speech coding/decoding algorithm that combines the MBE and LPC speech models is proposed. In this model, the spectral envelope is represented using linear prediction coefficients, which are coded using Line Spectrum Frequencies (LSFs). It can operate at 2.4 kbps with much higher synthesized speech quality than LPC-10e and lower computational complexity than CELP, VSELP, and similar coders. It is therefore particularly attractive for VLSI implementation.

Keywords: speech codecs; VLSI; linear predictive coding; computational complexity; speech codec algorithm; VLSI implementation; MBE; LPC; spectral envelope; linear prediction coefficients; line spectrum frequencies; computation complexity; multiband excitation; 2.4 kbit/s

Transparent quantization of speech LSP parameters based on KLT and 2-D-prediction

IEEE Transactions on Speech and Audio Processing, 1996, vol. 4, no. 1, pp. 60-66

Authors: Jean, FR; Wang, HC (Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan)

In this correspondence, a two-stage approach based on the Karhunen-Loeve transform and 2-D prediction is proposed for efficient quantization of the line spectrum pair (LSP) parameters of speech. In addition, a switched classifier is incorporated into this approach to reduce the outlier frames (spectral distortion greater than 2 dB) to about 0.27% and to eliminate frames with spectral distortion greater than 4 dB at an average bit rate below 19 b/frame.

Keywords: Quantization; Speech; Karhunen-Loeve transforms; Distortion measurement; linear predictive coding; Interpolation; Councils; Bit rate; predictive models; Vectors
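
For context on the 2 dB and 4 dB thresholds in the abstract above: spectral distortion between an LPC power spectrum and its quantized version is commonly defined as the root-mean-square difference of the two log spectra. The form below is the usual textbook definition and may differ from the exact measure or frequency band used by the authors; the symbols $P$, $\hat{P}$, and $\omega$ are notation introduced here.

```latex
\mathrm{SD} = \sqrt{\frac{1}{2\pi}\int_{-\pi}^{\pi}
  \left[ 10\log_{10} P(\omega) - 10\log_{10} \hat{P}(\omega) \right]^{2} d\omega}
  \quad \text{(dB)}
```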

Unknown-multiple signal source clustering problem using ergodic HMM and applied to speaker classification

International Conference on Spoken Language, ICSLP

Authors: J. Murakami, M. Sugiyama, H. Watanabe (Information and Communication Systems Laboratories, NTT, Japan; School of Computer Science and Engineering, University of Aizu, Japan)

The authors consider signals originated from a sequence of sources. More specifically, the problems of segmenting such signals and relating the segments to their sources are addressed. This issue has wide applications in many fields. The report describes a resolution method that is based on an ergodic hidden Markov model (HMM), in which each HMM state corresponds to a signal source. The signal source sequence can be determined by using a decoding procedure (Viterbi algorithm or forward algorithm) over the observed sequence. Baum-Welch training is used to estimate HMM parameters from the training material. As an example of the multiple signal source classification problem, an experiment is performed on unknown speaker classification. The results show a classification rate of 79% for 4 male speakers. The results also indicate that the model is sensitive to the initial values of the ergodic HMM and that employing the long-distance LPC cepstrum is effective for signal preprocessing.

Keywords: Hidden Markov models; Loudspeakers; Speech; Viterbi algorithm; Parameter estimation; linear predictive coding; Cepstrum; Computer science; Laboratories; Electronic mail
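
The abstract above decodes the source sequence with the Viterbi algorithm over an ergodic HMM in which each state stands for one signal source. A minimal log-domain sketch of that decoding step follows; the function name `viterbi`, the two-state model, and all probabilities are illustrative assumptions, not values from the paper.

```python
import math


def viterbi(log_init, log_trans, log_obs):
    """Most likely state sequence of a fully connected (ergodic) HMM.

    log_init[s]     -- log initial probability of state s
    log_trans[p][s] -- log transition probability from state p to state s
    log_obs[t][s]   -- log likelihood of frame t under state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    back = []
    for t in range(1, len(log_obs)):
        new_score, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at time t
            best = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            ptr.append(best)
            new_score.append(score[best] + log_trans[best][s] + log_obs[t][s])
        score = new_score
        back.append(ptr)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Two hypothetical sources with "sticky" transitions
log = math.log
log_init = [log(0.5), log(0.5)]
log_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
# Three frames that favor source 0, then three that favor source 1
log_obs = [[log(0.8), log(0.2)]] * 3 + [[log(0.2), log(0.8)]] * 3
path = viterbi(log_init, log_trans, log_obs)
```

Each entry of `path` is the index of the source assigned to the corresponding frame, so segment boundaries are the positions where the index changes.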

Extension and complexity reduction of TwinVQ audio coder

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: T. Moriya, N. Iwakami, K. Ikeda, S. Miki (NTT Human Interface Laboratories, Musashino, Tokyo, Japan)

This paper proposes two novel techniques for the TwinVQ (transform domain weighted interleave VQ) high-quality audio coding scheme at rates lower than 64 kbit/s. One is an extension of the weighted interleave technique to the time and input channel domains as well as the frequency domain. The other is an efficient representation of the spectral envelope by means of an interpolated square root LPC (linear predictive coding) spectrum.

Keywords: linear predictive coding; Bit rate; Robustness; Audio coding; Frequency domain analysis; ISO standards; IEC standards; Quantization; Decoding; Humans

Improvement of the diver speech intelligibility in underwater communications using LPC

OCEANS

Authors: A.V. Vassilev (Department of Electronics, Technical University of Varna, Varna, Bulgaria)

This paper describes models of an oxy-helium speech corrector, speech coding, and a mask corrector applicable to underwater voice communication systems. The three problems have been solved using linear predictive coding...

Keywords: Underwater communication; linear predictive coding; Speech coding; Resonance; Speech analysis; Acoustic noise; Atmospheric modeling; Digital signal processing; Helium; Atmosphere

Incremental speaker adaptation with minimum error discriminative training for speaker identification

International Conference on Spoken Language, ICSLP

Authors: C.M. del Alamo, J. Alvarez, C. de la Torre, F.J. Poyatos, L. Hernandez (Speech Technology Group, Telefónica Investigación y Desarrollo, Madrid, Spain; Universidad Alfonso X El Sabio, Madrid, Spain; E.T.S.I. Telecomunicación, Universidad Politécnica de Madrid, Spain)

The minimum classification error (MCE) has been shown to be effective in improving the performance of a speaker identification system. However, there are still problems to solve, such as the variability of the voice characteristics of a particular speaker through time. In this paper, we analyze the degradation of a Gaussian mixture model (GMM) based text-independent speaker identification system when using test data recorded over six months after the training session, and, in an attempt to avoid this degradation, we study the use of supervised adaptation based on maximum a posteriori (MAP) estimation and MCE. These techniques have been shown to provide good results for speaker adaptation in speech recognition. The major result we have obtained is that, by starting with GMM models trained with only speech from session 1, similar identification results can be obtained for all the other sessions using an incremental adaptation using only 2.5 seconds of speech per speaker and session as data for the MCE training adaptation procedure. We have also found that, in our extreme experimental setup, MAP becomes unhelpful when combined with MCE adaptation.

Keywords: Databases; Speaker recognition; Hidden Markov models; Training data; Parameter estimation; Loss measurement; State estimation; Density functional theory; linear predictive coding; Cepstral analysis

Voiced/unvoiced/silence Classification of Speech Using 2-Stage Neural Networks with Delayed Decision Input

International Symposium on Signal Processing and Its Applications (ISSPA)

Authors: R. Ahn, W.H. Holmes (Speech & Signal Processing Laboratory, School of Electrical Engineering, University of New South Wales, Sydney, Australia)
