Search Results - Inner Mongolia University Library

Design and implementation of a parametric speech coder

IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 1998, vol. 44, no. 1, pp. 163-169

Authors: Kwong, S; Nui, PT (City Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong)

Speech coders based on linear predictive coding have been found to model speech utterances accurately. Beyond this accuracy, linear predictive coding has the following properties: i) it provides a good estimate of the vocal tract spectral envelope; ii) it is analytically tractable; iii) it can be implemented in either software or hardware; iv) it requires less data storage than many other speech coding approaches. In this paper, a hybrid speech coder based on Code Excited linear predictive coding (CELPC) and Voice Excited linear predictive coding (VELPC) is presented. CELPC is currently one of the main techniques for producing high quality speech at around 4.8 kbps. However, the computational requirement of the original CELPC was too demanding for practical use. Therefore, a hybrid approach that produces high quality speech signals at a bit rate of 3.478 kbps has been proposed. This method splits the speech signal into two portions in the frequency domain, the base-band signal and the high-band signal. The base-band and high-band signals are then coded using the CELPC and VELPC techniques, respectively.

Keywords: Speech coding; linear predictive coding; Signal processing; Speech analysis; Nonlinear filters; predictive models; Bit rate; Quantization; Encoding; Computer science
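
As a rough illustration of the linear predictive analysis that coders of this kind build on, the sketch below estimates predictor coefficients from a frame's autocorrelation with the Levinson-Durbin recursion. It is not the coder described in the abstract; the function names, the sign convention and the sample autocorrelation values are assumptions made for the example.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Returns (a, err) with a[0] == 1. Assumed convention: the predictor is
    x_hat[n] = -sum(a[j] * x[n - j] for j in 1..order), and err is the
    remaining prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = float(r[0])
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (for example all zeros): stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


# Illustrative autocorrelation of a first-order process with a pole at 0.5.
coeffs, residual_energy = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this input the recursion recovers a single non-trivial coefficient, which is what a first-order process should give; a real coder would run it on windowed speech frames with an order of roughly 10.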

Time-Varying Feature Selection and Classification of Unvoiced Stop Consonants

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, vol. 2, no. 3, pp. 395-405

Authors: Nathan, Krishna S.; Silverman, Harvey F. (Brown Univ, Div Engn, LEMS, Providence, RI 02912, USA)

A feature set that captures the dynamics of formant transitions prior to closure in a VCV environment is used to characterize and classify the unvoiced stop consonants. The feature set is derived from a time-varying, data-selective model for the speech signal. Its performance is compared with that of comparable formant data from a standard delta-LPC-based model. The different feature sets are evaluated on a database composed of eight talkers. A 40% reduction in classification error rate is obtained by means of the time-varying model. The performance of three different classifiers is discussed. A novel adaptive algorithm, termed learning vector classifier (LVC), is compared with standard K-means and LVQ2 classifiers. LVC is a supervised learning classifier that improves performance by increasing the resolution of the decision boundaries. Error rates obtained for the three-way (p, t, and k) classification task using LVC and the time-varying analysis are comparable to those of techniques that make use of additional discriminating information contained in the burst. Further improvements are expected when an expanded time-varying feature set is utilized, coupled with information from the burst.

Keywords: Speech; Error analysis; linear predictive coding; Spatial databases; Adaptive algorithm; Supervised learning; Information analysis; Lips; Teeth

MAXIMUM-LIKELIHOOD-ESTIMATION OF THE PARAMETERS OF DISCRETE FRACTIONALLY DIFFERENCED GAUSSIAN-NOISE PROCESS

IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1993, vol. 41, no. 10, pp. 2977-2990

Authors: DERICHE, M; TEWFIK, AH (Department of Electrical Engineering, University of Minnesota, Minneapolis, MN, USA)

A maximum likelihood estimation procedure is constructed for estimating the parameters of discrete fractionally differenced Gaussian noise from an observation set of finite size N. The procedure does not involve the computation of any matrix inverse or determinant. It requires N^2/2 + O(N) operations. The expected value of the log-likelihood function for estimating the parameter d of fractionally differenced Gaussian noise (which corresponds to a parameter of the equivalent continuous-time fractional Brownian motion related to its fractal dimension) is shown to have a unique maximum within the range of allowable values of d. The maximum occurs at the true value of d. A Cramer-Rao bound on the variance of any unbiased estimate of d obtained from a finite size observation set is derived. It is shown experimentally that the maximum likelihood estimate of d is unbiased and efficient when finite size data sets are used in the estimation procedure. The proposed procedure is also extended to deal with noisy observations of discrete fractionally differenced Gaussian noise.

Keywords: Maximum likelihood estimation; Gaussian noise; Parameter estimation; Brownian motion; Fractals; Signal processing; Speech processing; linear predictive coding; Filters; Motion estimation
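
To make the role of the parameter d more concrete, the following sketch computes two standard quantities for a fractionally differenced process: the coefficients of the differencing operator (1 - B)^d, and the normalized autocorrelation of the resulting noise. It does not reproduce the estimation procedure of the abstract; the function names are hypothetical, and the recursions are the textbook ones for this process class.

```python
def frac_diff_weights(d, n_terms):
    """First n_terms coefficients of (1 - B)^d, with B the unit-delay operator.

    Uses the recursion w[0] = 1, w[k] = w[k-1] * (k - 1 - d) / k.
    """
    w = [1.0]
    for k in range(1, n_terms):
        w.append(w[-1] * (k - 1 - d) / k)
    return w


def fdgn_autocorrelation(d, max_lag):
    """Normalized autocorrelation rho[0..max_lag], assuming -0.5 < d < 0.5.

    Uses the recursion rho[0] = 1, rho[k] = rho[k-1] * (k - 1 + d) / (k - d).
    """
    rho = [1.0]
    for k in range(1, max_lag + 1):
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    return rho


weights = frac_diff_weights(0.5, 3)   # starts 1, -0.5, -0.125
rho = fdgn_autocorrelation(0.25, 5)   # decays slowly for d > 0 (long memory)
```

A likelihood-based estimator of d would compare such a model autocorrelation with the observed data; the slow decay of rho for positive d is the long-memory behaviour that makes the problem non-trivial.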

The Semi-Variogram and Spectral Distortion Measures for Image Texture Retrieval

IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, vol. 25, no. 4, pp. 1556-1565

Author: Pham, Tuan D. (Linkoping Univ, Dept Biomed Engn, S-58183 Linkoping, Sweden)

Semi-variogram estimators and distortion measures of signal spectra are utilized in this paper for image texture retrieval. On the use of the complete Brodatz database, most high retrieval rates are reportedly based on multiple features and the combinations of multiple algorithms, while the classification using single features is still a challenge to the retrieval of diverse texture images. The semi-variogram, which is theoretically sound and the cornerstone of spatial statistics, has the characteristics shared between true randomness and complete determinism and, therefore, can be used as a useful tool for both the structural and statistical analysis of texture images. Meanwhile, spectral distortion measures derived from the theory of linear predictive coding provide a rigorously mathematical model for signal-based similarity matching and have been proven useful for many practical pattern classification systems. Experimental results obtained from testing the proposed approach using the complete Brodatz database and the University of Illinois at Urbana-Champaign texture database suggest the effectiveness of the proposed approach as a single-feature-based dissimilarity measure for real-time texture retrieval.

Keywords: Texture analysis; image retrieval; geostatistics; semi-variogram; linear predictive coding; spectral distortion measures
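
For readers unfamiliar with the semi-variogram, a minimal sketch of the classical empirical estimator is given below. It works on a one-dimensional sequence such as a single image row, which is a simplification assumed here; the abstract does not say which estimator variants or sampling directions the paper uses, and the function name is hypothetical.

```python
def semivariogram(z, max_lag):
    """Empirical semi-variogram gamma(h) for lags h = 1..max_lag.

    gamma(h) = sum((z[i] - z[i + h])**2) / (2 * N(h)),
    where N(h) is the number of sample pairs separated by h.
    """
    gamma = []
    for h in range(1, max_lag + 1):
        pairs = len(z) - h
        if pairs <= 0:
            break  # no sample pairs left at this lag
        squared = sum((z[i] - z[i + h]) ** 2 for i in range(pairs))
        gamma.append(squared / (2.0 * pairs))
    return gamma


# A linear ramp: the semi-variogram grows with the lag.
print(semivariogram([1, 2, 3, 4], max_lag=2))  # [0.5, 2.0]
```

The resulting curve can then be treated as a signal in its own right, which is where a spectral distortion measure could be applied for matching.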

Fast Methods for Code Search in CELP

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1993, vol. 1, no. 3, pp. 315-325

Authors: Ahmed, M. Elshafei; Al-Suwaiyel, M. I. (King Fahd Univ Petr & Minerals, Dhahran 31261, Saudi Arabia; King Abdul Aziz City Sci & Technol, Riyadh 11442, Saudi Arabia)

The Code Excited linear predictive (CELP) technique has the potential for producing high quality synthetic speech at bit rates as low as 4.8 kb/s. Most of the complexity in the CELP coders comes from the search used to select an optimal excitation sequence from a code book of stochastic vectors. This paper describes three fast search methods. The key idea here is to inverse filter the actual speech by the formant and pitch filters to produce a residual error sequence (RES). The residual error is used to identify a neighborhood or a subset of codes for further processing. The first method, called Dynamic Nearest Neighborhood (DNN), attempts to dynamically construct a neighborhood of the 6 codes of maximum correlation with the residual error. The second method, called Nearest Fixed Neighborhood (NFN), clusters the code book into a fixed number of cells; the code search is then performed on the codes of the cell nearest to the RES. The two methods reduce the search effort by a factor of 8 to 20. The third method combines the advantages of the first two methods to attain a 40- to 50-fold reduction in operations. The performance of these techniques and some of their ramifications will also be addressed.

Keywords: Nonlinear filters; Speech coding; Books; Digital filters; Speech synthesis; linear predictive coding; Speech enhancement; White noise; Hafnium; Cities and towns
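
The two-stage idea in this abstract (shortlist codes by their correlation with the residual, then run the expensive search only on the shortlist) can be sketched as follows. This is a simplified illustration and not the paper's DNN or NFN algorithm: the synthesis filter is reduced to a short impulse response `h`, ranking by absolute correlation is an assumption, and all names and the toy code book are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def synthesize(h, code):
    """Convolve a code vector with impulse response h, truncated to the frame."""
    return [sum(h[k] * code[i - k] for k in range(min(i + 1, len(h))))
            for i in range(len(code))]


def shortlist(residual, codebook, size):
    """Cheap stage: indices of the codes most correlated with the residual."""
    ranked = sorted(range(len(codebook)),
                    key=lambda i: abs(dot(residual, codebook[i])),
                    reverse=True)
    return ranked[:size]


def search(target, h, codebook, candidates):
    """Costly stage: best candidate under the gain-optimized squared-error criterion."""
    best, best_score = None, -1.0
    for i in candidates:
        y = synthesize(h, codebook[i])
        energy = dot(y, y)
        if energy == 0.0:
            continue  # an all-zero code cannot match anything
        score = dot(target, y) ** 2 / energy  # larger score means smaller error
        if score > best_score:
            best, best_score = i, score
    return best


codebook = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [-1, 0, 0]]  # toy code book
h = [1.0, 0.5]                                            # toy synthesis filter
residual = [0, 1, 0]
target = synthesize(h, residual)
chosen = search(target, h, codebook, shortlist(residual, codebook, size=2))
```

The saving comes from the fact that only the shortlisted codes are passed through the synthesis filter, whereas an exhaustive search filters every code in the code book.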

On-line signature verification using LPC cepstrum and neural networks

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 1997, vol. 27, no. 1, pp. 148-153

Authors: Wu, QZ; Jou, IC; Lee, SY (TELECOMMUN LABS, CHUNGLI, TAIWAN)

In this paper, an on-line signature verification scheme based on linear prediction coding (LPC) cepstrum and neural networks is proposed. Cepstral coefficients derived from linear predictor coefficients of the writing trajectories are calculated as the features of the signatures. These coefficients are used as inputs to the neural networks. A number of single-output multilayer perceptrons (MLP's), as many as the number of words in the signature, are equipped for each registered person to verify the input signature. If the summation of output values of all MLP's is larger than the verification threshold, the input signature is regarded as a genuine signature; otherwise, the input signature is a forgery. Simulations show that this scheme can detect the genuineness of the input signatures from our test database with an error rate as low as 4%.

Keywords: Handwriting recognition; linear predictive coding; Cepstrum; Neural networks; Cepstral analysis; Writing; Trajectory; Multilayer perceptrons; Forgery; Testing
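
The step from linear predictor coefficients to cepstral coefficients mentioned in this abstract can be sketched with the standard recursion below. It assumes the all-pole model 1/A(z) with A(z) = 1 + a1 z^-1 + ... + ap z^-p; the paper's own sign convention and cepstral order are not given in the abstract, so the function name and conventions here are assumptions.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1/A(z).

    a = [1, a1, ..., ap] with A(z) = 1 + sum(a[k] * z**-k).
    Recursion: c[n] = -a[n] - (1/n) * sum(k * c[k] * a[n-k] for k in 1..n-1),
    where a[n] is taken as 0 for n > p.
    """
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        a_n = a[n] if n <= p else 0.0
        acc = 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += k * c[k] * a[n - k]
        c[n] = -a_n - acc / n
    return c[1:]


# Single pole at 0.5: the cepstrum is 0.5**n / n.
ceps = lpc_to_cepstrum([1.0, -0.5], n_ceps=3)
```

In a verification scheme of the kind described, vectors like `ceps` computed per trajectory segment would serve as the network inputs.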

NETWORK-BASED ISOLATED DIGIT RECOGNITION USING VECTOR QUANTIZATION

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, vol. 33, no. 4, pp. 850-867

Authors: KOPEC, GE; BUSH, MA (Schlumberger Computer Aided Systems Research, Palo Alto, CA, USA)

This paper describes a network-based approach to speaker-independent digit recognition. The digits are modeled by a pronunciation network whose arcs represent classes of acoustic-phonetic segments. Each arc is associated with a matcher for rating an input speech interval as an example of the corresponding segment class. The matchers are based on vector quantization of LPC spectra. Recognition involves finding a minimum quantization distortion path through the network by dynamic programming. The system has been evaluated in an extensive series of speaker-independent isolated digit (one-nine, oh and zero) recognition experiments using a 225-talker, multidialect database developed by Texas Instruments (TI). The best recognizer configurations achieved accuracies of 97-99 percent on the TI database.

Keywords: Vector quantization; Hidden Markov models; Pattern matching; Databases; Vocabulary; Acoustic testing; Speech; linear predictive coding; Dynamic programming; Instruments
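
A matcher of the kind this abstract describes can be pictured as scoring a speech interval by how well a class-specific code book quantizes its frames. The sketch below uses plain squared Euclidean distance on generic feature vectors, which is an assumption: the paper works with LPC spectra and its distortion measure is not stated in the abstract. All names are hypothetical.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(frame, codebook):
    """Return (index, distortion) of the nearest codeword for one frame."""
    distortions = [squared_distance(frame, word) for word in codebook]
    index = min(range(len(codebook)), key=distortions.__getitem__)
    return index, distortions[index]


def segment_distortion(frames, codebook):
    """Average quantization distortion of an interval against one code book."""
    if not frames:
        raise ValueError("segment must contain at least one frame")
    return sum(quantize(f, codebook)[1] for f in frames) / len(frames)


codebook = [[0.0, 0.0], [1.0, 1.0]]  # toy two-word code book
score = segment_distortion([[0.1, 0.0], [0.9, 1.0]], codebook)
```

With one such score per network arc and interval, a dynamic programming pass over the pronunciation network could pick the path with the lowest accumulated distortion.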

A FREQUENCY-WEIGHTED ITAKURA SPECTRAL DISTORTION MEASURE AND ITS APPLICATION TO SPEECH RECOGNITION IN NOISE

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1988, vol. 36, no. 1, pp. 41-48

Authors: SOONG, FK; SONDHI, MM (AT and T Bell Laboratories Inc., Murray Hill, NJ, USA)

The authors propose an adaptively weighted Itakura distortion measure. They studied its effects on the performance of a conventional dynamic time-warping (DTW)-based speech recognizer in a series of speaker-independent, isolated-digit-recognition experiments. The equivalent SNR improvement achieved by using the proposed weighted Itakura distortion at low SNRs is about 5-7 dB.

Keywords: Distortion measurement; Frequency measurement; Speech recognition; linear predictive coding; Acoustic distortion; Gain measurement; Noise measurement; Testing; Signal to noise ratio; predictive models
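
The DTW recognizer mentioned in this abstract aligns a test utterance with each reference template and accumulates a frame-level distortion along the best path. A minimal sketch with a pluggable local distance is shown below; the absolute difference on scalars stands in for the weighted Itakura distortion, which is not reproduced here, and the step pattern and function name are assumptions.

```python
def dtw_distance(ref, test, local_dist):
    """Accumulated distortion of the best alignment between two sequences.

    Uses the basic step pattern (insertion, deletion, match) without slope
    constraints or path-length normalization.
    """
    inf = float("inf")
    n, m = len(ref), len(test)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_dist(ref[i - 1], test[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # ref frame repeated
                                 cost[i][j - 1],      # test frame repeated
                                 cost[i - 1][j - 1])  # one-to-one match
    return cost[n][m]


def abs_diff(a, b):
    return abs(a - b)


# The second sequence is a time-stretched copy of the first, so the cost is zero.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3], abs_diff))  # 0.0
```

Swapping `abs_diff` for a spectral distortion between LPC frames turns this into the template matcher of a conventional isolated-word recognizer, with the digit chosen as the template of lowest accumulated cost.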

Discriminative training for neural predictive coding applied to speech features extraction

International Joint Conference on Neural Networks (IJCNN)

Authors: M. Chetouani; B. Gas; J.L. Zarader; C. Chavy (Laboratoire des Instruments et Systèmes d'Ile de France (LISIF), Université Paris VI, Paris Cedex 05, France)

We present a predictive neural network called neural predictive coding (NPC). This model is used for nonlinear discriminant feature extraction applied to phoneme recognition. We validate the nonlinear prediction improvement of the NPC model. We also present a new extension of the NPC model: NPC-3. In order to evaluate the performance of the NPC-3 model, we carried out a study of Darpa-Timit phoneme recognition (in particular the /b/, /d/, /g/ and /p/, /t/, /q/ phonemes). Comparisons with traditional coding methods are presented. We also show how an adaptive constraint allows improvements on the recognition task.

Keywords: predictive coding; Speech coding; Feature extraction; predictive models; Nonlinear filters; linear predictive coding; Neural networks; Speech recognition; Mel frequency cepstral coefficient; Instruments

Adaptively Weighted L2-Minimization in Predictive Speech Coding

MILCOM, Military Communications Conference

Authors: George Benke; L. Thomas Ramsey (Speech and Signal Processing Laboratory, Georgetown University, McLean, VA, USA; MITRE Corporation, Speech and Signal Processing Laboratory, McLean, Virginia 22102)

Current narrow-band speech coding algorithms (for transmission rates of 2400-4800 bits per second) typically excite linear filters with impulse trains to model voiced speech. The excitation function that would reproduce the speech exactly is the prediction residual; however, the usual selection of filter coefficients does not produce the most pulse-like prediction residuals. Thus other choices for filters offer an opportunity to improve the quality of narrow-band coding. The strategy of this paper is to minimize a dynamically weighted prediction error, to allow the largest values of the prediction residuals to be unconstrained and thus make the residuals more pulse-like. The idea was tested on voiced speech with over 300 predictive models, which contained quadratic and cubic terms as well as linear. Models with fewer than eight terms were not enhanced. The idea worked well with other models, particularly those with 8 to 11 terms.

Keywords: predictive models; Speech coding; linear predictive coding; Speech synthesis; Narrowband; Nonlinear filters; Speech enhancement; Speech processing; Signal processing algorithms; Testing
