Search Results - Inner Mongolia University Library

The MDL criterion for rank determination via effective singular values

IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1998, vol. 46, no. 6, pp. 1741-1744

Author: Zarowski, CJ Queens Univ Dept Elect & Comp Engn Kingston ON K7L 3N6 Canada

Konstantinides and Yao have considered the problem of rank determination by use of effective singular values. In this correspondence, we show how to use the minimum description length criterion of Rissanen to provide an alternative means of estimating the index of the smallest nonzero singular value of a matrix when given estimates of the singular values.

Keywords: Finite impulse response filter Acoustic noise Speech enhancement linear predictive coding Signal to noise ratio Speech analysis Speech synthesis Band pass filters Signal processing Speech coding


A SIMPLE NONITERATIVE SPEECH EXCITATION ALGORITHM USING THE LPC RESIDUAL

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, vol. 33, no. 2, pp. 432-434

Author: ALEXANDER, ST Department of Electrical and Computer Engineering Center for Communications and Signal Processing North Carolina State University Raleigh NC USA

This paper provides an analytical derivation of a simple noniterative technique for extracting a multiple impulse excitation model for synthesized speech directly from the LPC residual sequence. While suboptimal with respect to "multipulse" techniques, this method is very applicable for speech enhancement where processor capability is limited. The results suggest an additional "orthogonality" requirement between the excitation sequence and the resulting prediction error, which aids in the intuitive understanding of the method.

Keywords: linear predictive coding Speech synthesis Speech enhancement Bit rate Detectors Degradation Speech analysis Filters Signal processing algorithms Switches


Joint MELP turbo code with unequal error protection

ELECTRONICS LETTERS, 2001, vol. 37, no. 10, pp. 637-639

Authors: Wang, Q Koh, SN Nanyang Technol Univ Sch Elect & Elect Engn Singapore 639798 Singapore

The mixed excitation linear prediction (MELP) algorithm has recently been selected as the new federal standard for 2.4 kbit/s coding of speech signals. The authors exploit the average residual inter-frame correlation and the error sensitivities of the bits in a MELP frame to enhance the robustness of the proposed joint MELP turbo coding schemes for operation over Rayleigh fading channels.

Keywords: BER speech coding mixed excitation linear prediction algorithm error statistics error detection combined source-channel coding turbo codes Codes unequal error protection correlation theory linear predictive coding residual inter-frame correlation Rayleigh channels bit error sensitivities Speech and audio coding Rayleigh fading channels Information theory joint MELP turbo code 2.4 kbit/s error correction MELP algorithm speech signals


THE EFFECT OF NARROW-BAND DIGITAL PROCESSING AND BIT ERROR RATE ON THE INTELLIGIBILITY OF ICAO SPELLING ALPHABET WORDS

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1987, vol. 35, no. 8, pp. 1101-1115

Author: SCHMIDTNIELSEN, A Naval Research Laboratory Inc. Washington D.C. DC USA

The diagnostic rhyme test (DRT) is widely used to evaluate digital voice systems. Would-be users often have no reference frame for interpreting DRT scores in terms of performance measures that they can understand, e.g., how many operational words are correctly understood. This research was aimed at providing a better understanding of the effects of very poor quality speech on human communication performance. It is especially important to determine how successful communications are likely to be when the speech quality is severely degraded. This paper compares the recognition of ICAO spelling alphabet words (ALFA, BRAVO, CHARLIE, etc.) to DRT scores for the same conditions. Confusions among the spelling alphabet words are also given. The voice conditions included unprocessed speech, speech processed through the DoD standard linear predictive coding algorithm operating at 2400 bits/s with random bit error rates of 0, 2, 5, 8, and 12 percent, and an 800 bit/s pattern matching algorithm. The results suggest that with distinctive vocabularies like the ICAO spelling alphabet, word intelligibility can be expected to remain very high even when DRT scores fall into the poor range; but once the DRT scores fall below about 75, the intelligibility can be expected to fall off rapidly; and at scores below 50, less than half the words will be understood.

Keywords: Narrowband Bit error rate Speech processing Speech coding System testing Humans Degradation linear predictive coding Prediction algorithms Pattern matching


ON THE APPLICATION OF MIXTURE AR HIDDEN MARKOV-MODELS TO TEXT INDEPENDENT SPEAKER RECOGNITION

IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1991, vol. 39, no. 3, pp. 563-570

Author: TISHBY, NZ AT&T Bell Lab. Murray Hill NJ USA

Linear predictive hidden Markov models have proved to be an efficient way for statistically modeling speech signals. The possible application of such models to statistical characterization of the speaker himself is described and evaluated. The results show that even with a short sequence of only four isolated digits, a speaker can be verified with an average equal-error rate of less than 3%. These results are slightly better than the results obtained using speaker dependent vector quantizers, with comparable numbers of spectral vectors. The small improvement over the vector quantization approach indicates the weakness of the Markovian transition probabilities for characterizing speaker dependent transitional information.

Keywords: Hidden Markov models Speaker recognition predictive models Vector quantization Testing Error analysis Speech recognition linear predictive coding Cepstral analysis Distortion measurement


Hierarchical particle filter for bearings-only tracking

IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2007, vol. 43, no. 4, pp. 1567-1585

Authors: Brehard, T. Le Cadre, J.-P. IRISA CNRS F-35042 Rennes France

We address here the classical bearings-only tracking (BOT) problem for a single target, an issue that belongs to the general class of nonlinear filtering problems. Recently, algorithms based on sequential Monte-Carlo methods (particle filtering) have been proposed. However, Fearnhead has observed that in practice this algorithm diverges. This problem is investigated further here. We show that this phenomenon is due to the unobservability of the distance between the observer and the target. We propose a new algorithm, named the hierarchical particle filter, which takes this aspect of the BOT into account. We demonstrate that this novel filter architecture greatly outperforms the classical one. Moreover, these results are confirmed when considering highly maneuvering target scenarios. Finally, we propose a general architecture based on Monte-Carlo methods for filtering initialization, able to accommodate poor priors and complex constraints.

Keywords: Particle filters Particle tracking Target tracking Nonlinear equations linear predictive coding Trajectory Time measurement Observability Filtering algorithms Sonar


Forensic speaker and gender identification from voice samples recorded through mobile phones and social media applications: A statistical and machine learning approach

APPLIED ACOUSTICS, 2024, vol. 222

Authors: Gouri, Gulshan Sharma, Arushi Sharma, Vishal Panjab Univ Inst Forens Sci & Criminol Chandigarh India

The proliferation of the Internet and instant messaging has elevated the use of voice communication, posing challenges in legal and forensic contexts. This study delves into the impact of four social media platforms (WhatsApp, Instagram, Snapchat, and Telegram) on the acoustic properties of vowel sounds. Forty participants (20 males and 20 females) were recorded producing English monophthong vowels under five conditions: using a mobile phone, WhatsApp, Instagram, Telegram, and Snapchat. Utilizing Multispeech and Praat speech acoustic software, we analyzed formant frequencies, employed statistical F tests, and applied machine learning techniques to assess the influence of social media applications on formant frequency (F1, F2, F3, and F4). The study findings demonstrated that F1, F2, and F3 effectively distinguished vowels, with accuracy rates ranging from 75% to 100% when considering formant frequency. However, in the formant listing, F2, F3, and F4 emerged as reliable markers for identifying vowels, achieving similar accuracy rates. Notably, vowel spaces constructed from mean formant frequencies exhibited distinct patterns, thereby accentuating differences between male and female speakers. Random Forest proved to be the top-performing machine learning approach in gender differentiation, consistently delivering high accuracy rates (ranging from 88% to 93%) and Area Under the Curve (AUC) values between 0.92 and 0.96. Remarkably, Random Forest exhibited strong performance across various algorithms, maintaining its effectiveness despite formant variations.

Keywords: Speech linear predictive coding Formant listing Vowel plots Statistical analysis Machine learning


Lossy pole-zero modeling for speech signals

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1996, vol. 4, no. 2, pp. 81-88

Authors: Lim, IT Lee, BG LG Electronics Research Center Seoul South Korea Department of Electronics Engineering Seoul National University Seoul South Korea

This paper presents an acoustic model-based lossy pole-zero modeling for speech signals, which overcomes the limitation in the existing lossless pole-zero model that forced the numerator part of the pole-zero transfer function to be symmetric. We derive the lossy pole-zero model and its transfer function by employing the wave digital filter (WDF) adaptor formulas and by converting the fixed termination value -1 to a loss factor μ(0)(c) ∈ (-1, 1). Then we discuss how to determine the reflection coefficients of the lossy pole-zero model. For this we first employ a well-performing ARMA modeling algorithm for a pole-zero type estimation of the given speech signal and then fit the transfer function of the lossy pole-zero model to that of the ARMA model under a Euclidean cost function. This procedure is demonstrated by an example using the Steiglitz-McBride ARMA estimation method with a synthetic speech signal. The lossy pole-zero modeling yields a new filter structure, namely a three-branch lattice structure, which consists of three lattice branches with the third branch terminated by the loss factor μ(0)(c) ∈ [-1, 1] and with the three branches connected by a three-port wave adaptor characterized by the area ratio σ ∈ [0, 1]. The three-branch lattice structure is a general filter structure which becomes the lossless pole-zero structure when μ(0)(c) = -1 and becomes the existing all-pole lattice structure when σ = 0.

Keywords: Transfer functions Lattices linear predictive coding Digital filters Termination of employment Reflection Cost function Speech analysis Speech synthesis


Digital underwater acoustic voice communications

IEEE JOURNAL OF OCEANIC ENGINEERING, 1996, vol. 21, no. 2, pp. 181-192

Authors: Woodward, B Sari, H Department of Electronic and Electrical Engineering Loughborough University of Technology Leicestershire UK

This paper describes the design of an underwater acoustic diver communication system controlled by a digital signal processor. The speech signal transmission rate is compressed by using linear predictive coding (LPC) and the extracted parameters are transmitted through the water to a synchronized receiver by employing digital pulse position modulation (DPPM). The pulse position in each time frame is estimated by an energy detection and decision algorithm which enables the received LPC parameters to be recovered and used to synthesize the speech signal.

Keywords: Underwater acoustics Underwater communication linear predictive coding Acoustic pulses Pulse modulation Speech synthesis Signal design Digital control Communication system control Control systems
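
The energy-detection decision described in the abstract above can be illustrated with a small Python sketch. This is not the authors' implementation: the function names `dppm_encode` and `dppm_decode`, the rectangular unit pulse, and the `samples_per_slot` parameter are assumptions made for illustration, and the receiver is assumed to be already frame-synchronized, as the abstract states.

```python
def dppm_encode(symbols, slots_per_frame, samples_per_slot=4):
    """Map each symbol k to one frame holding a single pulse in slot k (illustrative)."""
    signal = []
    for k in symbols:
        if not 0 <= k < slots_per_frame:
            raise ValueError("symbol out of range for the frame size")
        for slot in range(slots_per_frame):
            level = 1.0 if slot == k else 0.0
            signal.extend([level] * samples_per_slot)
    return signal


def dppm_decode(signal, slots_per_frame, samples_per_slot=4):
    """Recover symbols by picking the slot with the highest energy in each frame."""
    frame_len = slots_per_frame * samples_per_slot
    symbols = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        energies = []
        for slot in range(slots_per_frame):
            lo = start + slot * samples_per_slot
            chunk = signal[lo:lo + samples_per_slot]
            energies.append(sum(x * x for x in chunk))  # slot energy
        # Decision rule: the pulse is assumed to sit in the most energetic slot.
        symbols.append(max(range(slots_per_frame), key=energies.__getitem__))
    return symbols
```

In a system like the one described, each decoded symbol would correspond to a quantized LPC parameter value; that mapping is outside the scope of this sketch.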


A VLSI chip for isolated speech recognition system

IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 1996, vol. 42, no. 3, pp. 458-467

Authors: Kim, SN Hwang, IC Kim, YW Kim, SW Department of Electronics Engineering Korea University Seoul South Korea

A VLSI processor is designed for small-scale isolated speech recognition applications. It is a dedicated processor which detects endpoints, extracts LPC (linear predictive coefficient) cepstral coefficients from the speech signal, and computes the spectral distances using a dynamic time warping (DTW) technique. The designed chip can recognize 1000 isolated words per second with an average recognition accuracy of 90.3%. It is designed in a 0.8 μm CMOS technology, includes 66,760 gates, and runs with a 10 MHz clock.

Keywords: Very large scale integration Speech recognition CMOS technology Process design linear predictive coding Cepstral analysis Speech processing Signal processing Isolation technology Clocks
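
As a rough illustration of the dynamic time warping step mentioned in the abstract above, the following Python sketch computes a DTW distance between two sequences of feature vectors (for example, cepstral vectors). It is not the chip's algorithm: the function name `dtw_distance`, the Euclidean local distance, and the unconstrained three-way step pattern are assumptions chosen for simplicity.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of equal-dimension vectors."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    # cost[i][j] holds the minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local Euclidean distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A recognizer of this kind would typically compare the input word against each stored template and pick the template with the smallest distance.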
