Search Results - Inner Mongolia University Library

IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW)

Authors: Hsin-Ju Hsieh, Jhih-Hao Jheng, Jung-shan Lin, Jeih-weih Hung (Dept of Electrical Engineering, National Chi Nan University, Taiwan)

ISBN: (print) 9781509020744

In this paper, we propose adopting the linear predictive coding (LPC) algorithm to process the temporal feature streams in speech recognition for noise robustness. Using LPC, an FIR filter can be obtained and applied to the time series of Mel-frequency cepstral coefficients (MFCC), and in general the fast-varying component in the modulation spectrum of MFCC can be attenuated accordingly. We have found that smoothing the MFCC modulation spectrum helps to reduce the noise effect and enhance the noise robustness of MFCC. Experiments conducted on the Aurora-2 connected digit database show that the proposed LPC-wise method improves the recognition accuracy of MVN- and HEQ-preprocessed MFCC under a wide range of noise-corrupted situations.

Keywords: time series cepstral analysis feature extraction FIR filters linear predictive coding modulation signal denoising speech recognition
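As a rough illustration of the LPC step mentioned in the abstract above, the sketch below estimates linear prediction coefficients for a single feature trajectory using the autocorrelation method and the Levinson-Durbin recursion. The function names, the prediction order, and the toy input are assumptions made for illustration. The abstract does not say how the FIR filter is derived from the coefficients, so that step is not shown.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of a finite sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for predictor coefficients a[1..order].

    Convention: x_hat[n] = sum_k a[k] * x[n - k].
    Returns (coefficients, residual energy). Assumes r[0] > 0 (non-silent input).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err


# Hypothetical toy trajectory: a decaying exponential, which a
# first-order predictor with coefficient 0.9 fits almost exactly.
trajectory = [0.9 ** n for n in range(200)]
coeffs, residual = levinson_durbin(autocorrelation(trajectory, 2), 2)
```

For this toy input the first coefficient comes out close to 0.9 and the second close to zero, which is a convenient sanity check before trying real feature streams.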

Resolution Warped Spectral Representation for Low-Delay and Low-Bit-Rate Audio Coder

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, vol. 23, no. 2, pp. 288-299

Authors: Sugiura, Ryosuke; Kamamoto, Yutaka; Harada, Noboru; Kameoka, Hirokazu; Moriya, Takehiro (Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1138656, Japan; NTT Corp, NTT Commun Sci Labs, Atsugi, Kanagawa 2430198, Japan)

We have devised a high-quality frequency-domain audio coder based on a state-of-the-art monaural wide-band coder, aimed at use in low-delay and low-bit-rate conditions. The coder efficiently represents the frequency spectral envelopes of the target signals with low computational complexity using optimally prepared non-negative sparse matrices. The experimental results reveal that this representation has positive effects on the objective and subjective quality of the coder, resulting in quality comparable to that of 3GPP Extended Adaptive Multi-Rate WideBand (AMR-WB+) at the same bit rate, a coder that permits a delay more than four times longer than that of the proposed coder. Consequently, this coder is suitable for applications in mobile communications, which require low delay and low complexity.

Keywords: Audio compression frequency warping line spectrum pairs linear predictive coding low delay non-negative matrix transform coding excitation

A novel approach for artificial bandwidth extension of speech signals by LPC technique over proposed GSM FR NB coder using high band feature extraction and various extension of excitation methods

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2015, vol. 18, no. 1, pp. 57-64

Authors: Bhatt, Ninad; Kosta, Yogeshwar (CKPCET, EC Dept, Surat, Gujarat, India; Marwadi Educ Fdn, Rajkot, Gujarat, India)

This research addresses the issue of wide band (WB) speech transmission (with cut-off frequency f_c = 8 kHz) over a standard narrow band (NB) communication link (supporting a bandwidth of 300-3,400 Hz). The long transition time for the technological upgrade from NB to WB systems eventually led to the development of backward-compatible techniques such as artificial bandwidth extension (ABE), which is capable of providing a bandwidth of 50-7,000 Hz, in turn contributing toll-quality recovered speech at the receiving end. This paper investigates a novel approach to computing high band (HB) features using the linear predictive coding (LPC) technique at the transmitter from a given input WB speech corpus. These encoded features are embedded into the bit stream of the proposed GSM Full Rate 06.10 NB speech coder using a joint source coding and data hiding technique and then transmitted to the receiver. At the receiver, these HB features are extracted using a watermark extraction algorithm to reproduce HB recovered speech, and for this purpose different extension-of-excitation techniques have been adopted and implemented. An e-test bench is created to implement the proposed ABE coder in MATLAB, and a series of simulations is carried out using subjective (mean opinion score, MOS) and objective (perceptual evaluation of speech quality, PESQ) analysis. The results of both analyses indicate a performance improvement of the proposed ABE coder over the legacy GSM 06.10 FR NB coder for various extension-of-excitation techniques.

Keywords: Proposed GSM Full Rate coder Artificial bandwidth extension Subjective analysis Objective analysis linear predictive coding

Possibilities of feedforward multilayer neural network classifier as a detector of pest birds in vineyards

INTERNATIONAL JOURNAL OF ENGINEERING RESEARCH IN AFRICA, 2015, vol. 18, pp. 184-191

Authors: Dolezel, Petr; Mariska, Martin; Taufer, Ivan (Univ Pardubice, Fac Elect Engn & Informat, Pardubice, Czech Republic)

In this paper, the application of an artificial neural network classifier to detect pest birds in agricultural areas, as part of a comprehensive system of protection against vermin, is demonstrated. First, the idea of the whole system is outlined. Then, the method of recognition is described, the process of artificial neural network design is illustrated, and the classifier is validated using data gathered in the fields. Finally, the results are compared to similar works.

Keywords: Artificial neural network signal processing pest birds sound recognition linear predictive coding

Real-time robust formant estimation system using a phase equalization-based autoregressive exogenous model

ACOUSTICAL SCIENCE AND TECHNOLOGY, 2015, vol. 36, no. 6, pp. 478-488

Authors: Oohashi, Hiroki; Hiroya, Sadao; Mochida, Takemi (Nippon Telegraph & Tel Corp, Human Informat Sci Lab, NTT Commun Sci Labs, 3-1 Morinosato, Atsugi, Kanagawa 2430198, Japan)

This paper presents a real-time robust formant tracking system for speech using a real-time phase equalization-based autoregressive exogenous model (PEAR) with electroglottography (EGG). Although linear predictive coding (LPC) analysis is a popular method for estimating formant frequencies, it is known that the estimation accuracy for speech with a high fundamental frequency F0 is degraded, since the harmonic structure of the glottal source spectrum deviates more from the Gaussian noise assumption in LPC as F0 increases. In contrast, PEAR, which employs phase equalization and LPC with an impulse train as the glottal source signal, estimates formant frequencies robustly even for speech with high F0. However, PEAR requires higher computational complexity than LPC. In this study, to reduce this computational complexity, a novel formulation of PEAR was derived, which enabled us to implement PEAR for a real-time robust formant tracking system. In addition, since PEAR requires the timings of glottal closures, a stable detection method using EGG was devised. We developed the real-time system on a digital signal processor and showed that, for both synthesized and natural vowels, the proposed method can estimate formant frequencies more robustly than LPC over a wider range of F0.

Keywords: Formant estimation Online linear predictive coding Phase equalization
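The conventional LPC baseline that the abstract above compares against can be sketched in a few lines: fit an all-pole model, then read a resonance frequency and bandwidth from the pole angle and radius. This is not the PEAR method. It is a minimal order-2 illustration with a single resonance, and the function names, sampling rate, and synthetic test signal are all assumptions.

```python
import cmath
import math


def lpc_order2(x):
    """Order-2 LPC by the autocorrelation method (closed-form 2x2 solve)."""
    n = len(x)
    r0, r1, r2 = (sum(x[i] * x[i + k] for i in range(n - k)) for k in range(3))
    det = r0 * r0 - r1 * r1
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    return a1, a2


def resonance_from_lpc(a1, a2, fs):
    """Map the pole pair of 1 / (1 - a1 z^-1 - a2 z^-2) to (frequency, bandwidth) in Hz."""
    pole = (a1 + cmath.sqrt(a1 * a1 + 4.0 * a2)) / 2.0
    freq = abs(cmath.phase(pole)) * fs / (2.0 * math.pi)
    bandwidth = -math.log(abs(pole)) * fs / math.pi
    return freq, bandwidth


# Hypothetical test signal: impulse response of a two-pole resonator
# at 1000 Hz with pole radius 0.95, sampled at 8 kHz.
fs = 8000.0
c1 = 2.0 * 0.95 * math.cos(2.0 * math.pi * 1000.0 / fs)
c2 = -0.95 ** 2
signal = []
for n in range(400):
    value = 1.0 if n == 0 else 0.0
    if n >= 1:
        value += c1 * signal[n - 1]
    if n >= 2:
        value += c2 * signal[n - 2]
    signal.append(value)

freq, bandwidth = resonance_from_lpc(*lpc_order2(signal), fs)
```

Real speech needs a higher model order and a general polynomial root finder. The order-2 case is used here only because its roots follow from the quadratic formula.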

Automatic characterization and detection of behavioral patterns using linear predictive coding of accelerometer sensor data

Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

Authors: Cheol-Hong Min, Ahmed H. Tewfik (Department of Electrical and Computer Engineering, University of Minnesota Twin-Cities, Minneapolis, MN, USA)

In this study, we aim to automatically detect behavioral patterns of patients with autism. Many stereotypical behavioral patterns may hinder their learning ability as children, and patterns such as self-injurious behaviors (SIB) can lead to critical damage or wounds, as patients tend to repeatedly harm one single location. Our custom-designed accelerometer-based wearable sensor can be placed at various locations on the body to detect stereotypical self-stimulatory behaviors (stereotypy) and self-injurious behaviors of patients with Autism Spectrum Disorder (ASD). A microphone was used to record sounds so that we may understand the surrounding environment, and video provided ground truth for analysis. The analysis was done on four children diagnosed with ASD who showed repeated self-stimulatory behaviors that involve part of the body, such as flapping arms and body rocking, and self-injurious behaviors such as punching their face or hitting their legs. The goal of this study is to devise novel algorithms to detect these events and open the possibility of designing intervention methods. In this paper, we show time domain pattern matching with linear predictive coding (LPC) of the data for detection and classification of these ASD behavioral events. We observe clusters of pole locations from LPC roots to select candidates and apply pattern matching for classification. We also show novel event detection using an online dictionary update method. We show that our proposed method achieves a recall rate of 95.5% for SIB, 93.5% for flapping, and 95.5% for rocking, which is an increase of approximately 5% compared to flapping events detected using wrist-worn sensors in our previous study.

Keywords: Autism Dictionaries Accelerometers Classification algorithms Time domain analysis Real time systems linear predictive coding

LOCALIZATION OF ACOUSTIC SOURCE ON SOLIDS: A LINEAR PREDICTIVE CODING BASED ALGORITHM FOR LOCATION TEMPLATE MATCHING

IEEE International Conference on Acoustics, Speech, and Signal Processing

Authors: XueXin Yap, Andy W.H. Khong, Woon-Seng Gan (Nanyang Technological University, Singapore)

ISBN: (print) 9781424442959

Location template matching (LTM) is a source localization technique in solids that is robust to dispersion and multipath. This is possible since LTM compares the input with a database of signals made at known locations. With this in place, it is possible to employ LTM in situations where the surface of interest takes an irregular shape. However, one of the existing LTM approaches uses cross-correlation to compare the input and the database. If any two of the known locations stored in the database are too close, the cross-correlation method may have difficulty differentiating between signals generated from the neighboring points. To address this, we propose an algorithm that employs linear predictive coding (LPC), which takes into account the dominant frequencies of a received signal. Using this approach, we show that the proposed algorithm is able to improve LTM's source localization accuracy in a real environment in the context of source localization for a touch interface.

Keywords: user interfaces correlation linear predictive coding location template matching

Adaptive selection of lag-window shape for linear predictive analysis in the 3GPP EVS codec

3rd IEEE Global Conference on Signal and Information Processing (GlobalSIP)

Authors: Kamamoto, Yutaka; Moriya, Takehiro; Harada, Noboru (NTT Corp, NTT Commun Sci Labs, Atsugi, Kanagawa, Japan)

ISBN: (print) 9781479975914

Lag windowing has long been used for the auto-correlation method of linear predictive (LP) analysis to prevent possible instability of the synthesis filter with the obtained coefficients. We have investigated the lag-window shape in terms of the trade-offs between stability and the coding efficiency. On the basis of these investigations, we have devised an adaptive selection scheme in which the window shape selected depends on the periodicity of the signal. This scheme has proven to be effective for LP analysis to enhance the coding efficiency in both time and frequency domains in general. This scheme has thus been included in the speech and audio coding schemes of the newly established 3GPP EVS codec standard.

Keywords: lag-window linear predictive coding fundamental frequency pitch-gain 3GPP EVS codec
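To make the idea of lag windowing in the abstract above concrete, the sketch below tapers autocorrelation values with a Gaussian lag window before they would be passed to an LP solver. The Gaussian form is a common choice in speech codecs. The bandwidth values, order, sampling rate, and autocorrelation numbers are placeholders, and the adaptive selection rule of the EVS codec is not given in the abstract, so it is not reproduced here.

```python
import math


def gaussian_lag_window(order, fs, f0):
    """Gaussian lag window w[k] = exp(-0.5 * (2*pi*f0*k/fs)**2) for k = 0..order."""
    return [math.exp(-0.5 * (2.0 * math.pi * f0 * k / fs) ** 2) for k in range(order + 1)]


def apply_lag_window(r, window):
    """Taper autocorrelation values lag by lag; r[0] is left unchanged since w[0] = 1."""
    return [rk * wk for rk, wk in zip(r, window)]


# Two candidate shapes with hypothetical bandwidth parameters:
# a larger f0 decays faster and therefore smooths the LP spectrum more.
mild = gaussian_lag_window(16, 16000.0, 20.0)
strong = gaussian_lag_window(16, 16000.0, 60.0)

# Placeholder autocorrelation values for lags 0..16.
r = [1.0 / (1 + k) for k in range(17)]
tapered = apply_lag_window(r, strong)
```

A scheme that switches between shapes per frame would call `gaussian_lag_window` with different parameters depending on a periodicity measure. Which shape goes with which kind of frame is left open here.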

Singing Voice Identification Using Harmonic Spectral Envelope

IEEE International Conference on Information Processing (ICIP)

Authors: Loni, Deepali Yoginath; Subbaraman, Shaila (DKTE Text & Engn Inst, Dept Elect, Ichalkaranji, India; Walchand Coll Engn, Sangli, India)

ISBN: (print) 9781467377584

The paper presents a novel approach to identifying singers using a harmonic spectral envelope constructed from the pitch of the singing voice. This new representation of the singing voice demonstrates that the harmonic spectral envelope exhibits certain acoustic qualities that can characterize the identity of the singer. Two different approaches are implemented to extract the pitch of the singing voice: the cepstrum technique and linear predictive coding. Ten singers, comprising six male and four female singers, are analyzed in this work. To obtain an accurate analysis and estimation of the acoustics of the singing voice, only a cappella sections are investigated. Along with a discussion of singer identification, the results include a comparison of pitch extraction techniques and gender identification of the singer. We achieve an average accuracy of 77% in identifying the singers, covering a large class of polyphonic recordings of Indian movie songs.

Keywords: Cappella Cepstrum linear predictive coding Pitch Singer Singing Voice Indian songs
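The cepstrum technique named in the abstract above can be sketched as follows: take the log magnitude spectrum of a frame, transform it back, and look for the strongest peak within a plausible pitch-period range. The abstract gives no implementation details, so the function names, search range, spectral floor, and the pulse-train test frame are assumptions. A naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT, adequate for one short illustrative frame."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def cepstral_pitch(frame, fs, fmin=80.0, fmax=400.0, floor=1e-10):
    """Estimate f0 as fs divided by the quefrency of the largest real-cepstrum peak."""
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    cepstrum = [v.real for v in dft(log_mag, inverse=True)]
    lo, hi = int(fs / fmax), int(fs / fmin)  # search range in samples
    peak = max(range(lo, hi + 1), key=lambda q: cepstrum[q])
    return fs / peak


# Hypothetical test frame: a pulse train with a 64-sample period at 8 kHz,
# i.e. a fundamental of 125 Hz.
frame = [1.0 if n % 64 == 0 else 0.0 for n in range(256)]
f0 = cepstral_pitch(frame, 8000.0)
```

In practice the frame would be windowed and an FFT routine would replace the naive transform. The peak-picking logic stays the same.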

Real Time Implementation of MELP Speech Compression Algorithm using Blackfin Processors

9th International Symposium on Image and Signal Processing and Analysis (ISPA)

Authors: Duta, Cristina-Loredana; Gheorghe, Laura; Tapus, Nicolae (Univ Politehn Bucuresti, Dept Comp Sci & Engn, Bucharest, Romania)

ISBN: (print) 9781467380324

A large part of the latest research in speech coding algorithms is motivated by the need for secure military communications that allow effective operation in a hostile environment. Since the bandwidth of the communication channel is a sensitive problem in military applications, low bit-rate speech compression methods are mostly used. Several speech processing applications, such as Mixed Excitation Linear Prediction, are characterized by very strict requirements in power consumption, size, and voltage supply. These requirements are difficult to fulfill, given the complexity and number of functions to be implemented, together with the real time requirement and large dynamic range of the input signals. To meet these constraints, careful optimization should be done at all levels, ranging from the algorithmic level, through system and circuit architecture, to layout and design of the cell library. The key points of this optimization are, among others, the choice of the algorithms, the modification of the algorithms to reduce computational complexity, the choice of a fixed-point arithmetic unit, the minimization of the number of bits required at every node of the algorithm, and a careful match between algorithms and architecture. This paper concentrates on low bit rate speech coding technology, mainly MELP, and addresses the problem of optimizing the MELP program on a Digital Signal Processor platform. The algorithm was ported onto a fixed point DSP, Blackfin 537, and stage by stage optimization was performed to meet the real time requirements. The main functions involved were analysis, parameter encoding, parameter decoding and synthesis. The fixed point source code at the MELP front end was also thoroughly optimized at the C level. Memory optimization techniques such as data placement and caching were also used to reduce the processing time. The results we obtained show that real-time implementations of a speech vocoder based on the MELP standard for low bit rate commu

Keywords: speech signal speech processing linear predictive coding digital signal processor Blackfin processor
