ISBN: (print) 1424405343
This paper proposes a novel, robust fundamental frequency (F0) estimation algorithm based on complex-valued speech analysis of an analytic speech signal. Since an analytic signal provides spectra only over positive frequencies, spectra can be estimated accurately at low frequencies. Consequently, F0 estimation using the residual signal extracted by complex-valued speech analysis is expected to perform better than F0 estimation using the residual signal extracted by conventional real-valued LPC analysis. In this paper, the autocorrelation function weighted by the average magnitude difference function (AMDF) is adopted as the F0 estimation criterion, and four signals are evaluated for F0 estimation: the speech signal, the analytic speech signal, the LPC residual, and the complex LPC residual. The speech signals used in the experiments were corrupted by additive white Gaussian noise at noise levels of 10, 5, 0, and -5 dB. The experimental results demonstrate that the proposed algorithm based on complex speech analysis can perform better than the other methods in an extremely noisy environment.
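As a rough illustration of an AMDF-weighted autocorrelation criterion, the sketch below scores each candidate lag by the autocorrelation divided by the AMDF plus a small constant, and picks the lag with the highest score. The abstract does not give the exact weighting, so this particular form, the constant `k`, the search range, and all function names are assumptions. It runs on a clean synthetic real-valued frame, not on the complex LPC residual the paper evaluates.

```python
import math

def acf(x, lag):
    """Autocorrelation at one lag, normalised by the frame length."""
    n = len(x)
    return sum(x[i] * x[i + lag] for i in range(n - lag)) / n

def amdf(x, lag):
    """Average magnitude difference function at one lag."""
    n = len(x)
    return sum(abs(x[i] - x[i + lag]) for i in range(n - lag)) / n

def estimate_f0(x, fs, fmin=50.0, fmax=400.0, k=1.0):
    """Pick the lag maximising acf / (amdf + k); k is an assumed stabiliser."""
    lo = int(fs / fmax)  # shortest candidate period in samples
    hi = int(fs / fmin)  # longest candidate period in samples
    best_lag = max(range(lo, hi + 1),
                   key=lambda lag: acf(x, lag) / (amdf(x, lag) + k))
    return fs / best_lag

# Synthetic voiced-like frame: 100 Hz fundamental plus its second harmonic.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 200 * n / fs)
         for n in range(400)]
f0 = estimate_f0(frame, fs)
```

The AMDF has a deep minimum at the true period while the autocorrelation has a peak there, so dividing one by the other sharpens the peak relative to using either function alone.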
Voice biometrics is an economical method of machine-based person authentication, owing to the availability of low-cost, high-power computers. In this paper, we investigate the problem of spectral resolution in female speech for speaker identification. A speaker recognition system is presented to compare the relative performance of LP-based features, such as LPC and LPCC, and filterbank-based features, such as Mel-frequency cepstral coefficients (MFCC), for the identification of female speakers. Results are shown for a database collected in Bengali from 15 female speakers.
This paper addresses the reduction of the computational complexity of a speech codec. A linear predictive coding procedure is developed that can be implemented with number theoretic transforms. Using the Fermat number transform can significantly reduce the cost of implementing the linear predictive algorithm on a digital signal processor.
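The abstract does not say which step of the LPC procedure is mapped onto the transform, so the following is only a sketch of the general mechanism: an exact circular convolution computed with a Fermat number transform modulo 257 = 2^8 + 1, using 2 as the root of unity (2 has multiplicative order 16 modulo 257). The transform length, the modulus, and the function names are illustrative choices, and the transform is written in its direct O(N^2) form for clarity instead of a fast algorithm.

```python
P = 257     # Fermat number F3 = 2**8 + 1
N = 16      # transform length; 2 has multiplicative order 16 modulo 257
ALPHA = 2   # root of unity, so twiddle factors are powers of two

def fnt(x, root=ALPHA):
    """Direct-form Fermat number transform of a length-N integer sequence."""
    return [sum(x[n] * pow(root, n * k, P) for n in range(N)) % P
            for k in range(N)]

def ifnt(X):
    """Inverse transform: use the inverse root and scale by N^-1 mod P."""
    inv_n = pow(N, -1, P)
    y = fnt(X, pow(ALPHA, -1, P))
    return [(v * inv_n) % P for v in y]

def circular_convolve(a, b):
    """Length-N circular convolution, exact while true values stay below P."""
    A, B = fnt(a), fnt(b)
    return ifnt([(u * v) % P for u, v in zip(A, B)])

a = [1, 2, 3] + [0] * 13
b = [4, 5] + [0] * 14
result = circular_convolve(a, b)
```

Because the root is a power of two, the multiplications inside the transform reduce to bit shifts on hardware, and the arithmetic is exact with no rounding error, which is the usual motivation for using such transforms on fixed-point processors.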
As wireless systems evolve toward supporting a wide array of services, including traditional voice service, using packet-switched transport, it becomes increasingly important to assess the impact of packet-switched transport protocols on voice quality. In this article we present a tutorial on voice quality evaluation for wireless packet-switched systems. We introduce an evaluation methodology that combines elementary objective voice quality metrics with a frame synchronization mechanism. The methodology allows networking researchers to conduct effective and accurate quality evaluation of packet voice. To illustrate the use of the described evaluation methodology and the interpretation of the results, we conduct a case study of the impact of robust header compression (ROHC) on the voice quality achieved with real-time transmission of GSM-encoded voice over a wireless link.
This paper sheds light on the digital signal processing (DSP) roots of a modern concept, voice over IP (VoIP). An example is also provided in which developments in DSP, speech coding in particular, had a profound impact on the early development of the ARPANET, the ancestor of the Internet. The author shows how packet speech, recently rediscovered and made popular as VoIP, was first successfully demonstrated in 1974 on the ARPANET and how the Internet protocol (IP) emerged largely as a result of that effort.
This paper describes a novel algorithm for transforming linear prediction coefficients (LPCs) to line spectral frequencies (LSFs) and line spectral pairs (LSPs), which are used by most speech processing applications. The symmetric and antisymmetric polynomials (SAPS) for LSPs/LSFs, corresponding to the LPC polynomial, are first multiplexed into a single real sequence. The required samples of the SAPS correspond to the DFT of the obtained real sequence. The proposed algorithm is referred to as PMLS, as it is based on the Plus Minus (PM) algorithm, an FFT which efficiently computes the DFT of a real sequence for the positive frequency interval only. The samples of the SAPS are used efficiently to compute a single parameter, from which the LSFs and LSPs are computed independently using some interpolation principles. This interpolation exploits the available samples of the SAPS and does not require samples at a finer resolution. The efficiency of PMLS is illustrated with the help of some examples. Some guidelines for an optimal implementation of PMLS on fixed-point digital signal processors (DSPs) are also presented.
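For orientation, the sketch below shows the conventional route from LPCs to LSFs that the abstract builds on, not the PMLS algorithm itself: form the symmetric and antisymmetric polynomials, sample them on the unit circle, and locate their zeros by sign change plus linear interpolation. It evaluates the samples directly instead of using the PM FFT, and the function name, grid size, and coefficient convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are assumptions.

```python
import cmath
import math

def lpc_to_lsf(a, grid=512):
    """Return LSFs in radians for a = [1, a1, ..., ap] (assumed convention)."""
    p = len(a) - 1
    ext = list(a) + [0.0]                       # A(z) padded to degree p + 1
    rev = ext[::-1]                             # z^-(p+1) * A(1/z)
    sym = [u + v for u, v in zip(ext, rev)]     # symmetric polynomial
    anti = [u - v for u, v in zip(ext, rev)]    # antisymmetric polynomial

    def profile(coeffs, w, antisymmetric):
        # Removing the linear-phase factor leaves a purely real value for
        # the symmetric polynomial and a purely imaginary one otherwise.
        val = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(coeffs))
        val *= cmath.exp(1j * w * (p + 1) / 2)
        return val.imag if antisymmetric else val.real

    # Interior grid points only, so the trivial zeros at 0 and pi are skipped.
    ws = [math.pi * (k + 0.5) / grid for k in range(grid)]
    lsf = []
    for coeffs, antisymmetric in ((sym, False), (anti, True)):
        vals = [profile(coeffs, w, antisymmetric) for w in ws]
        for w0, w1, v0, v1 in zip(ws, ws[1:], vals, vals[1:]):
            if v0 * v1 < 0:                     # sign change brackets a zero
                lsf.append(w0 - v0 * (w1 - w0) / (v1 - v0))
    return sorted(lsf)

lsf = lpc_to_lsf([1.0, -1.0, 0.5])
```

For this second-order example the two polynomials factor by hand, giving zeros at cos(w) = 0.75 and cos(w) = 0.25, so the returned values should lie close to acos(0.75) and acos(0.25).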
We present a new low-complexity algorithm for hyperspectral image compression that uses linear prediction in the spectral domain. We introduce a simple heuristic to estimate the performance of the linear predictor from a pixel's spatial context, and a context modeling mechanism with one-band look-ahead capability, which improves the overall compression with marginal use of additional memory. The proposed method is suitable for on-board spacecraft implementation, where limited hardware and low power consumption are key requirements. Finally, we present a least-squares optimized linear prediction technique that achieves better compression on data cubes acquired by the NASA JPL Airborne Visible/Infrared Imaging Spectrometer (AVIRIS).
This paper describes the pioneering research in the field of speech technology by James L. Flanagan, 2005 IEEE Medal of Honor awardee. Flanagan's work with speech coding heralded a series of advances over the years, including a currently favored technique, linear predictive coding. After graduating from Mississippi State as an electrical engineering major, Flanagan accepted a graduate assistantship in MIT's acoustics lab, which led to his seminal research in voice coding. Flanagan then worked at Bell Telephone Laboratories where he would spend the next 33 years. He climbed steadily up the ranks at Bell Labs, eventually becoming director of the Information Principles Research Laboratory. Among the projects that Flanagan was deeply involved in were the development of automatic speech recognition systems, voice mail, artificial larynx, and packet-switched voice technology.
ISBN: (print) 0780391543
This paper investigates a normalized cross-correlation-based doubletalk detector for acoustic echo cancellers. A novel low-complexity version is presented that is suitable for implementation on IP-enabled telephones employing LPC-based speech coders. In particular, the algorithm obtains a decorrelated input signal and decorrelation filter coefficients directly from the speech decoder. The proposed algorithm is implemented using ITU G.729, and simulation results are collected for a typical room. Calibration data are obtained and shown to be independent of the speech input signal. It is also shown that the proposed algorithm has the same performance as a full-complexity version employing the same decorrelation filter order.
In this paper, we propose a new blind speech authentication method to verify the authenticity and integrity of speech. In this method, linear predictive coding (LPC) and least significant bit (LSB) steganography are combined: the LPC prediction error (LPCPE) is used as an invariable feature of the speech, and these features are inserted into the least significant bits of the speech samples. When verifying the speech, we extract the inserted features and compare them with the recalculated LPCPE. Experimental results show that this method can detect not only tampering in speech but also the position and region of the tampering.
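The embed-then-recompute idea can be sketched on integer samples as follows. This is a toy under stated assumptions, not the paper's scheme: a fixed first-order predictor stands in for real LPC analysis, the feature is computed on samples with their LSB cleared so that embedding does not disturb it, and the frame size, bit count, and all function names are hypothetical.

```python
import math

FRAME = 16  # samples per frame (hypothetical)
BITS = 8    # feature bits embedded per frame (hypothetical)

def feature(frame):
    """Stand-in for LPCPE: summed first-order prediction error magnitude,
    computed on LSB-cleared samples and reduced to BITS bits."""
    base = [s & ~1 for s in frame]
    err = sum(abs(base[i] - base[i - 1]) for i in range(1, len(base)))
    return (err >> 1) % (1 << BITS)

def embed(samples):
    """Write each frame's feature into the LSBs of its first BITS samples."""
    out = list(samples)
    for start in range(0, len(out) - FRAME + 1, FRAME):
        f = feature(out[start:start + FRAME])
        for b in range(BITS):
            out[start + b] = (out[start + b] & ~1) | ((f >> b) & 1)
    return out

def verify(samples):
    """Return the indices of frames whose stored and recomputed features differ."""
    bad = []
    for idx, start in enumerate(range(0, len(samples) - FRAME + 1, FRAME)):
        frame = samples[start:start + FRAME]
        stored = sum((frame[b] & 1) << b for b in range(BITS))
        if stored != feature(frame):
            bad.append(idx)
    return bad

speech = [int(1000 * math.sin(0.1 * n)) for n in range(64)]
marked = embed(speech)
tampered = list(marked)
tampered[20] += 40          # modify one sample inside frame 1
flagged = verify(tampered)
```

Verification needs only the received samples, which is what makes the scheme blind, and the frame index returned by `verify` localises the tampering. A feature this simple can miss some modifications, so it illustrates the data flow only.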