ISBN: (print) 1424405343
This paper proposes a novel, robust fundamental frequency (F0) estimation algorithm based on complex-valued speech analysis of an analytic speech signal. Because an analytic signal provides spectra only over positive frequencies, spectra can be estimated accurately at low frequencies. Consequently, F0 estimation using the residual signal extracted by complex-valued speech analysis is expected to perform better than F0 estimation using the residual signal extracted by conventional real-valued LPC analysis. In this paper, the autocorrelation function weighted by AMDF is adopted as the F0 estimation criterion, and four signals are evaluated for F0 estimation: the speech signal, the analytic speech signal, the LPC residual and the complex LPC residual. The speech signals used in the experiments were corrupted by additive white Gaussian noise at levels of 10, 5, 0 and -5 dB. The experimental results demonstrate that the proposed algorithm based on complex speech analysis can perform better than the other methods in an extremely noisy environment.
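The criterion named in the abstract above, an autocorrelation function weighted by the AMDF, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact weighting, so the form ACF(tau) / (AMDF(tau) + k), the constant `k`, the search range `f_min`/`f_max` and the function name `weighted_acf_f0` are all hypothetical choices, and the input here is a plain real-valued frame rather than a complex LPC residual.

```python
import math


def weighted_acf_f0(x, fs, f_min=60.0, f_max=400.0, k=1.0):
    """Estimate F0 in Hz for one frame x sampled at fs Hz.

    Assumed criterion (not taken from the paper): pick the lag that
    maximises ACF(tau) / (AMDF(tau) + k).
    """
    n = len(x)
    lag_min = int(fs / f_max)                # shortest candidate period
    lag_max = min(int(fs / f_min), n - 1)    # longest candidate period
    best_lag, best_score = lag_min, -math.inf
    for tau in range(lag_min, lag_max + 1):
        m = n - tau                          # number of overlapping samples
        # Unnormalised autocorrelation: peaks at multiples of the period.
        acf = sum(x[i] * x[i + tau] for i in range(m))
        # Average magnitude difference: dips at multiples of the period.
        amdf = sum(abs(x[i] - x[i + tau]) for i in range(m)) / m
        score = acf / (amdf + k)
        if score > best_score:
            best_lag, best_score = tau, score
    return fs / best_lag


# Usage: a clean 200 Hz sine sampled at 8 kHz (period of 40 samples).
frame = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(400)]
f0 = weighted_acf_f0(frame, 8000)
```

Dividing by the AMDF sharpens the autocorrelation peak at the true period, because the AMDF is close to zero exactly where the autocorrelation is largest.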
ISBN: (print) 0769513212
Voice over IP (VoIP) can be used in a wide variety of applications, all with different requirements. We present JVOIPLIB and JRTPLIB, a VoIP library and an RTP library respectively. Together they make it easy to add VoIP to various types of applications. Both libraries are written in an object-oriented style in C++, are open source, and are highly extensible. Several measures have been taken to allow good synchronization between the communicating parties.
The transition from a conventional delivery system to a suitably robust distribution system is emerging in order to adapt to different possible scenarios of future development. This paper gives a particular vision of the future grid and its main requirements. Suitable concepts and technologies that are currently attracting attention have been investigated and are discussed with regard to the stated requirements of sustainability, efficiency, flexibility and intelligence. The active network is then introduced as the backbone of the future power delivery system. In addition, the multi-agent system (MAS) is described as a potential technology for coping with the anticipated challenges of future grid operation. The research described was carried out within the framework of the Electricity Infrastructure of the Future (EIT) project, a cooperation between TU/e, KEMA and ECN (***).
The paper describes a comparison of a C implementation of a linear predictive voice coder (LPC) and an implementation based on Spectron Microsystem's Signal Processing Operating System (SPOX). The hardware platform was a Texas Instruments TMS320C30 Evaluation Module. The SPOX and C implementations were compared based on execution time, ease of program development and maintenance, and portability to different hardware platforms. The vocoder algorithms and the results of the comparison of both implementations are presented.
The main aim of this paper is to compare system lifetimes in diverse scenarios, based on our new energy dissipation model for the body sensor network, and to maximize the system lifetime of the sensor nodes when the body sensors communicate with the personal communication unit. Ultra-low energy consumption is currently a major challenge for medical applications. This work studies and analyzes transceiver energy consumption for different compression algorithms and, based on these calculations, selects the best technique, such as LPC, for energy saving. Another important part of this work is the formulation of a linear programming problem whose objective is to maximize the system lifetime, defined as the time until the first node runs out of battery. Maximum system lifetimes are calculated with a MATLAB optimization technique, with and without an efficient compression algorithm such as LPC, in various environments. The results show that the maximum system lifetimes obtained in the different scenarios with an efficient compression technique such as LPC are better than those obtained without compression.
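The lifetime definition used in the abstract above (the time until the first node runs out of battery) can be illustrated with a short sketch. It is not the paper's linear program or energy dissipation model: the per-node power split into `sense_w` and `radio_w`, the assumption that radio power scales linearly with `compression_ratio` (compressed size divided by original size), and every number in the usage lines are hypothetical, and the cost of running the compressor is ignored.

```python
def system_lifetime(battery_j, sense_w, radio_w, compression_ratio=1.0):
    """Return the time in seconds until the first node depletes its battery.

    battery_j: initial energy of each node in joules.
    sense_w, radio_w: hypothetical average sensing and radio power in watts.
    compression_ratio: assumed linear scaling of the radio power.
    """
    lifetimes = [
        energy / (sense + radio * compression_ratio)
        for energy, sense, radio in zip(battery_j, sense_w, radio_w)
    ]
    # The system is considered dead as soon as any single node is dead.
    return min(lifetimes)


# Usage with made-up numbers: two nodes, with and without compression.
uncompressed = system_lifetime([100.0, 100.0], [1.0, 1.0], [4.0, 9.0])
compressed = system_lifetime([100.0, 100.0], [1.0, 1.0], [4.0, 9.0], 0.5)
```

Because the lifetime is a minimum over nodes, any saving only helps if it reaches the node that is the bottleneck, which is why the radio-heavy node dominates both results here.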
A two-sided linear prediction (TSLP) model has been shown to have higher prediction gain than the conventional linear prediction (LPC) model [David and Ramamurthi, 1991], while requiring fewer coefficients. Unfortunately, speech synthesis cannot use the TSLP model directly because it needs future samples, which are not available during synthesis. Autoregressive spectral matching (ARSM) is proposed to render the TSLP model suitable for speech synthesis. A vector sum excitation method is used to generate the excitation for the new model, and its performance is comparable to that of standard VSELP.
This paper describes a speaker-independent isolated word recognition algorithm for telephone speech and its recognition performance. The recognition algorithm consists of two processes: dynamic time warping and statistical word discrimination. In the first process, input speech is compared with each word template using the dynamic time warping technique. Multiple word templates are used to deal with speech variations among speakers, where each word template is represented by a sequence of phoneme-like templates. To attain high recognition ability, a new technique for generating word templates is proposed. In the second process, statistical word discrimination is carried out for word candidates that have relatively low reliability in the first process. Discrimination functions are calculated based on statistics of the transition tendencies of speech characteristics between adjacent frames, and the final word decision is made. The system was trained using utterances from 1305 speakers and tested with utterances from 259 speakers. An average recognition rate of 96.5% was obtained for a 16-word Japanese vocabulary set.
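The first process described above, comparing the input with several templates per word by dynamic time warping, can be sketched as follows. This is a textbook DTW under simplifying assumptions rather than the paper's algorithm: each frame is a single number instead of a feature vector, the local cost is an absolute difference, and the names `dtw_distance` and `recognise` are hypothetical. The phoneme-like templates and the statistical discrimination stage are not modelled.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] is the cheapest cost of aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a only, in b only, or in both.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def recognise(utterance, templates):
    """Return the word whose closest template has the smallest DTW distance.

    templates maps each word to a list of template sequences, so several
    templates per word can cover variation among speakers.
    """
    return min(
        templates,
        key=lambda word: min(dtw_distance(utterance, t) for t in templates[word]),
    )


# Usage with toy one-dimensional "features".
templates = {"ichi": [[1, 2, 3], [1, 3]], "ni": [[5, 5, 5]]}
word = recognise([1, 2, 2, 3], templates)
```

The warping lets the repeated frame in the input align with a single template frame at no cost, which is what makes the comparison tolerant of differences in speaking rate.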
Line spectrum pair (LSP) representation of linear predictive coding (LPC) parameters is widely used in speech coding applications. An efficient method for LPC to LSP conversion is Kabal's method. In this method the LSPs are the roots of two polynomials $P'_p(x)$ and $Q'_p(x)$, and are found by a zero crossing search followed by successive bisections and interpolation. The precision of the obtained LSPs is higher than required by most applications, but the number of bisections cannot be decreased without compromising the zero crossing search. In this paper, it is shown that, in the case of 10th-order LPC, five intervals, each containing only one zero crossing of $P'_{10}(x)$ and one zero crossing of $Q'_{10}(x)$, can be calculated, avoiding the zero crossing search. This allows a trade-off between LSP precision and computational complexity, resulting in considerable computational savings.
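The search procedure described above (a zero crossing search, then successive bisections, then interpolation) can be sketched generically. This is a minimal illustration, not Kabal's implementation or the paper's interval construction: `f` is a stand-in for one of the polynomials, and the grid size `steps`, the default of four bisections and the function names are assumptions.

```python
def refine(f, a, b, bisections=4):
    """Narrow a bracketing interval [a, b] by bisection, then interpolate."""
    fa, fb = f(a), f(b)
    for _ in range(bisections):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0.0:      # sign change lies in the left half
            b, fb = mid, fm
        else:                   # sign change lies in the right half
            a, fa = mid, fm
    if fa == fb:                # degenerate case, avoid dividing by zero
        return 0.5 * (a + b)
    # Linear interpolation between the final bracket ends.
    return a - fa * (b - a) / (fb - fa)


def zero_crossings(f, lo=-1.0, hi=1.0, steps=100, bisections=4):
    """Scan [lo, hi] on a uniform grid and refine every sign change found."""
    roots = []
    width = (hi - lo) / steps
    for s in range(steps):
        a = lo + s * width
        b = a + width
        if f(a) * f(b) < 0.0:
            roots.append(refine(f, a, b, bisections))
    return roots


# Usage: a toy quadratic with roots at -0.57 and 0.31.
roots = zero_crossings(lambda x: (x - 0.31) * (x + 0.57))
```

If bracketing intervals are known in advance, as the abstract proposes for 10th-order LPC, `refine` can be called on them directly and the grid scan in `zero_crossings` is skipped, so the number of bisections becomes a free precision setting.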
Several pre-processing algorithms modify the residual speech signal to facilitate efficient estimation of speech model parameters. This, however, can result in misalignment between the modified residual signal and the time-variant linear prediction (LP) filter used during the synthesis stage. The resulting misalignment may cause audible artifacts, particularly at onsets, when the frequency response of successive LP filters changes rapidly. We propose a new solution that controls the LP filter gain at the subframe level. The technique is applied before and after time modification of the speech and is therefore called preanalysis and post-processing. A pitch smoothing technique is used to illustrate the effect of the proposed technique.