This paper describes how the selection of parameters for the variance fractal dimension (VFD) multiscale time-domain algorithm can amplify the fractal dimension trajectory obtained for a natural-speech waveform in the presence of ambient noise. The technique is based on the variance fractal dimension trajectory (VFDT) algorithm, which is used to detect not only the external boundaries of an utterance but also its internal pauses representing unvoiced speech. The VFDT algorithm can also amplify internal features of phonemes. This fractal feature amplification is accomplished when the time increments are selected in a dyadic manner rather than in a unit-distance sequence. The amplified trajectories for different phonemes are more distinct, thus providing a better characterization of the individual segments in the speech signal. This approach is superior to other energy-based boundary-detection techniques. These observations are based on extensive experimental results on speech utterances digitized at 44.1 kilosamples per second, with 16 bits per sample.
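The difference between dyadic and unit-distance increments can be illustrated with a minimal sketch of a variance-based fractal dimension estimate. It assumes the usual power-law relation Var[x(t + Δt) − x(t)] ∝ Δt^(2H) and D = 2 − H for a time series; the function names, the lag lists, the window sizes, and the use of overlapping increments are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def increment_variance(x, lag):
    """Sample variance of the increments x[i + lag] - x[i] (overlapping, a simplification)."""
    diffs = [x[i + lag] - x[i] for i in range(len(x) - lag)]
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1)

def variance_fractal_dimension(x, lags):
    """Estimate D = 2 - H from the log-log slope of increment variance versus lag.

    Assumes the increment variance is non-zero for every lag (a constant
    signal would make the logarithm undefined).
    """
    log_lag = [math.log(k) for k in lags]
    log_var = [math.log(increment_variance(x, k)) for k in lags]
    n = len(lags)
    mean_x = sum(log_lag) / n
    mean_y = sum(log_var) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(log_lag, log_var))
             / sum((a - mean_x) ** 2 for a in log_lag))
    hurst = slope / 2.0      # Var ~ lag^(2H)
    return 2.0 - hurst       # D = E + 1 - H with embedding dimension E = 1

def vfd_trajectory(x, window, step, lags):
    """Slide a window over the signal and estimate the dimension in each position."""
    return [variance_fractal_dimension(x[s:s + window], lags)
            for s in range(0, len(x) - window + 1, step)]

# Hypothetical lag choices: dyadic spacing versus a unit-distance sequence.
DYADIC_LAGS = [1, 2, 4, 8, 16]
UNIT_LAGS = [1, 2, 3, 4, 5]

if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
    print("dyadic:", vfd_trajectory(noise, 1024, 512, DYADIC_LAGS)[:3])
    print("unit:  ", vfd_trajectory(noise, 1024, 512, UNIT_LAGS)[:3])
```

With the same number of lags, the dyadic list spans a much wider range of scales than the unit-distance list, which is one plausible reason the resulting trajectories differ; the sketch does not reproduce the amplification reported in the abstract.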
This paper proposes a new algorithm named multi-twin support vector machines (MTSVM) and studies its application to speaker recognition. MTSVM seeks a nonparallel plane for every class such that the data of that class lie close to the plane and the data of the other classes lie as far from it as possible. MTSVM differs from the usual one-versus-all multi-class twin support vector machine (TSVM), in which the constraints from the other classes are placed in a single quadratic programming problem (QPP); in MTSVM, the constraint from each other class acts on the QPP separately. A series of experiments on artificial data and on the Chains Corpus shows the feasibility and validity of MTSVM for speaker recognition.
The ASAROME project (autonomous sailing robot for oceanographic measurements) is developing a small autonomous sailboat to make measurements and observations in the marine environment over long periods. In this project, perception plays an important role by providing estimates of the surface wind speed, the state of the sea surface, and the rate of precipitation in wet weather. In this paper, the unknown signals are first encoded with different codes (ERB, MFCC, LPC, LPCC). The coded signals are then modeled by two different classification methods: predictive and k-nearest neighbor. The final part of the system uses local and global decisions to recognize the class of the unknown signal. Experiments are conducted to compare the results obtained with the different encodings. Our results show that MFCC is not the ideal approach for the recognition of underwater audio signals, whereas LPCC appears to be a better candidate.
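The k-nearest-neighbor branch combined with local and global decisions might look like the following sketch. It assumes that a "local" decision labels each encoded frame and that the "global" decision is a majority vote over the frame labels; the abstract does not define either term, and the function names, class labels, and two-dimensional feature vectors are hypothetical.

```python
import math
from collections import Counter

def knn_label(train, query, k=3):
    """Local decision: majority label among the k training vectors nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def classify_signal(train, frames, k=3):
    """Global decision: majority vote over the per-frame (local) labels."""
    local = [knn_label(train, frame, k) for frame in frames]
    return Counter(local).most_common(1)[0][0], local

# Toy training set: (feature vector, class label). Real features would be
# ERB, MFCC, LPC, or LPCC vectors rather than 2-D points.
TRAIN = [
    ((0.0, 0.0), "rain"), ((0.2, 0.1), "rain"), ((0.1, 0.3), "rain"),
    ((5.0, 5.0), "wind"), ((5.2, 4.9), "wind"), ((4.8, 5.1), "wind"),
]

if __name__ == "__main__":
    frames = [(5.1, 5.0), (4.9, 5.2), (0.1, 0.1), (5.0, 4.8)]
    print(classify_signal(TRAIN, frames))
```

The vote lets a few misclassified frames be outweighed by the rest of the signal, which is the usual motivation for separating the two decision levels.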
ISBN: 9781424456499 (print)
Conventional statistical single-channel noise reduction methods perform poorly in highly non-stationary environments. In contrast, model-based algorithms have the potential to deal with such adverse conditions. In this paper, we focus on codebook-based algorithms, which use trained codebooks that store typical speech and noise spectral shapes. Speech and noise estimates are determined independently for each frame, which makes it possible to handle highly non-stationary noise. Incorporating memory can improve the performance further. We present elaborated models for memory modeling and provide a preliminary validation.
ISBN: 9781605583518 (print)
In this paper, a new robust feature extraction method for speech recognition is proposed. The features are obtained from cepstral mean normalized, reduced-order linear predictive coding (LPC) coefficients derived from speech frames decomposed using the discrete wavelet transform (DWT). The literature assumes that a speech frame of 10 ms to 30 ms is stationary; in practice, however, different parts of the speech signal may convey different amounts of information, so the frame may not be perfectly stationary. LPC coefficients derived from the wavelet-decomposed subbands of a speech frame provide a better representation than modeling the frame directly. Experiments show that the proposed approach provides effective (better recognition rate), efficient (reduced feature vector dimension), and robust features. A speech recognition system using the continuous density hidden Markov model (CDHMM) has been implemented. The proposed algorithm is evaluated using the NIST TI-46 isolated-word database.
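The feature pipeline described above (wavelet decomposition, low-order LPC per subband, mean normalization) can be sketched roughly as follows. This is an illustration under stated assumptions: a single-level Haar transform stands in for whichever wavelet the authors used, LPC is computed with the standard autocorrelation method and Levinson-Durbin recursion, and plain mean subtraction across frames stands in for the normalization step. All function names are hypothetical.

```python
import math

def haar_dwt(frame):
    """One level of the Haar DWT; returns (approximation, detail) subbands."""
    s = math.sqrt(2.0)
    approx = [(frame[i] + frame[i + 1]) / s for i in range(0, len(frame) - 1, 2)]
    detail = [(frame[i] - frame[i + 1]) / s for i in range(0, len(frame) - 1, 2)]
    return approx, detail

def autocorrelation(x, max_lag):
    """Autocorrelation values r[0..max_lag] of the sequence x."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def lpc(x, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum(a[k] * x[n - k]),
    obtained by the Levinson-Durbin recursion."""
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:          # silent subband: leave remaining coefficients at zero
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:]

def subband_lpc_features(frame, order):
    """Concatenate low-order LPC coefficients of the two Haar subbands."""
    approx, detail = haar_dwt(frame)
    return lpc(approx, order) + lpc(detail, order)

def mean_normalize(features):
    """Subtract the per-dimension mean taken over all frames."""
    n = len(features)
    means = [sum(col) / n for col in zip(*features)]
    return [[v - m for v, m in zip(row, means)] for row in features]
```

Each subband is shorter and spectrally narrower than the full frame, so a lower prediction order per subband may suffice, which is consistent with the reduced feature dimension mentioned in the abstract.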
In this paper, the wavelet transform (WT) in its two forms, continuous and discrete, is used to build a text-dependent, noise-robust speaker recognition system. The research aims at highly accurate identification from the speech signal, which is difficult to handle because it is non-stationary. Three methods are used to extract the essential speaker features, based on the continuous wavelet transform, the discrete wavelet transform, and linear prediction coefficients (LPC). To improve the identification rate, three measures are used: percentage rms difference (PRD), correlation coefficient (CC), and statically deformation determination coefficient (SDDC). A 95% identification rate is achieved. The presented system relies on multistage feature extraction because of its better accuracy. The system tracks features well even when the tested signals are very noisy, with an SNR of -32 dB. This is attributed to the multistage, wavelet-based feature tracking, which is suitable for non-stationary signals.
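Two of the three measures named above have widely used textbook definitions, sketched here for reference. The abstract does not give the exact formulas used by the authors, so the definitions below are the common ones (PRD as 100 times the root of the squared-error energy over the reference energy, CC as the Pearson coefficient), and the `identify` helper and its speaker templates are hypothetical. SDDC is omitted because its definition is not given.

```python
import math

def prd(reference, test):
    """Percentage rms difference between a reference and a test feature vector."""
    num = sum((r - t) ** 2 for r, t in zip(reference, test))
    den = sum(r * r for r in reference)
    return 100.0 * math.sqrt(num / den)

def correlation_coefficient(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def identify(enrolled, features):
    """Return the enrolled speaker whose template has the lowest PRD to features."""
    return min(enrolled, key=lambda name: prd(enrolled[name], features))

if __name__ == "__main__":
    templates = {"speaker_a": [1.0, 2.0, 3.0], "speaker_b": [3.0, 2.0, 1.0]}
    print(identify(templates, [1.1, 2.0, 2.9]))
```

A lower PRD and a CC closer to 1 both indicate a closer match, so a system could combine them, for example by requiring agreement between the two before accepting an identification.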
This paper describes the scan test challenges and techniques used in the Godson-3 microprocessor, a scalable multicore processor that is based on the SMOC (scalable mesh of crossbar) on-chip network and targets high-end applications. Advanced techniques are adopted to achieve a scalable, low-power, and low-cost scan architecture despite limited I/O resources and a large transistor count. To achieve scalable and flexible test access, an elaborate TAM (test access mechanism) is implemented that supports multiple test instructions and test modes. Taking advantage of the multiple cores embedded in the processor, scan partitions are employed to reduce test power and test time, and test compression with a compression ratio of more than 10X is used to decrease the scan chain length. To further decrease test time, a data-synchronous comparator (DSC) is proposed for comparing the scan responses of identical cores.
In Malaysia, Tenaga Nasional Berhad (TNB), the power utility company, has two methods for collecting metering data from its customers. For ordinary customers (OPC, ordinary power customer), it uses the conventional method of sending meter readers every month to read the meters installed at the customers' premises. For large or industrial customers (LPC, large power customer), it uses automatic meter reading (AMR) technology based on the global system for mobile communication (GSM). Here a GSM module is attached to each LPC meter, and readings can be taken automatically without visiting the customer's site. For OPC customers, the conventional method faces some issues that need to be addressed. In this paper we describe a proposed hybrid AMR system that combines ZigBee and GSM technology. In the proposed system, a ZigBee module is attached to the meter through an interface board, and the data collector is connected to the central computer via GSM. The system suits Malaysian conditions, where GSM-based AMR is already implemented for LPC. With this system, TNB can reduce the cost of meter reading and provide better services to its customers.
Configurable coprocessors have been an active research area for some time. The limited instruction word length and the limited number of operands in a single instruction have become a potential performance bottleneck for traditional SIMD extensions. In this paper, we use LEON-2 as the host platform and present a novel low-cost architecture with extended shadow_f registers. In each extended instruction, some shadow_f registers are introduced to hold a copy of the results received in the writeback stage, which efficiently reduces the data transfer time between LEON-2 and the coprocessor. Analysis of the proposed architecture shows that only partial replication of the whole register file is needed to mitigate the bandwidth limitation. The proposed vector arithmetic unit also proves highly compatible with the calculation patterns required by the integer version of the MELP algorithm. An implementation of our approach on a Stratix II FPGA shows a promising speedup (up to 3.85X for some dominant kernels) with only a 16% area increase.
Voice conversion algorithms aim to provide a high level of similarity to the target voice at an acceptable level of quality. The main objective of this paper is to build a nonlinear relationship between the acoustic feature parameters of the source and target speakers using nonlinear canonical correlation analysis (NLCCA) based on a joint Gaussian mixture model. Speaker individuality transformation is achieved mainly by altering the vocal tract characteristics represented by line spectral frequencies (LSF). To make the transformed speech sound more like the target voice, prosody modification is applied through residual prediction. Both objective and subjective evaluations were conducted. The experimental results demonstrate that the proposed algorithm is effective and outperforms the conventional conversion method based on minimum mean square error (MMSE) estimation.