A classifier for utterance rejection in a hidden Markov model (HMM) based speech recognizer is presented. This classifier, termed the two-pass classifier, is a postprocessor to the HMM recognizer, and consists of a two-stage discriminant analysis. The first stage employs the generalized probabilistic descent (GPD) discriminative training framework, while the second stage performs linear discrimination combining the output of the first stage with HMM likelihood scores. In this fashion, the classification power of the HMM is combined with that of the GPD stage, which is specifically designed for keyword/nonkeyword classification. Experimental results show that, on two separate databases, the two-pass classifier significantly outperforms a single-pass classifier based solely on the HMM likelihood scores.
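The second stage described above, a linear discriminant over the first-stage output and the HMM likelihood scores, might be sketched as follows. This is only an illustration: the function name, the weights, the bias, and the zero decision threshold are all assumptions, and the abstract does not say how the discriminant is trained or how many HMM scores it uses.

```python
def second_stage_accept(gpd_output, hmm_scores, weights, bias=0.0):
    """Accept (True) or reject (False) an utterance with a linear discriminant.

    gpd_output : scalar output of the first (GPD-trained) stage
    hmm_scores : list of HMM likelihood scores for the same utterance
    weights    : one weight per feature, first-stage output first (hypothetical values)
    bias       : offset of the decision boundary (hypothetical value)
    """
    features = [gpd_output, *hmm_scores]
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    # Linear discriminant: weighted sum of the features plus a bias.
    decision = sum(w * x for w, x in zip(weights, features)) + bias
    # Assumed convention: positive side of the boundary means "keyword".
    return decision > 0.0


# Illustrative call with made-up scores and weights.
accepted = second_stage_accept(1.0, [-2.0], weights=[1.0, 0.1])
```

In practice the weights would be estimated from labeled keyword and nonkeyword utterances rather than set by hand.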
This paper compares system lifetimes in diverse scenarios using a new energy dissipation model for the body sensor network, and maximizes the system lifetime of the sensor nodes during communication among the body sensors and the personal communication unit. Ultra-low energy consumption is a major challenge for medical applications. The work studies and analyzes transceiver energy consumption for different compression algorithms and, based on these calculations, selects the most suitable technique, such as LPC, for saving energy. It also formulates a linear programming problem whose objective is to maximize the system lifetime, which is equivalent to the time until the first node runs out of battery. Maximum system lifetimes are calculated with MATLAB optimization techniques, with and without an efficient compression algorithm such as LPC, in various environments. The results show that the maximum system lifetimes obtained with an efficient compression technique such as LPC are longer than those obtained without compression.
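The lifetime definition used above, the time until the first node runs out of battery, can be illustrated with a small calculation. This sketch is neither the paper's energy dissipation model nor its linear program. It assumes that only transmit energy matters, that compression divides the transmitted bit rate by a fixed ratio at no processing cost, and that every battery capacity, bit rate, and per-bit energy below is a made-up value.

```python
def node_lifetime(battery_j, bits_per_s, energy_per_bit_j, compression_ratio=1.0):
    """Seconds until one node drains its battery (transmit energy only, by assumption)."""
    # Compression reduces the number of bits actually sent per second.
    tx_power_w = (bits_per_s / compression_ratio) * energy_per_bit_j
    return battery_j / tx_power_w


def system_lifetime(nodes, compression_ratio=1.0):
    """System lifetime: the lifetime of the node that dies first."""
    return min(
        node_lifetime(battery, rate, energy, compression_ratio)
        for battery, rate, energy in nodes
    )


# Hypothetical nodes: (battery in joules, raw bit rate in bit/s, joules per bit).
nodes = [
    (100.0, 1000.0, 1e-6),
    (50.0, 2000.0, 1e-6),
    (200.0, 500.0, 2e-6),
]

without_compression = system_lifetime(nodes)
with_compression = system_lifetime(nodes, compression_ratio=4.0)
print(f"without compression: {without_compression:.0f} s")
print(f"with compression:    {with_compression:.0f} s")
```

Under these assumptions the second node limits the system in both cases, and compression extends the lifetime by the compression ratio. A fuller model would also charge the energy spent on compressing the data.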
The transition from a conventional delivery system to a suitably robust distribution system is emerging in order to adapt to different possible scenarios of future development. This paper gives a particular vision of the future grid and its main requirements. Suitable concepts and technologies that are currently attracting attention have been investigated and are discussed with respect to the requirements of sustainability, efficiency, flexibility and intelligence. The active network is then introduced as the backbone of the future power delivery system. In addition, the multi-agent system (MAS) is described as a potential technology for coping with the anticipated challenges of future grid operation. The research described is carried out within the framework of the Electricity Infrastructure of the Future (EIT) project, a cooperation of TU/e, KEMA and ECN.
An algorithm is presented for isolated-word recognition that takes into consideration the duration variability of different utterances of the same word. The algorithm is based on extracting acoustical features from the speech signal and using them as the input to multilayer perceptron neural networks. The backpropagation algorithm is used to train the networks. A hidden Markov model (HMM) is implemented to extract temporal features (states) from the speech signal. The input vector to the network consists of 16 cepstral coefficients, two delta cepstral coefficients, and five elements to represent the state. The networks are trained to recognize the correct words and to reject the wrong words. The training set consists of ten words (the digits zero to nine), each uttered seven times by three different speakers. The test set consists of three utterances of each of the ten words. The authors' results show the ability to recognize all of these words.
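The 23-element network input described above (16 cepstral coefficients, two delta cepstral coefficients, five state elements) could be assembled as in the sketch below. The abstract does not say how the five state elements encode the HMM state. The one-hot encoding over five states used here is an assumption, as are the function and parameter names.

```python
def build_input_vector(cepstra, delta_cepstra, state_index, num_states=5):
    """Concatenate the frame features into one 23-element network input.

    cepstra       : 16 cepstral coefficients for the frame
    delta_cepstra : 2 delta cepstral coefficients for the frame
    state_index   : HMM state assigned to the frame, 0-based (assumed encoding)
    """
    if len(cepstra) != 16:
        raise ValueError("expected 16 cepstral coefficients")
    if len(delta_cepstra) != 2:
        raise ValueError("expected 2 delta cepstral coefficients")
    if not 0 <= state_index < num_states:
        raise ValueError("state_index out of range")
    # Assumed one-hot state code: a single 1.0 at the position of the state.
    state = [0.0] * num_states
    state[state_index] = 1.0
    return list(cepstra) + list(delta_cepstra) + state


# Illustrative frame with placeholder feature values, assigned to state 2.
vector = build_input_vector([0.0] * 16, [0.0] * 2, state_index=2)
print(len(vector))
```

Appending the state to every frame gives the perceptron an explicit indication of where the frame falls within the word, which is one way to address duration variability between utterances.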
A new approach to temporal decomposition (TD) of speech, called "spectral stability based event localizing temporal decomposition", abbreviated S²BEL-TD, is presented. The original method of TD proposed by Atal (1983) is known to have the drawbacks of high computational cost and instability in the number and locations of events. In S²BEL-TD, event localization is performed based on a maximum spectral stability criterion. This overcomes the event instability problem of Atal's method. S²BEL-TD also avoids the computationally costly singular value decomposition routine used in Atal's method, resulting in a computationally simpler TD algorithm. Simulation results show that an average spectral distortion of about 1.5 dB can be achieved with LSF as the spectral parameter. We have also shown that the temporal pattern of the speech excitation parameters can be well described using the S²BEL-TD technique.
This paper presents problems related to the classification of singing voice quality. For this purpose, a database of singers' sample recordings is constructed, and parameters are extracted from the recorded voices of trained and untrained singers. The parameterization process is based on both voice source and formant analysis of the singing voice. The physical interpretation of these parameters is explained, and they are analyzed statistically in order to reduce their number. The statistical analysis is based on the Fisher statistic. In this way, a feature vector of a singing voice is formed. Decision systems based on neural networks and rough sets are used for voice type and voice quality classification. The results obtained in the automatic classification performed by both decision systems are compared, and the feasibility of automatically classifying voice type and quality is assessed. The proposed methodology provides a means of distinguishing trained from untrained singers.
ISBN: (print) 0769513212
Voice over IP (VoIP) can be used in a wide variety of applications, all having different requirements. We present JVOIPLIB and JRTPLIB, a VoIP library and an RTP library respectively. Together they make it possible to easily add VoIP to various types of applications. Both libraries are written in an object-oriented style in C++, are open source, and are highly extensible. Several measures have been taken to allow good synchronization between the communicating parties.
ISBN: (print) 0769525210
In a high-dimensional feature space with finite samples, severe bias can be introduced into the nearest neighbor algorithm. In this paper, we propose a new classification method that performs the classification task based on the local probability center of each class. This prototype-based method classifies the query sample using two measures: the distance between the query and the local probability centers, and the posterior probability of the query. Although both measures are effective, the experiments show that the second performs better. The results show that this method substantially improves the classification performance of the nearest neighbor algorithm.
The paper describes a comparison of a C implementation of a linear predictive voice coder (LPC) and an implementation based on Spectron Microsystems' Signal Processing Operating System (SPOX). The hardware platform was a Texas Instruments TMS320C30 Evaluation Module. The SPOX and C implementations were compared on execution time, ease of program development and maintenance, and portability to different hardware platforms. The vocoder algorithms and the results of the comparison of the two implementations are presented.