A commentary channel that describes the main features of a scene on the television screen, using spare capacity in the NICAM and Teletext transmission systems, will be introduced in Europe for the benefit of partially sighted people, most of whom are elderly. The speech coding algorithm for this application has to cope with a number of constraints. The authors describe several speech coding standards that cover a wide range of bit rates with varying complexity and quality. The coder offering the best quality is based on code excited linear prediction (CELP), in which the excitation is modelled as the sum of a pitch-model contribution and a random stochastic process. A derivative of this coder has been selected and optimized for the AUDETEL application. The speech coding algorithm, its real-time implementation and the hardware are described in detail.
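The excitation model named in the abstract can be illustrated with a minimal sketch. It is not the AUDETEL coder: the function names, the gains, the pitch lag and the code vector are all hypothetical, and a real CELP coder would pick them by analysis-by-synthesis search over codebooks.

```python
def celp_excitation(past_excitation, pitch_lag, pitch_gain, code_vector, code_gain):
    """Build one subframe of excitation as pitch part + stochastic part.

    Assumes len(past_excitation) >= pitch_lag. All parameters are
    illustrative placeholders, not values from the paper.
    """
    history = list(past_excitation)
    excitation = []
    for c in code_vector:
        adaptive = history[-pitch_lag]          # sample one pitch period back
        sample = pitch_gain * adaptive + code_gain * c
        excitation.append(sample)
        history.append(sample)                  # new samples feed later lags
    return excitation


def synthesize(excitation, a):
    """All-pole synthesis filter 1/A(z), with A(z) = 1 + sum_k a[k] z^-k."""
    order = len(a) - 1
    mem = [0.0] * order                         # most recent output first
    out = []
    for e in excitation:
        y = e - sum(a[k] * mem[k - 1] for k in range(1, order + 1))
        out.append(y)
        mem = ([y] + mem)[:order]
    return out
```

The excitation from `celp_excitation` would be passed through `synthesize` with the frame's linear prediction coefficients to obtain the decoded speech samples.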
This paper discusses a method for the real-time conversion of whispers to normal phonated speech through a code excited linear prediction (CELP) analysis-by-synthesis codec. The approach uses a template of a speaker's normal phonated speech to extract excitation parameters such as pitch and gain, and then injects these estimated excitations into the whispered signal to synthesize normal-sounding speech through the CELP codec. Because restoring pitch to whispered speech raises questions of quality and accuracy, spectral enhancements are also required: formant shifting and pitch injection based on a voiced/unvoiced decision. The spectral shifting is accomplished by adjusting the line spectral pairs (LSPs). Implementing these methods with the popular CELP codec allows the technique to be integrated with modern speech applications and devices. Subjective testing results are presented to determine the effectiveness of the technique.
Research to develop an algorithm for isolated-word recognition is described. Feature extraction is carried out by applying a linear predictive coding (LPC) algorithm of order 10. To implement and test the proposed algorithm, a microcomputer-based data acquisition system has been designed and constructed. To examine the similarity between the reference and the training sets, a three-layer back-propagation artificial neural network model is implemented. The adaptation rule implemented in this network is the generalized least mean square (LMS) rule.
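As an illustration of order-10 LPC feature extraction, the sketch below uses the autocorrelation method with the Levinson-Durbin recursion. The abstract does not say which LPC formulation, window or frame length the authors used, so those choices and the function names are assumptions.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order=10):
    """Return (a, err) with a[0] == 1 and A(z) = 1 + sum_k a[k] z^-k.

    Levinson-Durbin recursion on the frame's autocorrelation;
    err is the final prediction-error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:                      # silent frame: nothing to predict
        return a, 0.0
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
        if err <= 0.0:                  # numerically exhausted
            break
    return a, err
```

The ten coefficients `a[1:]` of each frame (or a transform of them) would then form the feature vector presented to the neural network.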
The authors address the problem of defining a general class of reject-first possibilistic classifiers. The class relies on fuzzy XOR operators based on dual triples (t-norm, t-conorm, complement). Such a classifier operates in two sequential steps. It first tests for exclusive classification by thresholding the fuzzy XOR combination of the pattern's membership degrees in the different classes. If the pattern has to be rejected, the classifier then tests for the kind of rejection encountered (i.e. ambiguity or distance) using another threshold on the fuzzy OR combination.
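The two-step decision can be sketched as follows. This is one possible instance, not the authors' operator: it assumes the dual triple (min, max, 1 - x), applies the XOR only to the two largest membership degrees, and uses hypothetical threshold values and function names.

```python
def fuzzy_xor(a, b):
    """Fuzzy XOR built from the (min, max, 1 - x) triple:
    (a OR b) AND NOT (a AND b)."""
    return min(max(a, b), 1.0 - min(a, b))


def reject_first_classify(memberships, xor_threshold=0.6, or_threshold=0.5):
    """Step 1: accept if the XOR of the two largest degrees is high.
    Step 2: otherwise decide the kind of reject from the fuzzy OR.

    Assumes at least two classes; thresholds are illustrative only.
    """
    ranked = sorted(range(len(memberships)),
                    key=lambda i: memberships[i], reverse=True)
    top, second = memberships[ranked[0]], memberships[ranked[1]]
    if fuzzy_xor(top, second) >= xor_threshold:
        return ("class", ranked[0])
    if max(memberships) >= or_threshold:      # fuzzy OR with the max t-conorm
        return ("ambiguity_reject", None)     # close to several classes
    return ("distance_reject", None)          # far from every class
```

With these assumed thresholds, a pattern with degrees `[0.9, 0.1, 0.05]` is assigned to class 0, `[0.8, 0.75, 0.1]` is rejected for ambiguity, and `[0.1, 0.05, 0.2]` is rejected for distance.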
Distributed speech recognition services (DSRSs) provide an anytime, anywhere and any-device speech recognition environment that is intelligent enough to interact with users in a more natural manner. The primary goal is to provide users with the ability to dictate commands and/or documents among other potential services. The system coordinates the efforts of applications running in a distributed environment. For example, a user is able to dictate a document using their local word processor and a DSRS's remotely located speech engine. DSRSs encourage cooperation among individual programs in order to combine the efforts of individual applications to fulfill a user's request.
A speaker-independent isolated word recognition system is described which is based on techniques and results from rate-distortion speech coders. The recognition system can be viewed as a minimum-distortion or nearest-neighbor system in which the distortion measure is defined between an observed sequence of speech frames and a reference pattern. The patterns are sequences of sets of LPC models. Each set in a pattern consists of a collection of LPC models that "best" reproduces a given frame of a word from a training sequence. The Itakura-Saito distortion measure is used both to design the system (that is, to select the patterns) and in the decision step.
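A minimal sketch of the minimum-distortion decision is given below. It makes several assumptions that the abstract does not state: frames and models are represented by sampled power spectra rather than LPC coefficients, observed frames are paired one-to-one with pattern positions (no time alignment), and all function names are hypothetical.

```python
import math


def itakura_saito(p, q):
    """Itakura-Saito distortion between two sampled power spectra
    (strictly positive values, same length). Not symmetric in p and q."""
    return sum(x / y - math.log(x / y) - 1.0 for x, y in zip(p, q)) / len(p)


def pattern_distortion(frames, pattern):
    """Sum over frames of the distortion to the best model in each set.

    `pattern` is a sequence of sets (lists) of model spectra, one set
    per frame position.
    """
    return sum(min(itakura_saito(f, m) for m in models)
               for f, models in zip(frames, pattern))


def recognize(frames, references):
    """Nearest-neighbor rule: return the word with the least distortion."""
    return min(references,
               key=lambda word: pattern_distortion(frames, references[word]))
```

Here `references` would map each vocabulary word to its pattern, and `recognize` returns the word whose pattern reproduces the observed frames with the smallest accumulated distortion.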
We examine the input-output stability of attractors of nonlinear systems to perturbations of finite duration in time. In a low-parameter family of such perturbations, and using techniques from the theory of dynamical systems, we formulate a boundary value problem for the location of critical perturbations, marking the boundary of stability. An intricate pattern of stability and instability is found in a simple illustrative example (the CSTR with a single reaction).
A pattern recognition system able to classify heart sounds and murmurs perfectly for cardiac diagnosis is presented. A mathematical model that describes the heart sounds and murmurs (HSAM) by a finite number of parameters was developed. The autoregressive (AR) model is selected to represent the HSAM at the principal locations of cardiac auscultation and for different heart diseases. Features are extracted from the pre-emphasized signal by fourth-order linear prediction of the cardiac cycle frames. Pattern classification, based on an optimal dynamic time warping algorithm that minimizes the Euclidean distance between the features of the measured pattern and those of the reference patterns, is verified. A decision is then made using a minimum-distance criterion. Furthermore, a bank of 20 reference types of cardiac disease has been generated.
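The classification step can be sketched with textbook dynamic time warping over sequences of feature vectors. The step pattern, the absence of a warping window and the function names are assumptions, since the abstract does not give the details of the authors' "optimal" variant.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """Accumulated Euclidean cost of the best monotonic alignment."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b frame
                                 cost[i][j - 1],      # stretch seq_a frame
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def nearest_reference(measured, references):
    """Minimum-distance decision over a bank of labelled reference patterns."""
    return min(references,
               key=lambda label: dtw_distance(measured, references[label]))
```

In the setting described above, each frame's vector would hold the four linear prediction coefficients, and `references` would hold the 20 reference types keyed by diagnosis label.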