This paper presents a new application of the snap-drift algorithm by S. W. Lee, et al. (2004): phrase recognition using a set of phrases from the Lancaster parsed corpus (LPC) by R. Garside, et al. (1987). The learning algorithm is the classifier version of snap-drift. In this version, along with the complementary concepts of fast minimalist learning (snap) and slow drift towards the input pattern, each node of the snap-drift neural network (SDNN) swaps between snap and drift modes when declining performance is indicated on that particular node. This method enables the SDNN to learn at node level, in the sense that each node has its learning mode toggled independently of the other nodes. Learning on each node is also reinforced by enabling learning with a probability that decreases with increasing performance. The simulations demonstrate that learning is stable, and the results consistently show similar classification performance, together with advantages in speed, compared with a multilayer perceptron (MLP) and back-propagation (J. Topper, et al., 2002; D. E. Rumelhart, et al., 1986) applied to the same problem.
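The node-level mode switching described in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the class name `SnapDriftNode`, the element-wise minimum used for snap, the drift rate `beta`, and the learning probability `1 - perf` are all assumptions chosen to match the abstract's wording.

```python
import random

class SnapDriftNode:
    """Illustrative node that toggles between snap and drift learning.

    Assumptions (not taken from the paper): snap is an element-wise
    minimum, drift is a step of size beta towards the input, and
    learning is enabled with probability 1 - perf.
    """

    def __init__(self, weights, beta=0.5):
        self.w = list(weights)
        self.beta = beta
        self.mode = "snap"
        self.last_perf = 0.0

    def update(self, x, perf, rand=random.random):
        # Toggle the learning mode when this node's performance declines.
        if perf < self.last_perf:
            self.mode = "drift" if self.mode == "snap" else "snap"
        self.last_perf = perf
        # Enable learning with a probability that falls as performance rises.
        if rand() >= 1.0 - perf:
            return
        if self.mode == "snap":
            # Snap: keep only what the weights and the input have in common.
            self.w = [min(wi, xi) for wi, xi in zip(self.w, x)]
        else:
            # Drift: move a fraction beta of the way towards the input.
            self.w = [wi + self.beta * (xi - wi) for wi, xi in zip(self.w, x)]
```

Each node keeps its own `mode` and `last_perf`, so one node can be drifting while its neighbours are still snapping, which is the sense of "learning at node level" in the abstract.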
We propose a hyperspectral image compressor called BH which considers its input image as being partitioned into square blocks, each lying entirely within a particular band, and compresses one such block at a time by using the following steps: first predict the block from the corresponding block in the previous band, then select a predesigned code based on the prediction errors, and finally encode the predictor coefficient and errors. Apart from giving good compression rates and being fast, BH can provide random access to spatial locations in the image. We hypothesize that BH works well because it accommodates the rapidly changing image brightness that often occurs in hyperspectral images. We also propose an intra-band compressor called LM which is worse than BH, but whose performance helps explain BH's performance.
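The three steps named in the abstract above (predict, select a code, encode) can be sketched for a single block. This is a minimal sketch under stated assumptions: the predictor is taken to be a single least-squares gain, and the "predesigned code" is stood in for by a Rice parameter chosen from the mean absolute error. Neither detail is given in the abstract, and the function names are hypothetical.

```python
def predict_block(prev_block, cur_block):
    """Fit cur ~ a * prev by least squares and return (a, integer errors).

    Blocks are flat lists of pixel values from the same spatial
    position in two adjacent bands.
    """
    num = sum(p * c for p, c in zip(prev_block, cur_block))
    den = sum(p * p for p in prev_block)
    a = num / den if den else 0.0
    errors = [c - round(a * p) for p, c in zip(prev_block, cur_block)]
    return a, errors

def choose_rice_k(errors):
    """Pick the smallest k with 2**k >= mean |error| (assumed code selector)."""
    mean_abs = sum(abs(e) for e in errors) / len(errors)
    k = 0
    while (1 << k) < mean_abs:
        k += 1
    return k

def reconstruct_block(prev_block, a, errors):
    """Invert predict_block given the same coefficient and errors."""
    return [round(a * p) + e for p, e in zip(prev_block, errors)]
```

Because each block depends only on the co-located block in the previous band, a decoder built this way could reach a given spatial location without decoding the rest of that band, which is consistent with the random-access property the abstract mentions.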
In this paper, a new network echo canceller based on a practical adaptive filter is proposed. The proposed filter is a practical modification of the lattice transversal joint (LTJ) adaptive filter: it takes advantage of information in the speech decoder, and the coefficients of the transversal filter part of the LTJ adaptive filter are updated every other sample instead of every sample. The total complexity of the proposed adaptive filter is lower than that of the transversal filter. The residual echo signal is further reduced by residual echo cancellation using a lattice predictor whose order is less than 10. The computational complexity of the proposed echo canceller is lower than that of the transversal filter, yet its convergence is faster. The performance of the proposed network echo canceller was verified by experiments using real speech signals.
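The idea of updating coefficients every other sample can be shown on a plain transversal filter. The sketch below uses a standard NLMS update, not the LTJ structure or the decoder-assisted scheme from the abstract; the function name, step size, tap count, and the two-tap echo path are all hypothetical.

```python
import random

def nlms_cancel(far, mic, taps=4, mu=0.5, update_every=2, eps=1e-8):
    """Cancel the echo of `far` in `mic` with an NLMS transversal filter.

    The filter output is computed for every sample, but the weights
    are adapted only on every `update_every`-th sample.
    """
    w = [0.0] * taps
    buf = [0.0] * taps
    residual = []
    for n, (x, d) in enumerate(zip(far, mic)):
        buf = [x] + buf[:-1]                      # newest sample first
        y = sum(wi * bi for wi, bi in zip(w, buf))  # echo estimate
        e = d - y                                  # residual echo
        residual.append(e)
        if n % update_every == 0:
            norm = sum(b * b for b in buf) + eps
            w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
    return residual, w

# Synthetic check with a made-up echo path and white-noise excitation.
rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.5, -0.3]
mic = [sum(h * far[n - i] for i, h in enumerate(echo_path) if n - i >= 0)
       for n in range(len(far))]
residual, w = nlms_cancel(far, mic)
```

Skipping every second update roughly halves the multiply-accumulate cost of adaptation, at the price of slower tracking in this plain form.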
Here we consider the problem of providing near-optimal performance for a large set of possible models. We adopt the H∞ framework in the single-input single-output (SISO) setting with structured uncertainty, namely a compact set of controllable and observable plant models of a fixed order, and consider the control problem of designing a controller to minimize the worst-case performance. We consider two different feedback configurations, and under a mild assumption we prove that a linear periodic controller (LPC) exists which achieves the objective.
The development of a laboratory prototype of an intelligent wheelchair is presented in this paper. VOIC is designed for physically disabled persons who cannot control their movements or operate the wheelchair with a joystick. The article describes the basic components of the voice recognition and wheelchair control systems. Voice recognition begins with input signal sampling, followed by word isolation, LPC cepstral analysis, coefficient dimension reduction, and trajectory recognition using a fixed-point approach with neural networks. The wheelchair control system is divided into a system for sensor data acquisition and a system for wheelchair steering. The complexity of the voice recognition is reduced using LPC cepstral analysis and coefficient dimension reduction with minimal loss of vital information. The choice of neural networks for speech recognition is supported by experimental results.
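The LPC cepstral analysis step mentioned above is commonly computed with a recursion from the predictor coefficients. The sketch below assumes the all-pole convention x[n] ≈ Σ a_k x[n−k]; the function name is hypothetical, and the abstract does not say which convention or cepstral order the VOIC system uses.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c_1..c_n_ceps.

    `a[k-1]` holds a_k of the predictor x[n] ~ sum_k a_k * x[n-k].
    Recursion: c_n = a_n + sum_{k} (k/n) * c_k * a_{n-k},
    with a_n taken as 0 for n greater than the LPC order.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c
```

For a first-order predictor with a_1 = 0.5 the recursion gives 0.5, 0.125, 0.0417, matching the series a^n / n. Keeping only the first few coefficients is one simple form of the dimension reduction the abstract refers to.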
In this paper, problems related to the classification of singing voice quality are presented. For this purpose, a database of singers' sample recordings is constructed, and parameters are extracted from the recorded voices of trained and untrained singers. The parameterization process is based on both voice source and formant analysis of a singing voice. The physical interpretation of these parameters is explained, and they are analyzed statistically in order to reduce their number. The statistical analysis is based on the Fisher statistic. In this way, a feature vector of a singing voice is formed. Decision systems based on neural networks and rough sets are used for voice type and voice quality classification. Results obtained in the automatic classification performed by both decision systems are compared. The feasibility of automatically classifying voice type and quality is assessed. The proposed methodology provides a means of discerning trained from untrained singers.
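Feature reduction with a Fisher statistic can be sketched as ranking each parameter by how well it separates two classes. The definition below (squared difference of class means over the sum of class variances) is one common form and is an assumption, since the abstract does not give the exact formula; the function names and feature names are hypothetical.

```python
from statistics import mean, pvariance

def fisher_score(values_a, values_b):
    """Two-class Fisher criterion for one feature.

    Assumes at least one class has non-zero variance.
    """
    spread = pvariance(values_a) + pvariance(values_b)
    return (mean(values_a) - mean(values_b)) ** 2 / spread

def rank_features(class_a, class_b):
    """Return feature names sorted from most to least discriminative.

    `class_a` and `class_b` map feature name -> list of values.
    """
    scores = {name: fisher_score(class_a[name], class_b[name])
              for name in class_a}
    return sorted(scores, key=scores.get, reverse=True)
```

A reduced feature vector could then be formed by keeping only the top-ranked parameters, for example `rank_features(trained, untrained)[:10]`.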
Several pre-processing algorithms modify the residual speech signal to facilitate efficient estimation of speech model parameters. This, however, can result in misalignment between the modified residual signal and the time-variant linear prediction (LP) filter used during the synthesis stage. The resulting misalignment may cause audible artifacts, particularly at onsets, where the frequency response of successive LP filters changes rapidly. We propose a new solution that controls the LP filter gain at the subframe level. This technique is performed before and after time modification of speech and is therefore called pre-analysis and post-processing. A pitch smoothing technique is used to illustrate the effect of the proposed technique.
Sinusoidal coding of audio subject to a bit-rate constraint generally results in a noise-like residual signal. This residual signal is of high perceptual importance; reconstructing audio from the sinusoidal representation alone typically yields an artificial-sounding result. We present a new method, called perceptual linear predictive coding (PLPC), where the residual is encoded by applying LPC in the perceptual domain. This method minimizes a perceptual modelling error and therefore represents only residual components that are of perceptual relevance, while automatically discarding components masked by the sinusoidally coded part. Subjective listening tests show that PLPC performs significantly better than ordinary LPC as a sinusoidal residual coding technique. Furthermore, PLPC combined with a flexible segmentation and model order allocation algorithm leads to a significant gain in rate-distortion (R/D) performance for fragments with fast-changing characteristics.
Author: Härmä, A. (Aalto Univ, Lab Acoust & Audio Signal Proc, Espoo 02015, Finland)
In conventional one-step forward linear prediction, an estimate for the current sample value is formed as a linear combination of previous sample values. In this paper, a generalized form of this scheme is studied. Here, the prediction is based not simply on the previous sample values but on the signal history as seen through an arbitrary filterbank. It is shown in the paper how the coefficients of a modified model can be obtained and how the inverse and synthesis filters can be implemented. Various properties of such systems are derived in this article. As an example, a novel linear predictive system using an inherently logarithmic frequency representation is introduced.
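For reference, the conventional one-step predictor that the abstract above takes as its starting point can be computed with the Levinson-Durbin recursion on the signal's autocorrelation. This sketch covers only that baseline, not the filterbank-based generalization studied in the paper; the function names are hypothetical.

```python
def autocorr(x, max_lag):
    """Biased-sum autocorrelation r[0..max_lag] of a finite sequence."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations for x[n] ~ sum_k a_k * x[n-k].

    Returns (coefficients a_1..a_order, final prediction error energy).
    Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)      # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

In the generalized scheme, the delayed samples x[n−k] would be replaced by the outputs of the filterbank channels, so the correlation matrix is in general no longer Toeplitz and this fast recursion would not apply directly.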