ISBN: 3540413960 (print)
The proceedings contain 43 papers. The topics discussed include: tesselations by connection in orders; nearness in digital images and proximity spaces; morphological operators with discrete line segments; Hausdorff discretizations of algebraic sets and diophantine sets; an algorithm for reconstructing special lattice sets from their approximate x-rays; reconstruction of discrete sets with absorption; some properties of hyperbolic networks; the reconstruction of the digital hyperbola segment from its code; determining visible points in a three-dimensional discrete space; Euclidean nets: an automatic and reversible geometric smoothing of discrete 3D object boundaries; object discretization in higher dimensions; strong thinning and polyhedrization of the surface of a voxel object; deformable modeling for characterizing biomedical shape changes; Delaunay surface reconstruction from scattered points; and topological encoding of 3D segmented images.
Segmentation of connected handwritten Chinese characters is a very difficult task in document image analysis. In this paper, a novel algorithm based on stroke analysis and background thinning is proposed to segment co...
ISBN: 7801501144 (print)
An accent adaptation approach for Mandarin that uses pronunciation variation modeling is proposed in this paper. Because Chinese is a monosyllabic language, a syllable pronunciation variation dictionary (SPVD) is built to capture the characteristics of an accent. First, pronunciation modeling is used to derive context-independent and context-dependent accent-specific syllable confusion matrices from the acoustic recognition results (the pinyin stream). Then the accent-specific Chinese SPVD is constructed from these confusion matrices. Finally, the N-best acoustic recognition candidates are rescored with the help of the SPVD. To reduce the amount of adaptation data needed for the context-dependent SPVD, the syllable contexts are divided into several groups. The experimental results show that pronunciation variation modeling is an effective method for Mandarin accent adaptation, and that the context grouping strategy effectively reduces the amount of adaptation speech required while keeping the same satisfactory performance.
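The pipeline in this abstract (confusion counts, then a variation dictionary, then N-best rescoring) can be illustrated with a minimal context-independent sketch. Everything named here is an assumption made for illustration: the function names `build_spvd` and `rescore`, the `min_prob` pruning threshold, the probability floor, and the log-linear score combination are hypothetical and are not taken from the paper, which does not state its rescoring formula.

```python
import math
from collections import Counter, defaultdict


def build_spvd(aligned_pairs, min_prob=0.05):
    """Build a toy syllable pronunciation variation dictionary.

    aligned_pairs: iterable of (canonical_syllable, recognized_syllable)
    pairs, e.g. from aligning reference pinyin with recognizer output.
    Returns {canonical: {recognized: P(recognized | canonical)}}, keeping
    only variants whose relative frequency is at least min_prob.
    """
    counts = defaultdict(Counter)  # confusion counts, one row per canonical syllable
    for canonical, recognized in aligned_pairs:
        counts[canonical][recognized] += 1
    spvd = {}
    for canonical, row in counts.items():
        total = sum(row.values())
        spvd[canonical] = {
            rec: n / total for rec, n in row.items() if n / total >= min_prob
        }
    return spvd


def rescore(candidates, observed, spvd, weight=1.0, floor=1e-4):
    """Rerank N-best candidates against the observed pinyin stream.

    candidates: list of (canonical_syllables, base_log_score).
    observed: recognized syllable list, same length as each candidate.
    The combined score (an assumed form) is
    base_log_score + weight * sum(log P(observed_i | canonical_i)).
    Returns a list of (score, canonical_syllables), best first.
    """
    rescored = []
    for syllables, base in candidates:
        variation = sum(
            math.log(spvd.get(c, {}).get(o, floor))
            for c, o in zip(syllables, observed)
        )
        rescored.append((base + weight * variation, syllables))
    return sorted(rescored, key=lambda item: item[0], reverse=True)
```

A context-dependent variant would key the dictionary on (context group, canonical syllable) instead of the syllable alone, which is where the context grouping described in the abstract would reduce the number of rows to estimate.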
ISBN: 7801501144 (print)
In this paper we present a novel method for rejecting OOV words in speaker-dependent dynamic command set recognition. OOV rejection is treated as the design of a recognizer with two classes: in-vocabulary commands and OOV commands. Multiple sound confidence measures, derived from the likelihood scores of the acoustic match and the prosody match, are defined and compete with each other at the same level automatically within a neural network framework, which avoids having to choose a balanced, sensitive threshold as in the traditional strategy. The network weights are trained according to the Minimum Misclassification Error criterion. The confidence measures take the whole command set into account and objectively describe the difference between the top hypothesis and the alternative hypotheses. Experimental results show that the neural-network-based combination is rational, reliable and stable, with an average total error rate of 9.3%, outperforming any approach that thresholds a single confidence measure. Cross-verification results also show that the trained network is independent of speaker, gender and command set. Although performance degrades when the network is transferred to other conditions, it remains acceptable in many applications.
ISBN: 7801501144 (print)
This paper investigates the problem of choosing the training set for language modeling in a large-vocabulary continuous speech recognition system. Our investigation finds that language style is more important than domain in language modeling. As long as the language style stays similar, extending the domain is not harmful. On the contrary, under this condition, enlarging the training set improves the quality of the language model. Diversity of language styles in the training set degrades the language model. Analysis of the correlation between CER and the evaluation measures of the language model indicates that when the domain and language style are the same and the whole model is used without cutoff, perplexity correlates strongly with CER. Otherwise this correlation is weakened. Another evaluation measure in our investigation, the n-gram hit rate, behaves similarly to perplexity. For the back-off trigram model, the bigram hit rate correlates more strongly with CER than the trigram hit rate does, which is relevant to reducing the size of the language model.
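The two evaluation measures named in this abstract can be made concrete with a small sketch. It assumes that the n-gram hit rate means the fraction of test n-grams that also occur in the training text, and it computes perplexity with an add-one smoothed bigram model purely for brevity; the paper's back-off trigram model is not reproduced here, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hit_rate(train_tokens, test_tokens, n):
    """Fraction of test n-grams that were seen in the training tokens."""
    seen = set(ngrams(train_tokens, n))
    test = ngrams(test_tokens, n)
    if not test:
        return 0.0
    return sum(gram in seen for gram in test) / len(test)


def bigram_perplexity(train_tokens, test_tokens):
    """Perplexity of the test tokens under an add-one smoothed bigram model.

    Add-one smoothing is an assumption chosen to keep the sketch short.
    """
    vocab = set(train_tokens) | set(test_tokens)
    history_counts = Counter(train_tokens[:-1])  # how often each token starts a bigram
    bigram_counts = Counter(ngrams(train_tokens, 2))
    test = ngrams(test_tokens, 2)
    if not test:
        raise ValueError("test_tokens must contain at least two tokens")
    log_prob = 0.0
    for history, word in test:
        p = (bigram_counts[(history, word)] + 1) / (history_counts[history] + len(vocab))
        log_prob += math.log(p)
    return math.exp(-log_prob / len(test))
```

Computing `hit_rate(..., n=2)` and `hit_rate(..., n=3)` on the same held-out text gives the bigram and trigram hit rates that the abstract compares against CER.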
ISBN: 7801501144 (print)
In this paper, we introduce HMM state sequence confusion characteristics as prior knowledge into the MLLR framework to relax the transformation and reduce the risk of over-training when the amount of adaptation data is small. Two issues must be addressed: first, how to estimate such confusion information reliably; second, how to use the information to refine the estimation of the MLLR adaptation. Pronunciation modeling is used to build the state sequence confusion table. The correlation between states is then calculated from the confusion table. The proposed algorithm then relaxes the MLLR adaptation process when the adaptation data is very small. Our experiment on a Mandarin state-tied triphone toneless LVCSR system showed an error rate reduction of 9.5% over standard MLLR with about 10 utterances (less than 30 seconds) of adaptation data.
In this paper, a new neural network with genetic algorithm (GA) is described. The GA can overcome the disadvantages of the back propagation (BP) artificial neural network (ANN), such as slow convergence and the possibility of being trapped at a local minimum. Compared with the BP-ANN, the convergence and generalization ability of the GA-ANN are improved remarkably. Some typical discharges in large turbine generators are presented and discussed. Test results show that the neural network can discriminate unknown patterns successfully. Some new results are given, and the practical application of the neural network with genetic algorithm to pattern recognition of PD is also discussed.
ISBN: 0780364295 (print)
In this paper, the concept of long memory systems for forecasting is developed. The Pattern Modelling and Recognition System (PMRS) and Fuzzy Single Nearest Neighbour methods are introduced as local approximation tools for forecasting. Such systems match the current state of the time series with past states to make a forecast. In the past, the PMRS has been used successfully to forecast the Santa Fe competition data. In this paper, we forecast the FTSE 100 and 250 financial returns indices, as well as the stock returns of five FTSE 100 companies, and compare the results of the two systems with those of Exponential Smoothing and Random Walk on seven different error measures. The results show that pattern recognition based approaches to time-series forecasting are highly accurate. Simple theoretical trading strategies are also mentioned, highlighting real applications of the system.
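The idea of matching the current state of a series against past states can be sketched with a plain single-nearest-neighbour forecaster, shown next to the two baselines the abstract names. This is not the PMRS or the fuzzy method from the paper: the squared-distance matching on raw values, the rule of replaying the change that followed the best match, the `window` and `alpha` parameters, and all function names are assumptions made for illustration.

```python
def nearest_pattern_forecast(series, window=3):
    """Forecast the next value from the most similar past window.

    The latest `window` values are compared (squared Euclidean distance)
    with every earlier window that has a known successor; the change that
    followed the best match is added to the last observed value.
    """
    if len(series) < window + 1:
        raise ValueError("series is too short for the chosen window")
    current = series[-window:]
    best_dist, best_end = None, None
    for end in range(window, len(series)):  # series[end] is the successor
        past = series[end - window:end]
        dist = sum((a - b) ** 2 for a, b in zip(past, current))
        if best_dist is None or dist < best_dist:
            best_dist, best_end = dist, end
    return series[-1] + (series[best_end] - series[best_end - 1])


def random_walk_forecast(series):
    """Baseline: the next value equals the last observed value."""
    return series[-1]


def exponential_smoothing_forecast(series, alpha=0.5):
    """Baseline: simple exponential smoothing with smoothing factor alpha."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def walk_forward_mae(series, forecaster, start):
    """Mean absolute error of one-step-ahead forecasts from index `start` on."""
    errors = [abs(forecaster(series[:t]) - series[t]) for t in range(start, len(series))]
    return sum(errors) / len(errors)
```

Passing each forecaster to `walk_forward_mae` on the same series gives a like-for-like comparison on one error measure; the paper reports seven, which are not listed in the abstract.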
ISBN: 1853128104 (print)
In this work, the construction of a neural network that performs classification from a data set for which the true classes are known is investigated. A specialized neurochip is built according to the knowledge stored in the architecture of an Adaptive Logic Network (ALN). The performance of this architecture is evaluated on a challenging medical data set, and conclusions are drawn about the expandability of this neurochip to more general cases.
ISBN: 7801501144 (print)
In this paper, we address the problem of building a high-performance speaker-independent continuous Mandarin digit string recognizer, focusing on exploiting context information and prosody knowledge. A data-driven decision tree method for training the tri-phone acoustic model is proposed. Based on the properties of the Chinese language, a digit-specific question set is designed, and the derived tri-phone model describes the acoustic observations more accurately. For the prosody cue, a novel Gaussian Mixture Density Duration Model (GMDDM) is presented. Unlike the traditional normalizing or single-parameter strategies, the proposed duration model is context independent. The context variation is naturally embodied in a mixture of multiple Gaussian distributions. The number of mixture components is selected automatically according to the maximum likelihood criterion. The likelihood score of this simple but effective duration model is combined with the acoustic score as heuristic information for the backward A∗ decoding of the word graph. Experimental results show that the tri-phone acoustic model leads to an average 12.9% reduction in string error rate. When the GMDDM is applied, the string error rate is further reduced by 22.7%, which demonstrates the usefulness of the GMDDM.