ISBN: 8476538723 (print)
In this paper we review the main intermediate forms proposed in text mining, and we briefly study some fuzzy counterparts. The concept of intermediate form applies to any knowledge representation employed to represent, in a structured way, the semantic content of a text corpus. Intermediate forms play a central role in the text mining process, since plain text must be transformed into such a form before mining techniques can be applied. Since the semantics of text tend to be imprecise, fuzzy intermediate forms seem to be a natural solution in many cases. We discuss fuzzy intermediate forms and the corresponding fuzzy text mining techniques that may be applicable to them.
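As a rough illustration of what a fuzzy intermediate form could look like, the sketch below represents a document as a fuzzy set of terms and compares two documents with a fuzzy Jaccard similarity. This is an assumption made for illustration, not a form taken from the paper; the names `fuzzy_bag` and `fuzzy_jaccard` and the frequency-based membership degree are hypothetical choices.

```python
from collections import Counter


def fuzzy_bag(tokens):
    """Map each term to a membership degree in (0, 1].

    Assumption: the degree is the term frequency divided by the highest
    frequency in the document. Other membership functions are possible.
    """
    counts = Counter(tokens)
    if not counts:
        return {}
    top = max(counts.values())
    return {term: count / top for term, count in counts.items()}


def fuzzy_jaccard(a, b):
    """Fuzzy Jaccard similarity: sum of minima over sum of maxima."""
    terms = set(a) | set(b)
    intersection = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return intersection / union if union else 0.0


doc1 = fuzzy_bag("fuzzy text mining of fuzzy text".split())
doc2 = fuzzy_bag("text mining techniques".split())
similarity = fuzzy_jaccard(doc1, doc2)
```

Using minimum and maximum as the fuzzy intersection and union keeps the similarity in [0, 1], with 1 for identical fuzzy sets and 0 for sets that share no terms.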
ISBN: 0769522459 (print)
Thanks to the convenience of pervasive information environments, many people use various computing devices to perform many kinds of tasks. In the field of education, various applications support learners, especially in e-learning. However, some computing devices have limited resources and cannot accept rich information content. The information content therefore has to be tailored into different kinds of presentation depending on the type of computing device. Context sensitivity is an application software system's ability to sense and analyze context from various sources. In this paper, we aim to customize static documents by using context-sensitive middleware (CSM) to sense the computing device and then using an agent-based parser to provide a suitable content representation dynamically.
ISBN: 9781586034528 (print)
We describe a method to automatically discover translation collocations from a bilingual corpus and show how these improve a machine translation system. The inference of collocations is iterative: an alignment is used to derive an initial set of collocations, these are used in turn to improve the alignment, and the new alignment is used to generate new collocations. This process is repeated until no more collocations are found. The final alignment and the set of collocations are used to train a translation model. We use a model that is based on finite state transducers and word clusters and has been modified to work with collocations in addition to single words. We present experiments showing that automatically discovered collocations improve translation quality without prior linguistic information.
ISBN: 0780385152 (print)
In this paper we study cellular evolutionary algorithms, a kind of decentralized heuristic, and the importance of the induced exploration/exploitation balance on different problems. It is shown that, by choosing synchronous or asynchronous update policies, the selection pressure, and thus the exploration/exploitation tradeoff, can be influenced directly, without using additional ad hoc parameters. Synchronous algorithms with different neighborhood-to-topology ratios and asynchronous update policies are applied to a set of benchmark problems. Our results show that the update methods of the asynchronous versions, as well as the ratio of the decentralized algorithm, have a marked influence on its convergence and on its accuracy.
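The difference between synchronous and asynchronous updating can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions, not the algorithm evaluated in the paper: the OneMax fitness, the von Neumann neighborhood on a toroidal grid, the line-sweep asynchronous policy, and all names and parameter values (`run`, `step`, the grid size, and so on) are hypothetical choices.

```python
import random


def onemax(bits):
    """Toy fitness: number of ones in the bit string."""
    return sum(bits)


def neighbors(i, j, rows, cols):
    """Von Neumann neighborhood (north, south, west, east) on a torus."""
    return [((i - 1) % rows, j), ((i + 1) % rows, j),
            (i, (j - 1) % cols), (i, (j + 1) % cols)]


def offspring(grid, i, j, rng):
    """Breed cell (i, j) with a neighbor; keep the child if it is not worse."""
    rows, cols = len(grid), len(grid[0])
    a, b = rng.sample(neighbors(i, j, rows, cols), 2)
    mate = max(grid[a[0]][a[1]], grid[b[0]][b[1]], key=onemax)  # binary tournament
    me = grid[i][j]
    cut = rng.randrange(1, len(me))                              # one-point crossover
    child = me[:cut] + mate[cut:]
    k = rng.randrange(len(child))                                # flip one random bit
    child = child[:k] + [1 - child[k]] + child[k + 1:]
    return child if onemax(child) >= onemax(me) else me


def step(grid, rng, synchronous):
    """One generation over the whole grid.

    Synchronous: all cells read the old grid and write to a separate copy.
    Asynchronous (line sweep): cells are updated in place, row by row, so
    later cells already see the new individuals of earlier cells.
    """
    target = [row[:] for row in grid] if synchronous else grid
    for i in range(len(grid)):
        for j in range(len(grid[0])):
            target[i][j] = offspring(grid, i, j, rng)
    return target


def run(synchronous, generations=30, rows=6, cols=6, length=32, seed=0):
    """Return the best fitness in the grid before each generation and at the end."""
    rng = random.Random(seed)
    grid = [[[rng.randint(0, 1) for _ in range(length)] for _ in range(cols)]
            for _ in range(rows)]
    history = []
    for _ in range(generations):
        history.append(max(onemax(ind) for row in grid for ind in row))
        grid = step(grid, rng, synchronous)
    history.append(max(onemax(ind) for row in grid for ind in row))
    return history


sync_history = run(synchronous=True)
async_history = run(synchronous=False)
```

In the asynchronous sweep a good individual can propagate across several cells within a single generation, which is one way the update policy alone changes the selection pressure without adding a tuning parameter.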
ISBN: 1581138644 (print)
Various aspects of applying a multiagent decision support system (DSS) to transportation management are discussed. The DSS prototype assists operators in their management task, helping them to configure consistent control plans for the whole road network. For the prototype, a simulator was implemented which, based on the actual bus schedules, emulates the exploitation support system. Implementing the prototype required the integration of various software technologies and tools; it was initially complex and required a considerable amount of programming work.
To reduce the error rate of speech recognition, we can use better statistical language models. These models can be improved by grouping words into word equivalence classes, and clustering algorithms can perform this grouping automatically. We present one incremental clustering algorithm and two iterative clustering algorithms, and we compare them with previous algorithms. The experimental results show that the two iterative algorithms perform as well as previous ones. Notably, one of them, which uses the leaving-one-out technique, can automatically determine the optimum number of classes. The incremental algorithm uses these iterative algorithms. The proposed incremental algorithm achieves the best results of the compared algorithms, behaves most regularly as the number of classes varies, and can also automatically determine the optimum number of classes.
This paper studies speech-driven Web retrieval models that accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The major focus of this paper is on improving the speech recognition accuracy of spoken queries and then improving retrieval accuracy in speech-driven Web retrieval. We experimentally evaluate techniques for combining the outputs of multiple LVCSR models in the recognition of spoken queries. As model combination techniques, we compare the SVM learning technique with conventional voting schemes such as ROVER. We show that combining multiple LVCSR models can improve both speech recognition and retrieval accuracy in speech-driven text retrieval. We also show that model combination by SVM learning outperforms conventional voting schemes in both speech recognition and retrieval accuracy.
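To make the idea of a voting scheme concrete, the sketch below applies a plain majority vote to recognizer outputs that are assumed to be already aligned slot by slot, with an empty string marking a slot where a recognizer emitted no word. This is a simplification for illustration only: ROVER itself first builds the alignment and can also weigh confidence scores, and the SVM-based combination studied in the paper is not shown. The function name `vote` and the sample hypotheses are hypothetical.

```python
from collections import Counter


def vote(aligned_hypotheses):
    """Majority vote per aligned slot.

    aligned_hypotheses: list of equal-length word lists, one per recognizer,
    where "" marks a slot in which that recognizer produced no word.
    Ties go to the word seen first, i.e. the earliest recognizer in the list.
    """
    result = []
    for slot in zip(*aligned_hypotheses):
        word, _ = Counter(slot).most_common(1)[0]
        if word:  # drop slots where "no word" wins the vote
            result.append(word)
    return result


hypotheses = [
    ["find", "cheap", "flights"],
    ["find", "sheep", "flights"],
    ["fine", "cheap", "flights"],
]
combined = vote(hypotheses)
```

Each recognizer makes a different error in this toy input, so the per-slot majority recovers a word sequence that none of the three produced on its own.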
In most analytical and simulation models of communication network operation, input streams are assumed to follow the well-known Poisson model, which describes the most unfavorable case of stationary random streams. In practice, input streams are mainly strongly nonstationary processes, which affects the final modeling results to a significant degree. With this in mind, this report presents an input stream model for the simulation modeling and analysis of communication networks.
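The abstract does not describe the proposed input stream model, so the sketch below is not the report's method. It only illustrates one standard way to simulate a nonstationary arrival stream: thinning a stationary Poisson process so that the accepted arrivals follow a time-varying rate. The function names, the sinusoidal `daily_rate`, its 24-hour period, and all constants are assumptions chosen for the example.

```python
import math
import random


def nonstationary_arrivals(rate, rate_max, horizon, rng):
    """Arrival times on [0, horizon] for a Poisson process with rate(t).

    Thinning: generate candidates from a stationary Poisson process with
    rate rate_max, then keep each candidate at time t with probability
    rate(t) / rate_max. Requires 0 <= rate(t) <= rate_max for all t.
    """
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate_max)      # next candidate arrival
        if t > horizon:
            return arrivals
        if rng.random() * rate_max <= rate(t):
            arrivals.append(t)


def daily_rate(t):
    """Hypothetical load profile: mean 5 arrivals per hour, 24-hour cycle."""
    return 5.0 + 4.0 * math.sin(2.0 * math.pi * t / 24.0)


rng = random.Random(1)
arrivals = nonstationary_arrivals(daily_rate, rate_max=9.0, horizon=240.0, rng=rng)
```

Replacing `daily_rate` with a constant function recovers the stationary Poisson stream, which makes it easy to compare simulation results under both assumptions.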
ISBN: 0780376633 (print)
For many practical applications of speech recognition systems, it is desirable to have a confidence estimate for each hypothesized word. Unlike previous work on confidence measures, we have proposed features for confidence measures that are extracted from the outputs of more than one LVCSR model. To analyze the proposed confidence measure further, this paper examines the correlation between each word's confidence and features of the word such as its part of speech and syllable length. We then apply an SVM learning technique to the task of combining the outputs of multiple LVCSR models, where information such as the pairs of models that output the hypothesized word serves as features for SVM learning and is useful for improving the word recognition rate. Experimental results show that the combination achieves a relative word error reduction of up to 72% against the best-performing single model and of up to 36% against ROVER.