While studies of the role of fuzzy logic in natural language certainly exist, it is not clear that the use of fuzzy logic to represent linguistic constructs is anything more than an engineering convenience. This paper suggests that one reason for this situation is that fuzzy logic has been used strictly to elucidate static aspects of natural language (particularly aspects of the lexicon). If one examines dynamic features of natural language, on the other hand, new possibilities for connections between fuzzy logic and natural language emerge. In particular, some results from category theory are used to show that fuzzy logic can have a role in explaining certain otherwise rather obscure properties of linguistic comparatives in English.
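As a concrete example of the static, lexicon-level use of fuzzy logic that this abstract contrasts with its dynamic account, a gradable adjective and its comparative can be modeled with a membership function. The thresholds and function names below are invented for the sketch, not taken from the paper:

```python
def tall_membership(height_cm: float) -> float:
    """Fuzzy membership degree for 'tall' (illustrative thresholds)."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30.0  # linear ramp between the thresholds

def taller(a_cm: float, b_cm: float) -> bool:
    """A comparative treated as a relation between membership degrees."""
    return tall_membership(a_cm) > tall_membership(b_cm)
```

On this (purely lexical) picture, "taller" reduces to comparing degrees, which is exactly the kind of static treatment the paper argues is insufficient on its own.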
Natural language processing is a technique that includes both natural language understanding and natural language generation. Translating one natural language into another is complex because of structural differences, varieties of meaning, different verb forms, and so on. In this paper, a general algorithm has been developed which takes one natural language (e.g., English or German) as input and produces another natural language (e.g., Bengali or Japanese) as output while preserving the same expression. To understand the input natural language, a smart-parse algorithm has been developed which unambiguously and efficiently generates a parse-tree stack from the sentence structure. After parsing, the algorithm consults the knowledge base and dictionary and finally produces the corresponding target natural language. Once the input and output natural languages are fixed, the algorithm can be used for translation after simple modification.
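The parse-transfer-generate pipeline this abstract describes can be sketched minimally as follows; the toy lexicon, the English-German dictionary, and the function names are illustrative assumptions, not the paper's smart-parse algorithm:

```python
# Toy part-of-speech lexicon and bilingual dictionary (illustrative only).
POS = {"the": "DET", "cat": "NOUN", "sleeps": "VERB"}
DICT_EN_DE = {"the": "die", "cat": "Katze", "sleeps": "schläft"}

def parse(sentence: str):
    """Push (word, POS) pairs onto a stack, mimicking a parse-tree stack."""
    stack = []
    for word in sentence.lower().split():
        stack.append((word, POS.get(word, "UNK")))
    return stack

def transfer(stack):
    """Replace source words with target-language equivalents, keeping order."""
    return " ".join(DICT_EN_DE.get(w, w) for w, _ in stack)

print(transfer(parse("The cat sleeps")))  # die Katze schläft
```

A real system would reorder and inflect according to target-language grammar; this sketch only shows how the parse stack feeds the dictionary-and-knowledge-base stage.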
This paper presents a purely statistical method for the automatic syllabification of speech. A hierarchical HMM structure is used to implement a purely acoustic model based on the phonotactic constraints of the English language. A well-defined DTW distance measure is presented for measuring and reporting syllabification results. We achieve a token error rate of 20.3% with a 42 ms average boundary error on a relatively large data set. This compares well with previous knowledge-based and statistical methods.
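The reported 42 ms average boundary error can be computed, under the assumption of a one-to-one alignment between predicted and reference syllable boundaries, roughly as in this sketch (the function name and data are illustrative):

```python
def mean_boundary_error(pred_ms, ref_ms):
    """Average absolute offset, in milliseconds, between aligned predicted
    and reference syllable boundaries (one-to-one alignment assumed)."""
    assert len(pred_ms) == len(ref_ms), "sketch assumes aligned boundary lists"
    return sum(abs(p - r) for p, r in zip(pred_ms, ref_ms)) / len(pred_ms)

# e.g. three predicted boundaries vs. their reference positions
print(mean_boundary_error([100, 260, 400], [110, 240, 400]))  # 10.0
```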
We describe a unified probabilistic framework for statistical language modeling, the latent maximum entropy principle, which can effectively incorporate various aspects of natural language, such as local word interactions, syntactic structure, and semantic document information. Unlike previous work on maximum entropy methods for language modeling, which only allows explicit features to be modeled, our framework also allows relationships over hidden features to be captured, resulting in a more expressive language model. We describe efficient algorithms for marginalization, inference, and normalization in our extended models. We then present experimental results for our approach on the Wall Street Journal corpus.
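The explicit-feature maximum entropy form that this abstract extends can be sketched as follows; the latent version would additionally marginalize over hidden features. The feature functions and weights here are illustrative assumptions:

```python
import math

def maxent_prob(history, word, vocab, features, weights):
    """p(word | history) ∝ exp(Σ_i λ_i f_i(history, word)) over a closed vocab.
    This is the explicit-feature maximum entropy form only."""
    def score(w):
        return math.exp(sum(lam * f(history, w)
                            for f, lam in zip(features, weights)))
    z = sum(score(w) for w in vocab)  # normalization constant
    return score(word) / z
```

For example, a single bigram indicator feature firing on ("new", "york") with weight 1.0 shifts probability toward "york" after the history ("new",).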
ISBN: (Print) 0780379527
This paper proposes a new approach to constructing rule bases for transfer-based machine translation. In our approach, the rule bases are constructed by combining linguistic knowledge with large-scale corpora. On the one hand, lexical, syntactic, and semantic knowledge are all used in the rules. On the other hand, this knowledge supports statistical scoring and self-learning of rules. In each rule base, all rules are scored and ranked, so that an objective choice can be made for each sentence. Preliminary experimental results show that the approach may speed up rule-base construction and improve the quality of the rules.
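Scoring and ranking rules so that an objective choice can be made might look like this minimal sketch; the rule format and the scores are invented for illustration and would come from corpus statistics in the paper's approach:

```python
# Two competing transfer rules for the same source pattern (illustrative).
rules = [
    {"pattern": ("NOUN", "ADJ"), "action": "swap", "score": 0.82},
    {"pattern": ("NOUN", "ADJ"), "action": "keep", "score": 0.18},
]

def best_rule(pattern, rule_base):
    """Choose the highest-scoring rule matching a source pattern, or None."""
    candidates = [r for r in rule_base if r["pattern"] == pattern]
    return max(candidates, key=lambda r: r["score"]) if candidates else None
```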
A lexical knowledge base is an important component of any intelligent information processing system. The WordNet developed at the Cognitive Systems Laboratories at Princeton has served as a lexical reference system for natural language processing activities. The Indian-language activities at our institute, mainly text-to-speech synthesis and natural language generation from iconic inputs, require the inclusion of additional features in the lexical reference system, such as phonology, word roots, and etymological information. Our initial efforts have been in Hindi and Bengali, but the commonality of Indo-Aryan languages and the importance of these extra features lead us to believe that it is worthwhile to build a WordNet containing these features for other Indo-Aryan languages as well. In this paper, we discuss the issues relating to the structured design and development of a generalized extended WordNet for Indo-Aryan languages, with special reference to Hindi and Bengali.
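One way to picture an extended lexical entry carrying phonology, root, and etymology alongside the usual WordNet information is the following sketch; the field names and example data are illustrative assumptions, not the authors' schema:

```python
from dataclasses import dataclass, field

@dataclass
class ExtendedLexicalEntry:
    """One entry in a hypothetical extended WordNet (illustrative fields)."""
    lemma: str
    synset_id: str
    phonology: str      # e.g. a phonemic transcription for TTS
    root: str           # word root / stem
    etymology: str      # etymological source of the word
    glosses: list = field(default_factory=list)

# Hypothetical Hindi entry (transcription and IDs are made up).
entry = ExtendedLexicalEntry(
    lemma="pani", synset_id="hin-n-0001", phonology="p aa n ii",
    root="pani", etymology="Sanskrit", glosses=["water"])
```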
In natural language processing applications, string matching is the main time-consuming operation. A dedicated co-processor for string matching that uses memory interleaving and parallel processing techniques can relieve the host CPU of this burden. This paper reports the FPGA design of such a system with m parallel matching units. It has been shown to improve performance by a factor of nearly m, without increasing the chip area by more than 45%. The time complexity of the proposed algorithm is O(log2 n), where n is the number of lexical entries. The memory used by the lexicon has been efficiently organized, and the space saving achieved is about 67%.
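The O(log2 n) lookup over a lexicon has a direct software analogue in binary search over a sorted word list; the hardware version described here splits this work across m matching units over interleaved memory banks. A sketch, with an invented toy lexicon:

```python
from bisect import bisect_left

def lexicon_lookup(lexicon, word):
    """Binary-search a sorted lexicon; return the entry index or -1.
    O(log2 n) in the number of lexical entries n."""
    i = bisect_left(lexicon, word)
    return i if i < len(lexicon) and lexicon[i] == word else -1

lex = sorted(["apple", "banana", "cherry", "date"])
print(lexicon_lookup(lex, "cherry"))  # 2
```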
ISBN: (Print) 0780381858
Knowledge management is becoming more and more important in organizations, whether over an intranet or the Internet. In this paper we present an ontology-based Web knowledge management (KM) framework built on the Web ontology language DAML+OIL. This framework supports a content-oriented rather than the traditional document-oriented approach to knowledge management. It rests on three fundamental building blocks: annotations based on ontologies; knowledge bases built from ontology assertions and crawled Web resources; and rule-based reasoning/inference systems for semantic knowledge manipulation. Our approach to knowledge management is the result of our semantic Web research efforts, and we adopt Web-standards-based tools in our development. We believe that this annotation-crawling-inference (A-C-I) approach to knowledge management is flexible and effective in supporting knowledge sharing on the Web. An ongoing prototype is briefly described.
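The rule-based inference step of the A-C-I pipeline can be illustrated with a tiny forward-chaining sketch over subject-predicate-object assertions; the facts and the rule below are invented for illustration and are not DAML+OIL syntax:

```python
# Seed assertions, e.g. harvested by annotation and crawling (illustrative).
facts = {("reportX", "hasTopic", "ontology"),
         ("ontology", "subTopicOf", "knowledge_management")}

def infer(facts):
    """Forward-chain one rule to a fixed point:
    if X hasTopic T and T subTopicOf U, then X hasTopic U."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(derived):
            if p != "hasTopic":
                continue
            for s2, p2, o2 in list(derived):
                if p2 == "subTopicOf" and s2 == o:
                    t = (s, "hasTopic", o2)
                    if t not in derived:
                        derived.add(t)
                        changed = True
    return derived
```

A real system would of course use a standard reasoner over the ontology language rather than a hand-rolled loop; the sketch only shows how new knowledge can be derived from crawled assertions.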
The automatic acquisition of usable domain knowledge is a challenging task. Such knowledge can be employed to assist a user in searching a document collection. This can be done by suggesting query modification options based on the knowledge uncovered by analyzing the document collection. We acquire such knowledge by simply exploiting the documents' markup structure. This gives us a domain model tailored to the particular collection. But how good is such a model? This paper will present the results of two evaluations. The first one looks at the actual domain model. We will discuss how users judged the relations encoded in the model. The second evaluation is task-based and goes a step further. It investigates how a search system that applies the automatically constructed domain model performs compared to a standard search system.
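One plausible reading of exploiting the documents' markup structure is that nested headings yield broader/narrower term relations, which then drive query modification suggestions. The following sketch is an assumption-laden illustration, not the paper's method:

```python
def model_from_headings(headings):
    """headings: list of (level, term), in document order.
    A term is recorded as narrower than its nearest enclosing heading."""
    narrower = {}
    stack = []  # open ancestor headings as (level, term)
    for level, term in headings:
        while stack and stack[-1][0] >= level:
            stack.pop()
        if stack:
            narrower.setdefault(stack[-1][1], []).append(term)
        stack.append((level, term))
    return narrower

def suggest(query, model):
    """Offer narrower terms as query modification options."""
    return model.get(query, [])

# Hypothetical heading structure from one document.
model = model_from_headings(
    [(1, "databases"), (2, "indexing"), (2, "query optimization")])
print(suggest("databases", model))  # ['indexing', 'query optimization']
```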