Search Results - Inner Mongolia University Library

Maximum Entropy Model for Example-Based Machine Translation

International Journal of Computer Processing of Languages, 2007, Vol. 20, No. 2N03, pp. 101-113

Authors: Yin Chen, Muyun Yang, Sheng Li. Affiliation: MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, P.O. Box 321, No. 92 West Dazhi Street, NanGang, Harbin 150001, China

Most example-based machine translation (EBMT) systems handle their translation examples using heuristic measures based on human intuition. However, such heuristic rules are hard to organize effectively, and they do not scale well when diverse features must be incorporated to cover more language phenomena and larger domains. In this paper, we use a machine learning approach to EBMT model design instead of human intuition. A maximum entropy (ME) model is introduced to incorporate the different kinds of features inherent in the translation examples. In addition, a multi-dimensional feature space is formally constructed to include features of different aspects. In the experiments, the proposed model shows a significant performance improvement.

Keywords: EBMT; Machine learning; Maximum entropy; Feature space

Meta-structure transformation model for statistical machine translation 07

Proceedings of the Second Workshop on Statistical Machine Translation

Authors: Jiadong Sun, Zhao Tiejun, Huashen Liang. Affiliation: MOE-MS Key Lab of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, Heilongjiang, China

We propose a novel syntax-based model for statistical machine translation in which the meta-structure (ms) and meta-structure sequence (Sms) of a parse tree are defined. In this framework, a parse tree is decomposed into an Sms to deal with structural divergence, and the alignment can be reconstructed at different levels of recombination of ms (RM). The extracted RM pairs perform the mapping between sub-structures across languages. As a result, we obtain not only the translation in the target language but also an Sms of its parse tree. Experiments with the BLEU metric show that the model significantly outperforms Pharaoh, a state-of-the-art phrase-based system.

Chinese Terminology Extraction Using Bilingual Web Resources

IEEE International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

Authors: Yuhang Yang, Luning Ji, Qin Lu, Tiejun Zhao. Affiliations: MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China; Department of Computing, Hong Kong Polytechnic University, Hong Kong, China

Automatic terminology extraction requires termhood verification for extracted terms in a specific domain. Chinese terminology extraction suffers from insufficient domain corpora for verification, even though there is an abundance of information in other languages. This paper presents a novel approach that overcomes this problem by using word translations and bilingual web resources to improve both coverage and precision. The proposed approach incorporates bilingual information from within the candidate terms themselves and from existing domain knowledge to calculate termhood. In contrast to previous research, this method is not confined to pre-determined corpora. Preliminary experiments show a 14.8% improvement in coverage and a 26.3% improvement in precision.

Keywords: Terminology; Data mining; Natural languages; World Wide Web; Laboratories; Natural language processing; Speech processing; Frequency measurement; Counting circuits; Statistics

Chinese Information Processing and Its Prospects

Journal of Computer Science & Technology, 2006, Vol. 21, No. 5, pp. 838-846

Authors: Sheng Li (李生), Tiejun Zhao (赵铁军). Affiliation: MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, P.R. China

The paper presents the main progress and achievements in Chinese information processing. It focuses on six aspects: Chinese syntactic analysis, Chinese semantic analysis, machine translation, information retrieval, information extraction, and speech recognition and synthesis. The important techniques and likely key problems of each branch in the near future are also discussed.

Keywords: Chinese information processing; Natural language processing; Computational linguistics

A divide-conquer strategy for English text chunking

2006 International Conference on Machine Learning and Cybernetics

Authors: Liang, Ying-Hong; Wang, Ni-Hong; Su, Jian-Min; Ren, Hong-E. Affiliations: School of Information and Computer Engineering, North East Forestry University, Harbin 150040; MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001

ISBN: (print) 1424400619

The traditional English text chunking approach identifies phrases using only one model and the same types of features for all phrases. Using only one model has two limitations: the same types of features are not suitable for all phrases, and data sparseness may result. In this paper, a divide-conquer approach is proposed and applied to the identification of English phrases. This strategy divides the chunking task into several sub-tasks according to the sensitive features of each phrase and identifies different phrases in parallel. Then, a two-stage decreasing conflict strategy is used to synthesize the answers of the sub-tasks. Tested on the public training and test corpus, the divide-conquer strategy achieves an F score of 94.14% for arbitrary phrase identification, compared to the previous best F score of 94.17%. © 2006 IEEE.

Keywords: Text processing

A Multi-Agent Strategy for Both English and Chinese Text Chunking

Chinese Journal of Electronics (电子学报, English edition), 2006, Vol. 15, No. 3, pp. 422-426

Authors: Liang Yinghong, Zhao Tiejun, Yao Jianmin, Yu Hao. Affiliations: MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, China; School of Information and Computer Engineering, North East Forestry University, Harbin 150080, China

RESEARCH ON CHINESE INFORMATION RETRIEVAL BASED ON A HYBRID LANGUAGE MODELING

2006 International Conference on Machine Learning and Cybernetics (IEEE 5th Forum on Machine Learning and Cybernetics)

Authors: De-Quan Zheng, Hao Yu, Tie-Jun Zhao, Sheng Li, Feng Yu. Affiliations: School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150001; MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Har

For information retrieval, users hope to acquire more relevant information from the top-ranked documents. In this paper, a combination of ontology with a statistical method is presented to retrieve an initial document set and to improve the precision of the top N ranked documents by re-ranking that set. An experiment on the NTCIR-3 Chinese CLIR dataset shows that the proposed method improves retrieval precision.

Keywords: Ontology; Statistical method; Linguistic ontology knowledge; Information retrieval; Knowledge acquisition

Documents Ranking Based on a Hybrid Language Model for Chinese Information Retrieval

International Conference on Information and Automation (ICIA)

Authors: Dequan Zheng, Feng Yu, Tiejun Zhao, Sheng Li. Affiliations: MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China; School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China

For information retrieval, users hope to acquire more relevant information from the top N ranked documents. In this paper, a hybrid Chinese language model is presented, defined as a combination of ontology with a statistical method, to improve the precision of the top N ranked documents by reordering the initially retrieved documents. An experiment on the NTCIR-3 formal Chinese test collection shows that the proposed method improves precision at the top N ranked documents level.

Keywords: Natural languages; Information retrieval; Ontologies; Indexing; Statistical analysis; Semantic Web; Business; Laboratories; Speech processing; Testing

Chinese-English Cross-Lingual Information Retrieval based on Domain Ontology Knowledge

International Conference on Computational Intelligence and Security

Authors: Feng Yu, Dequan Zheng, Tiejun Zhao, Sheng Li, Hao Yu. Affiliations: School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China; MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China

ISBN: (print) 1424406048

To improve the effectiveness of cross-lingual information retrieval (CLIR), a method based on domain ontology knowledge is presented and applied to Chinese-English (C-E) CLIR. In this study, domain ontology knowledge is acquired from both source language user queries and target documents to select target translations and to re-rank the initially retrieved document set. The C-E CLIR dataset from the NTCIR-4 Workshop is used to evaluate the effectiveness of this method. Unlike previous work, we make use of source language user queries throughout the C-E CLIR process; compared with previous work, this method improves precision.

Keywords: Information retrieval; Ontologies; Natural languages; Dictionaries; Knowledge engineering; Natural language processing; Semantic Web; Frequency; Indexing; Business

TEXT CLASSIFICATION BASED ON A COMBINATION OF ONTOLOGY WITH STATISTICAL METHOD

2006 International Conference on Machine Learning and Cybernetics (IEEE 5th Forum on Machine Learning and Cybernetics)

Authors: Feng Yu, De-Quan Zheng, Sheng Li, Tie-Jun Zhao, Hao Yu. Affiliations: School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150076, China; MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harb

Text classification is becoming one of the key techniques for organizing and handling large amounts of text data. In this paper, a combination of ontology with a statistical method is presented to improve the precision of text classification. First, different kinds of linguistic ontology knowledge are acquired by learning from a training corpus to determine the text classifiers. For an actual document, a semantic evaluation value is computed with each kind of linguistic ontology knowledge, and the category is determined by the highest evaluation value. In comparison with Bayes, k-nearest neighbor, and support vector machine classifiers, the proposed approach performs better.

Keywords: Text classification; Ontology; Statistical method; Linguistic ontology knowledge
