An Investigation on Transformation-based Error-driven Learning Algorithm for Chinese Noun Phrase Extraction

Authors: KAM-FAI WONG, TIMOTHY KUN-CHUNG CHAN, CHUN-HUNG CHENG

Affiliation: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, China

Publication: International Journal of Computer Processing of Languages

Year, volume, issue: 2001, Vol. 14, No. 1

Pages: 47-69

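Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (Engineering or Science degrees may be conferred)]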

Abstract: Noun phrases are commonly used to generate index terms for information retrieval systems, so an effective noun phrase extraction method is needed. In this paper, we propose an approach to extracting maximal noun phrases from Chinese text. Although previous studies have proposed methods for extracting noun phrases, most of them are applicable only to Western languages. To the best of our knowledge, very few have handled Chinese text. Many existing approaches for Western languages make use of statistical methods. However, due to the complicated structure of maximal Chinese noun phrases, purely statistical approaches are not effective. We attempt to improve the performance of a statistical method by integrating it with the transformation-based error-driven learning (TEL) technique. Our methodology comprises two modules. The first module applies a statistical method to extract Chinese noun phrases. The performance of this approach, in terms of precision and recall, is investigated. The second module applies the TEL algorithm to refine the output of the first module. The TEL algorithm automatically learns a set of transformation rules that fix the errors found by comparing the output of the first module with the correctly annotated corpus. The learned rules can then be applied, sentence by sentence, to any corpus to correct such errors. The TEL algorithm is shown to be effective in improving precision.
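
The error-driven learning loop described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the I/O chunk tags (inside or outside a noun phrase), the toy part-of-speech labels, the single rule template (retag a token when its own and the previous token's part of speech match), and the names `apply_rule` and `learn_rules`. The sketch treats the corpus as one flat token sequence and greedily keeps the rule that removes the most errors relative to the annotated tags.

```python
from typing import List, Tuple

# Hypothetical rule template: (from_tag, to_tag, prev_pos, pos).
Rule = Tuple[str, str, str, str]


def apply_rule(rule: Rule, pos_tags: List[str], tags: List[str]) -> List[str]:
    """Return a copy of tags with the rule applied at every matching position."""
    from_tag, to_tag, prev_pos, pos = rule
    out = list(tags)
    for i, tag in enumerate(tags):
        prev = pos_tags[i - 1] if i > 0 else "<s>"  # "<s>" marks the sequence start
        if tag == from_tag and pos_tags[i] == pos and prev == prev_pos:
            out[i] = to_tag
    return out


def count_errors(tags: List[str], gold: List[str]) -> int:
    return sum(t != g for t, g in zip(tags, gold))


def learn_rules(pos_tags: List[str], baseline: List[str], gold: List[str],
                max_rules: int = 10) -> List[Rule]:
    """Greedily learn rules that reduce the errors of the baseline tagging."""
    current = list(baseline)
    learned: List[Rule] = []
    for _ in range(max_rules):
        # Propose one candidate rule per remaining error.
        candidates = set()
        for i, (cur, ref) in enumerate(zip(current, gold)):
            if cur != ref:
                prev = pos_tags[i - 1] if i > 0 else "<s>"
                candidates.add((cur, ref, prev, pos_tags[i]))
        # Score each candidate by its net error reduction on the whole sequence.
        before = count_errors(current, gold)
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            gain = before - count_errors(apply_rule(rule, pos_tags, current), gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no candidate improves the tagging
            break
        current = apply_rule(best_rule, pos_tags, current)
        learned.append(best_rule)
    return learned


# Toy data (invented): the baseline leaves adjectives ("a") outside the noun phrase.
POS = ["n", "v", "a", "n", "v", "a", "n"]
BASELINE = ["I", "O", "O", "I", "O", "O", "I"]
GOLD = ["I", "O", "I", "I", "O", "I", "I"]

learned = learn_rules(POS, BASELINE, GOLD)
```

On this toy input a single rule is learned, and it can then be replayed on unseen text with `apply_rule(learned[0], new_pos_tags, new_baseline_tags)`. A fuller implementation would normally use several rule templates and stop once the best gain falls below a threshold.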
