Person-specific named entity recognition using SVM with rich feature sets

Author: Hui NIE

Affiliation: School of Information Management, Sun Yat-sen University

Publication: Chinese Journal of Library and Information Science (中国文献情报(英文版))

Year, volume, issue: 2012, Vol. 5, No. 3

Pages: 27-46

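Subject classification: 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degree may be conferred)]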

Funding: Supported by the Special Research Foundation for Young Teachers of Sun Yat-sen University (Grant No. 2000-3161101) and the Humanity and Social Science Youth Foundation of the Ministry of Education of China (Grant No. 08JC870013)

Keywords: Named entity recognition; Natural language processing; SVM-based classifier; Feature selection

Abstract:

Purpose: The study explores the potential use of natural language processing (NLP) and machine learning (ML) techniques and intends to find a feasible strategy and an effective approach to fulfill the NER task for Web-oriented person-specific information ***

Design/methodology/approach: An SVM-based multi-classification approach, combined with a rich set of NLP features derived from state-of-the-art NLP techniques, is proposed to fulfill the NER task. A group of experiments was designed to investigate the influence of various NLP-based features, especially the semantic features, on the performance of the system. Optimal parameter settings for the SVM models, including the kernel function, the margin parameter of the SVM model and the context window size, were explored through experiments as well.

Findings: The SVM-based multi-classification approach proved to be effective for the NER task. This work shows that NLP-based features, particularly the semantic features, are of great importance in data-driven NE recognition. The study indicates that a higher-order kernel function may not be desirable for this specific classification problem in practical application; the simple linear-kernel SVM model performed better in this case. Moreover, the modified SVM models with an uneven margin parameter are more general and flexible, and have been shown to solve the imbalanced data problem ***

Research limitations/implications: The SVM-based approach to the NER problem has only been proved effective on limited experimental data. Further research needs to be conducted on a large batch of real Web data. In addition, the performance of the NER system needs to be tested when it is incorporated into a complete IE ***

Originality/value: The specially designed experiments make it feasible to fully explore the characteristics of the data and to obtain the optimal parameter settings for the NER task, leading to a preferable rate in recall, precision and F1 measures. The overall syste
