Distributional word clusters vs. words for text categorization

Authors: Bekkerman, Ron; El-Yaniv, Ran; Tishby, Naftali; Winter, Yoad

Affiliations: Department of Computer Science, Technion - Israel Institute of Technology, Haifa 32000, Israel; School of Computer Science and Engineering, Center for Neural Computation, Hebrew University, Jerusalem 91904, Israel

Publication: Journal of Machine Learning Research (J. Mach. Learn. Res.)

Year/Volume: 2003, Vol. 3

Pages: 1183-1208

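Subject classification: 1205 [Management - Library, Information and Archives Management]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (Engineering or Science degree may be awarded)]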

Subject: Support vector machines

Abstract: We study an approach to text categorization that combines distributional clustering of words and a Support Vector Machine (SVM) classifier. This word-cluster representation is computed using the recently introduced Information Bottleneck method, which generates a compact and efficient representation of documents. When combined with the classification power of the SVM, this method yields high performance in text categorization. This novel combination of SVM with word-cluster representation is compared with SVM-based categorization using the simpler bag-of-words (BOW) representation. The comparison is performed over three known datasets. On one of these datasets (the 20 Newsgroups) the method based on word clusters significantly outperforms the word-based representation in terms of categorization accuracy or representation efficiency. On the two other sets (Reuters-21578 and WebKB) the word-based representation slightly outperforms the word-cluster representation. We investigate the potential reasons for this behavior and relate it to structural differences between the datasets. © 2003 Ron Bekkerman, Ran El-Yaniv, Naftali Tishby, and Yoad Winter.
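
The difference between the two document representations compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word-to-cluster map `WORD_TO_CLUSTER` is hand-assigned for illustration rather than computed by the Information Bottleneck method, the helper names `bag_of_words` and `cluster_counts` are hypothetical, and the SVM step is omitted. The sketch only shows how per-word counts collapse into a much smaller vector of per-cluster counts.

```python
from collections import Counter


def bag_of_words(text):
    """Count lowercase, whitespace-separated tokens (a simple BOW vector)."""
    return Counter(text.lower().split())


def cluster_counts(bow, word_to_cluster):
    """Sum word counts within each cluster; words with no cluster are dropped."""
    counts = Counter()
    for word, n in bow.items():
        cluster = word_to_cluster.get(word)
        if cluster is not None:
            counts[cluster] += n
    return counts


# Assumption: hand-assigned clusters for illustration only. In the paper the
# clusters come from the Information Bottleneck method, not implemented here.
WORD_TO_CLUSTER = {
    "goal": "sports", "team": "sports", "match": "sports",
    "cpu": "computing", "disk": "computing", "kernel": "computing",
}

doc = "The team scored a late goal and the team won the match"
bow = bag_of_words(doc)
clusters = cluster_counts(bow, WORD_TO_CLUSTER)

print(len(bow), "word features ->", len(clusters), "cluster features")
print(dict(clusters))
```

Either count vector could then be passed to a classifier; the cluster vector has one dimension per cluster instead of one per distinct word, which is the sense in which the representation is compact.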
