Hierarchical clustering of XML documents focused on structural components

Authors: Costa, Gianni; Manco, Giuseppe; Ortale, Riccardo; Ritacco, Ettore

Affiliation: Italian Natl Res Council CNR, Inst High Performance Comp & Networks ICAR, I-87036 Arcavacata Di Rende, CS, Italy

Publication: DATA & KNOWLEDGE ENGINEERING

Year, Volume: 2013, Volume 84

Pages: 26-46

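Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be awarded)]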

Keywords: Data Mining; Semi-structured data and XML; XML clustering; XML transactional representation; XML cluster summarization and representative

Abstract: Clustering XML documents by structure is the task of grouping them by common structural components. Hitherto, this has been accomplished by looking at the occurrence of one preestablished type of structural components in the structures of the XML documents. However, the a-priori chosen structural components may not be the most appropriate for effective clustering. Moreover, it is likely that the resulting clusters exhibit a certain extent of inner structural inhomogeneity, because of uncaught differences in the structures of the XML documents, due to further neglected forms of structural components. To overcome these limitations, a new hierarchical approach is proposed, that allows to consider (if necessary) multiple forms of structural components to isolate structurally-homogeneous clusters of XML documents. At each level of the resulting hierarchy, clusters are divided by considering some type of structural components (unaddressed at the preceding levels), that still differentiate the structures of the XML documents. Each cluster in the hierarchy is summarized through a novel technique, that provides a clear and differentiated understanding of its structural properties. A comparative evaluation over both real and synthetic XML data proves that the devised approach outperforms established competitors in effectiveness and scalability. Cluster summarization is also shown to be very representative. (C) 2012 Elsevier B.V. All rights reserved.
