Social media provides an environment for information exchange. Social media platforms rely principally on their users to create content, to annotate others' content, and to form online relationships; a user's activities in this environment reflect his or her opinions, interests, etc. We focus on analysing this social environment to detect user interests, which are key elements for improving adaptation. This choice is motivated by the lack of information in the user profile and by the inefficiency of information obtained from methods that analyse classic user behaviour (e.g. navigation, time spent on a web page). Faced with an incomplete user profile, the user's social network can thus serve as an important data source for detecting interests. The originality of our approach lies in a new interest-detection technique that analyses the accuracy of a user's tagging behaviour in order to identify the tags that really reflect the content of the resources. Such tags are comprehensible and avoid the ambiguity usually associated with social annotations. The approach combines tag, user and resource in a way that guarantees relevant interest detection. The proposed approach has been tested and evaluated on the Delicious social database. For the evaluation, we compare the results produced by our approach, which uses the tagging behaviour of the user's neighbours (the egocentric network and the communities), with the information already known about the user (his or her profile). A comparative evaluation against the classical tag-based method of interest detection shows that the proposed approach performs better.
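The core idea of the abstract (keep only tags whose accuracy is corroborated across taggers, then derive interests from the neighbours' use of those tags) can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the triples, the corroboration threshold and the helper names are all assumptions for the example.

```python
from collections import Counter

# Hypothetical (user, resource, tag) triples, Delicious-style.
annotations = [
    ("alice", "r1", "python"), ("alice", "r2", "python"),
    ("bob",   "r1", "python"), ("bob",   "r3", "music"),
    ("carol", "r1", "coding"), ("carol", "r2", "python"),
]

def accurate_tags(annotations, min_taggers=2):
    """Keep (resource, tag) pairs applied by several distinct users:
    a rough proxy for tags that really reflect the resource content."""
    taggers = {}
    for user, resource, tag in annotations:
        taggers.setdefault((resource, tag), set()).add(user)
    return {key for key, users in taggers.items() if len(users) >= min_taggers}

def user_interests(annotations, user, neighbours):
    """Rank tags used by the user's neighbours on accurately tagged resources."""
    accurate = accurate_tags(annotations)
    counts = Counter(
        tag for u, r, tag in annotations
        if u in neighbours and (r, tag) in accurate
    )
    return counts.most_common()

print(user_interests(annotations, "alice", {"bob", "carol"}))
```

Here "python" survives because two taggers independently applied it to the same resources, while the idiosyncratic tag "coding" is filtered out as unreliable.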
Semantic annotation of natural language texts labels the meaning of an annotated element in a specific context, and is thus an essential procedure for domain knowledge acquisition. An extensible and coherent annotation method is crucial for knowledge engineers to reduce the human effort needed to keep annotations consistent. This article proposes a comprehensive semantic annotation approach supported by a user-oriented markup language named UOML to enhance annotation efficiency, with the aim of building a high-quality knowledge base. UOML is operable by human annotators and convertible to formal knowledge representation languages. A pattern-based annotation conversion method named PAC is further proposed for knowledge exchange, utilizing automatic pattern learning. We designed and implemented a semantic annotation platform, Annotation Assistant, to test the effectiveness of the approach. By applying this platform for more than three years in a long-term international research project aimed at high-quality knowledge acquisition from a classical Chinese poetry corpus containing 52,621 Chinese characters, we effectively acquired 150,624 qualified annotations. Our test shows that the approach improved operational efficiency by 56.8% on average compared with text-based manual annotation. By using UOML, PAC achieved a conversion error ratio of 0.2% on average, significantly improving annotation consistency compared with baseline annotations. The results indicate the approach is feasible for practical use in knowledge acquisition and conversion.
Anonymization of graph-based data is a problem that has been widely studied in recent years, and several anonymization methods have been developed. Information loss measures have been proposed to evaluate the noise introduced into the anonymized data. Generic information loss measures ignore the intended use of the anonymized data; when data must be released to third parties and there is no control over what kinds of analyses users might perform, these measures are the standard choice. In this paper we study different generic information loss measures for graphs, comparing them to cluster-specific ones. We want to evaluate whether generic information loss measures are indicative of the usefulness of the data for subsequent data-mining processes.
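A generic information loss measure of the kind the abstract discusses compares a structural property of the graph before and after anonymization, independently of any downstream analysis. The sketch below uses the deviation between degree sequences; the graphs and the normalization are illustrative assumptions, not measures taken from the paper.

```python
from collections import Counter

def degree_sequence(edges, n):
    """Sorted degree sequence of an undirected graph on nodes 0..n-1."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sorted(deg[i] for i in range(n))

def degree_info_loss(original, anonymized, n):
    """Normalized L1 distance between sorted degree sequences:
    0 means the degree structure is fully preserved."""
    d1 = degree_sequence(original, n)
    d2 = degree_sequence(anonymized, n)
    return sum(abs(a - b) for a, b in zip(d1, d2)) / (2 * len(original))

original   = [(0, 1), (0, 2), (1, 2), (2, 3)]
anonymized = [(0, 1), (0, 2), (1, 3), (2, 3)]  # one edge rewired
print(degree_info_loss(original, anonymized, 4))  # prints 0.25
```

A cluster-specific measure would instead compare the clusterings obtained on the two graphs, which is exactly the contrast the paper evaluates.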
Information graphics (infographics) in popular media are highly structured knowledge representations that are generally designed to convey an intended message. This paper presents a novel methodology for retrieving infographics from a digital library that takes into account a graphic's structural and message content. The retrieval methodology can be summarized as follows: 1) hypothesize the requisite structural and message content from a natural language query; 2) measure the relevance of each candidate infographic to the structural and message content hypothesized from the user query; and 3) integrate these relevance measurements via a linear combination model to produce a ranked list of infographics in response to the user query. The methodology has been implemented and evaluated, and it significantly outperforms a baseline method that treats queries and graphics as bags of words. (C) 2015 Published by Elsevier B.V.
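Step 3 of the methodology, the linear combination of the two relevance measurements, can be sketched as below. The scores and the mixing weight are made up for illustration; the paper's actual model would fit or tune such weights.

```python
def rank_infographics(candidates, weight=0.6):
    """Rank candidates by a linear combination of two relevance scores.

    candidates: list of (graphic_id, structural_score, message_score).
    weight: assumed mixing coefficient for the structural component.
    """
    scored = [
        (gid, weight * s + (1 - weight) * m)
        for gid, s, m in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Illustrative relevance scores for three candidate infographics.
candidates = [("g1", 0.9, 0.2), ("g2", 0.5, 0.9), ("g3", 0.4, 0.4)]
print([gid for gid, _ in rank_infographics(candidates)])  # prints ['g2', 'g1', 'g3']
```

The design point is that neither score alone decides the ranking: "g1" wins on structure and "g2" on message content, and the combination arbitrates between them.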
Clustering XML documents by structure is the task of grouping them by common structural components. Hitherto, this has been accomplished by looking at the occurrence of one pre-established type of structural component in the structures of the XML documents. However, the a priori chosen structural components may not be the most appropriate for effective clustering. Moreover, it is likely that the resulting clusters exhibit a certain degree of inner structural inhomogeneity, because of uncaught differences in the structures of the XML documents due to further, neglected forms of structural components. To overcome these limitations, a new hierarchical approach is proposed that allows considering (if necessary) multiple forms of structural components to isolate structurally homogeneous clusters of XML documents. At each level of the resulting hierarchy, clusters are divided by considering some type of structural component (unaddressed at the preceding levels) that still differentiates the structures of the XML documents. Each cluster in the hierarchy is summarized through a novel technique that provides a clear and differentiated understanding of its structural properties. A comparative evaluation over both real and synthetic XML data proves that the devised approach outperforms established competitors in effectiveness and scalability. Cluster summarization is also shown to be very representative. (C) 2012 Elsevier B.V. All rights reserved.
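A toy version of one level of such a hierarchy groups XML documents by a single type of structural component, here their set of element names; a further level could then split each group by another component (e.g. parent-child edges). The documents and the choice of signature are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def element_name_signature(xml_text):
    """One type of structural component: the set of element names used."""
    root = ET.fromstring(xml_text)
    return frozenset(el.tag for el in root.iter())

def cluster_by_signature(docs):
    """Group documents whose signatures coincide (one hierarchy level)."""
    clusters = defaultdict(list)
    for doc_id, text in docs.items():
        clusters[element_name_signature(text)].append(doc_id)
    return {sig: sorted(ids) for sig, ids in clusters.items()}

docs = {
    "d1": "<book><title/><author/></book>",
    "d2": "<book><author/><title/></book>",
    "d3": "<cd><title/><artist/></cd>",
}
print(cluster_by_signature(docs))
```

Note that "d1" and "d2" land in the same cluster despite different child orderings, which is exactly the kind of inner inhomogeneity a deeper level (using, say, sibling order as its structural component) could still separate.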
XML has gained widespread acceptance as a premier format for publishing, sharing and manipulating data through the web. While the semi-structured nature of XML provides a high degree of syntactic flexibility, there are significant shortcomings when it comes to specifying the semantics of XML data. For the advancement of XML applications it is therefore a major challenge to discover natural classes of constraints that can be utilized effectively by XML data engineers. This endeavor is ambitious given the multitude of intractability results that have been established. We investigate a class of XML cardinality constraints that is precious in the sense that it keeps the right balance between expressiveness and efficiency of maintenance. In particular, we characterize the associated implication problem axiomatically and develop a low-degree polynomial-time algorithm that can be readily applied for deciding implication. Our class of constraints is chosen near-optimally, as already minor extensions of its expressiveness cause potential intractability. Finally, we transfer our findings to establish a precious class of soft cardinality constraints on XML data. Soft cardinality constraints need to be satisfied on average only, and thus permit violations in a controlled manner. Soft constraints are therefore able to tolerate exceptions that frequently occur in practice, yet can be reasoned about efficiently. (C) 2012 Elsevier B.V. All rights reserved.
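The hard versus soft distinction can be illustrated with a toy maximum-cardinality constraint: every order element may contain at most k item elements. The hard check requires the bound at every node, while the soft variant, in the spirit of "satisfied on average only", bounds the mean. The element names and checks below are illustrative, far simpler than the path-based constraints the paper studies.

```python
import xml.etree.ElementTree as ET

def satisfies_max_card(xml_text, parent_tag, child_tag, k):
    """Hard constraint: every parent has at most k matching children."""
    root = ET.fromstring(xml_text)
    parents = root.findall(f".//{parent_tag}")
    return all(len(p.findall(child_tag)) <= k for p in parents)

def satisfies_soft_max_card(xml_text, parent_tag, child_tag, k):
    """Soft variant: the *average* number of children must not exceed k."""
    root = ET.fromstring(xml_text)
    counts = [len(p.findall(child_tag)) for p in root.findall(f".//{parent_tag}")]
    return sum(counts) / len(counts) <= k if counts else True

doc = """<orders>
  <order><item/><item/></order>
  <order><item/></order>
</orders>"""

print(satisfies_max_card(doc, "order", "item", 2))  # prints True
```

With one exceptional order holding three items and another holding one, the hard constraint with k = 2 is violated but the soft one still holds, which is exactly the controlled tolerance for exceptions the abstract describes.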