ISBN: (print) 9783319080109; 9783319080093
Link-based similarity measures play a significant role in many graph-based applications. Consequently, measuring node similarity in a graph is a fundamental problem in graph data mining. Personalized PageRank (PPR) and SimRank (SR) have emerged as the most popular and influential link-based similarity measures. In practice, PPR and SR scores are obtained by iterative computation, and the overhead grows heavily as the number of iterations increases. Ideally, similarity would be computed with the minimum number of iterations sufficient to guarantee a desired accuracy. However, the existing upper bounds are too coarse to be useful in general. Therefore, this paper focuses on designing accurate and tight upper bounds for PPR and SR. Our upper bounds are based on the following intuition: "the smaller the difference between the results of two consecutive iterations, the smaller the difference between the iterative similarity scores and the theoretical ones". Furthermore, we demonstrate the effectiveness of our upper bounds in top-k similar-node queries, where they speed up query processing. Finally, we run a comprehensive set of experiments on real data sets to verify the effectiveness and efficiency of our upper bounds.
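The intuition quoted above can be illustrated with a minimal sketch. It is not the bound proposed in the paper, which the abstract does not state. It uses the standard contraction-mapping estimate for PPR power iteration instead: with restart probability c, the L1 distance from the current iterate to the exact scores is at most (1 - c) / c times the L1 difference between the last two iterates. The function name, the `out_links` adjacency format, and the default parameters are assumptions made for illustration.

```python
def personalized_pagerank(out_links, source, restart=0.15, tol=1e-8, max_iter=1000):
    """Power iteration for PPR that stops on an a posteriori error bound.

    out_links: dict mapping each node to a list of its successors
               (every successor must also be a key of the dict).
    Returns (scores, iterations_used, error_bound).
    NOTE: the stopping rule is the textbook contraction bound, used here
    as a stand-in; it is not the tighter bound the paper designs.
    """
    nodes = list(out_links)
    scores = {v: 0.0 for v in nodes}
    scores[source] = 1.0
    decay = 1.0 - restart
    bound = float("inf")
    for iteration in range(1, max_iter + 1):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] = restart  # restart mass always returns to the source
        for u in nodes:
            successors = out_links[u]
            if not successors:
                continue  # dangling node: its mass is dropped in this sketch
            share = decay * scores[u] / len(successors)
            for v in successors:
                nxt[v] += share
        # L1 difference between two consecutive iterates
        diff = sum(abs(nxt[v] - scores[v]) for v in nodes)
        scores = nxt
        # contraction factor is (1 - restart), so error <= decay / restart * diff
        bound = decay / restart * diff
        if bound <= tol:
            return scores, iteration, bound
    return scores, max_iter, bound


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
    scores, iterations, bound = personalized_pagerank(graph, "a")
    print(iterations, bound, scores)
```

The loop stops as soon as the bound, rather than the raw difference, falls below the tolerance, so the returned scores carry an explicit accuracy guarantee under the stated assumptions.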
A top-k aggregate query, which is a powerful technique when dealing with large quantities of data, ranks groups of tuples by their aggregate values and returns the k groups with the highest aggregate values. However, compar...
Online customer reviews are considered a significant information resource that is useful to both potential customers and product manufacturers. As a result, mining customer reviews automatically and providing users with an opinion summary is one of the most challenging tasks. Product features and opinion words play the most important roles in mining customers' opinions. In this paper, we focus on opinion word mining. We propose an approach to opinion word identification based on the association rule mining algorithm. The method makes full use of the syntactic co-occurrence characteristics between product features and opinion words. First, product features are identified by a two-stage filtering scheme; second, opinion words are extracted through association rule mining. The final experimental results show that the proposed method not only obtains the product features related to domain characteristics but also identifies opinion words effectively. Moreover, our approach achieves much higher precision and recall than Hu's work.
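As a rough illustration of the second step, the sketch below scores candidate rules of the form "feature implies word" by support and confidence over tokenized review sentences. The abstract does not give the actual rule format, thresholds, or the two-stage feature filter, so the function name `opinion_rules`, the sentence-level co-occurrence window, and both thresholds are assumptions; a real system would also restrict candidates by part of speech.

```python
from collections import Counter


def opinion_rules(sentences, features, min_support=0.2, min_confidence=0.5):
    """Return {(feature, word): (support, confidence)} for rules feature -> word.

    sentences: list of token lists, one per review sentence.
    features:  set of already identified product-feature terms (assumed given).
    support    = sentences containing both feature and word / all sentences
    confidence = sentences containing both / sentences containing the feature
    """
    total = len(sentences)
    feature_count = Counter()
    pair_count = Counter()
    for tokens in sentences:
        present = set(tokens)
        for feature in features & present:
            feature_count[feature] += 1
            # every non-feature token in the sentence is a candidate opinion word
            for word in present - features:
                pair_count[(feature, word)] += 1
    rules = {}
    for (feature, word), count in pair_count.items():
        support = count / total
        confidence = count / feature_count[feature]
        if support >= min_support and confidence >= min_confidence:
            rules[(feature, word)] = (support, confidence)
    return rules


if __name__ == "__main__":
    reviews = [
        ["battery", "great"],
        ["battery", "great"],
        ["battery", "poor"],
        ["screen", "sharp"],
    ]
    print(opinion_rules(reviews, {"battery", "screen"}, 0.5, 0.6))
```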
In this paper, a rapid resynchronization method using intent logs is proposed for replicated in-memory databases supporting mobile communication applications. The intent logs record both the identifiers of unsynchronized segments and the identifiers of the slaves that have missed the updates in those segments. On receiving a resynchronization request from a slave, the master scans the intent logs to find the unsynchronized segments for that slave and then sends those segments to the slave directly from its memory. The performance results show that the intent-log method reduces resynchronization time compared with methods using transaction logs.
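The bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `Master`, the dictionary-based segment store, and the `reachable` argument that says which slaves received an update are all hypothetical, and the abstract does not specify the real log layout or network protocol.

```python
class Master:
    """Toy master node that keeps an intent log of missed segment updates."""

    def __init__(self, slave_ids):
        self.slaves = set(slave_ids)
        self.segments = {}    # segment id -> current in-memory content
        self.intent_log = {}  # segment id -> set of slaves that missed an update

    def update(self, segment_id, data, reachable):
        """Apply an update and log every slave that could not receive it."""
        self.segments[segment_id] = data
        missed = self.slaves - set(reachable)
        if missed:
            self.intent_log.setdefault(segment_id, set()).update(missed)

    def resynchronize(self, slave_id):
        """Scan the intent log and return the segments this slave is missing.

        Segment contents come straight from memory; no transaction log is
        replayed. Entries are cleared once no slave is waiting on them.
        """
        payload = {}
        for segment_id in list(self.intent_log):
            waiting = self.intent_log[segment_id]
            if slave_id in waiting:
                payload[segment_id] = self.segments[segment_id]
                waiting.discard(slave_id)
                if not waiting:
                    del self.intent_log[segment_id]
        return payload


if __name__ == "__main__":
    master = Master(["s1", "s2"])
    master.update(1, "x", reachable=["s1", "s2"])
    master.update(2, "y", reachable=["s1"])
    master.update(2, "z", reachable=["s1"])
    print(master.resynchronize("s2"))
```

Because the log stores only identifiers, a segment updated several times while a slave is away is sent once, in its latest state, which is one plausible reason such a scheme can beat replaying a transaction log.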
Reverse skyline queries have proved very useful in business location selection, environmental monitoring, and other applications. In this paper, we consider reverse skyline query processing on data streams, which deliver continuous, high-speed data elements. Specifically, we consider the latest objects in a sliding window. The challenge is that a multidimensional index (for example, an R-tree) is difficult to maintain over a dynamic dataset. To address this challenge, we propose an algorithm that uses a DC-Tree as its index, together with effective pruning methods that reduce both the search space of query processing and the cost of index maintenance. Extensive experiments show that our algorithms are efficient and effective for online reverse skyline queries.
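For readers unfamiliar with the query itself, the sketch below evaluates a reverse skyline by brute force over a count-based sliding window. It uses the usual definition (an object p is in the reverse skyline of q when no other object in the window is at least as close to p as q in every dimension and strictly closer in one). It deliberately omits the DC-Tree index and the pruning rules that the paper contributes; the function names and the `deque`-based window are assumptions for illustration.

```python
from collections import deque


def dynamically_dominates(r, q, p):
    """True if r dominates q with respect to p (coordinate-wise distances to p)."""
    pairs = [(abs(ri - pi), abs(qi - pi)) for ri, qi, pi in zip(r, q, p)]
    return all(dr <= dq for dr, dq in pairs) and any(dr < dq for dr, dq in pairs)


def reverse_skyline(window, q):
    """Brute-force reverse skyline of query point q over the current window.

    Quadratic in the window size; an index with pruning would avoid
    most of these comparisons.
    """
    points = list(window)
    result = []
    for i, p in enumerate(points):
        dominated = any(
            dynamically_dominates(r, q, p)
            for j, r in enumerate(points)
            if j != i
        )
        if not dominated:
            result.append(p)
    return result


if __name__ == "__main__":
    window = deque(maxlen=3)  # keeps only the 3 latest stream objects
    for obj in [(1, 1), (2, 2), (5, 5), (6, 6)]:
        window.append(obj)
    print(reverse_skyline(window, (4, 4)))
```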
Recently, more and more authors have been encouraged to collaborate because collaboration often produces good results. However, the author collaboration network contains experts in various research directions within various f...
The deadline effect and the threshold effect are common in product development and group buying. We hypothesize that these two effects also exist in crowdfunding and affect subsequent crowdfunding performance. We conduct an empirical study using a dataset from *** to examine these two effects and their interaction on subsequent crowdfunding performance. We find that the threshold effect exists in crowdfunding and has a positive influence on subsequent contribution behaviour. The deadline effect can also stimulate crowdfunding behaviour through limited fundraising time. Furthermore, we find that the threshold effect and the deadline effect are complements that increase the amount pledged. (C) 2020 The Authors. Published by Elsevier B.V.
A survey of the information needs of elderly people can identify the information required to address the needs of the aged in a community. Based on data collected from 600 elderly people through a questionnaire-based field investigation in a rural community in central China, the results show that the majority of aged people prefer audio and/or visual information products, especially audio products. Most of the aged people stated that they needed health and medical non-educational audio information products. The survey may lead to improved and expanded information services for respondents who lack such services, including public broadcasting services, an extended audiovisual collection, audiovisual loans, religious audiovisual materials, and other services that provide the information they need. In summary, this paper assembles views on the help that elderly people currently need from both practitioners and researchers in the elderly services domain.
This paper presents an overview of the INEX 2011 Data-Centric Track. With the ad hoc search task running in its second year, we introduced a new task, the faceted search task, whose goal is to provide the infrastructure to...