ISBN: 9783642008863 (print)
SimRank is a well-known algorithm for computing similarity from object-to-object relationships, but it suffers from high computational cost. In this paper, we observe that different object pairs converge at different rates when SimRank is used to compute object similarity: many similarity scores converge quickly, while others need more iterations. Based on this observation, we propose an adaptive method, Adaptive-SimRank, to speed up similarity calculation. With this method, the similarity of pairs that have already converged does not need to be recalculated. Experiments on web datasets and a synthetic dataset show that the new method reduces running time by nearly 35%.
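The idea of skipping converged pairs can be illustrated with a minimal sketch. It assumes the standard SimRank recurrence with decay factor `c`; the freezing rule (a pair is frozen once its nonzero score changes by less than `tol`, or when one node has no in-neighbors) and the names `adaptive_simrank`, `in_neighbors`, `tol`, and `max_iter` are illustrative assumptions, not the criterion or interface used in the paper.

```python
def adaptive_simrank(in_neighbors, c=0.8, max_iter=20, tol=1e-4):
    """Iterative SimRank that stops updating pairs judged to have converged.

    in_neighbors maps every node to the list of its in-neighbors
    (every node must appear as a key, even with an empty list).
    The freezing rule below is an assumption made for illustration.
    """
    nodes = list(in_neighbors)
    # s(a, a) = 1 by definition; all other pairs start at 0.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    # Only unordered pairs of distinct nodes are ever updated.
    active = {(nodes[i], nodes[j])
              for i in range(len(nodes))
              for j in range(i + 1, len(nodes))}

    for _ in range(max_iter):
        if not active:
            break
        new_sim = dict(sim)
        frozen = set()
        for a, b in active:
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # No in-neighbors: the score is exactly 0 in every iteration.
                score = 0.0
                frozen.add((a, b))
            else:
                total = sum(sim[(x, y)] for x in ia for y in ib)
                score = c * total / (len(ia) * len(ib))
                # Assumed rule: freeze nonzero scores that stopped moving.
                if score > 0.0 and abs(score - sim[(a, b)]) < tol:
                    frozen.add((a, b))
            new_sim[(a, b)] = new_sim[(b, a)] = score
        sim = new_sim
        active -= frozen  # frozen pairs are never recalculated
    return sim


# Hypothetical toy graph: u links to both a and b.
graph = {"u": [], "a": ["u"], "b": ["u"]}
scores = adaptive_simrank(graph)
print(scores[("a", "b")])
```

Freezing is a heuristic: a frozen score can still drift slightly if the scores it depends on keep changing, so `tol` trades accuracy for the number of pair updates saved.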
The paper describes the details of using J-SIM to simulate parallel recovery in a main memory database. In update-intensive main memory database systems, I/O is still the dominant performance bottleneck. A parallel recovery scheme for large-scale, update-intensive main memory database systems is proposed. Simulation offers a faster way to evaluate the new idea than implementing an actual system. J-SIM is an open source discrete time simulation software package. The simulation implementation using J-SIM is described in terms of resource modeling, transaction processing system modeling, and workload modeling. Finally, analysis of the simulation results verifies the effectiveness of the parallel recovery scheme and demonstrates the feasibility of applying J-SIM to main memory database system simulation.
As systems become more complex and workloads fluctuate more, it is very hard for a DBA to quickly analyze performance data and optimize the system; self-optimization is a promising technique. A data mi...
ISBN: 9781605586502 (print)
Database-as-a-Service (DAS) is an emerging database management paradigm in which a partition-based index is an effective way to query encrypted data. However, previous research either focuses on one-dimensional partitioning or ignores the distribution characteristics of multidimensional data, especially sparsity and locality. In this paper, we propose Cluster based Onion Partition (COP), which is designed to reduce both false positives and dead space at the same time. COP consists of two steps. First, it partitions the covered space level by level, like peeling an onion; second, at each level, a clustering algorithm based on local density is used to achieve a locally optimal secure partition. Extensive experiments on a real dataset and a synthetic dataset show that COP is a secure multidimensional partition with much less efficiency loss than previous top-down or bottom-up counterparts. Copyright 2009 ACM.
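To make the notion of false positives in a partition-based index concrete, the following minimal sketch shows the one-dimensional baseline only, not COP itself. The bucket boundaries, the sample values, and the function names `build_bucket_index` and `range_query` are hypothetical; encryption is omitted, and the plaintext values stand in for what the client would see after decrypting the returned tuples.

```python
import bisect


def build_bucket_index(values, boundaries):
    """Assign each value a bucket id; a server would store only these ids
    next to the ciphertext (encryption itself is omitted in this sketch)."""
    return [bisect.bisect_right(boundaries, v) for v in values]


def range_query(values, bucket_ids, lo, hi, boundaries):
    """Answer lo <= v <= hi using bucket ids, then filter on the client."""
    first = bisect.bisect_right(boundaries, lo)
    last = bisect.bisect_right(boundaries, hi)
    wanted = set(range(first, last + 1))
    # Server side: return every tuple whose bucket overlaps the range.
    candidates = [v for v, b in zip(values, bucket_ids) if b in wanted]
    # Client side: after decryption, discard values outside the range.
    hits = [v for v in candidates if lo <= v <= hi]
    false_positives = len(candidates) - len(hits)
    return hits, false_positives


# Hypothetical data: buckets are (-inf, 10), [10, 20), [20, 30), [30, +inf).
values = [1, 5, 12, 18, 25, 31]
boundaries = [10, 20, 30]
bucket_ids = build_bucket_index(values, boundaries)
hits, false_positives = range_query(values, bucket_ids, 15, 26, boundaries)
print(hits, false_positives)
```

Coarser buckets reveal less about the data but return more false positives, which is the efficiency loss that a partitioning scheme tries to keep small.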
ISBN: 9781424427659 (print)
Influence between objects needs to be assessed in many applications. Many measures have been proposed, but a domain-independent method is still lacking. In this paper, we give a probabilistic definition of influence based on the random walker model on graphs. Two approaches, a linear systems method and the Basic InfRank algorithm, are presented; they return equal results, but Basic InfRank is more efficient because it computes iteratively. Two variants for bipartite graphs and star graphs are discussed. Experiments show that the InfRank algorithms have good accuracy, a fast convergence rate, and high performance.
This paper addresses the problem of fault-tolerant many-to-one routing in static wireless networks with asymmetric links, which is important in both theoretical and practical aspects. The problem is to find a minimum ...
Existing research on extreme value queries in wireless sensor networks mainly focuses on finding the sensors with the highest metric. Yet in most actual scenarios, people care more about special network regions than det...
Along with a massive amount of information being placed online, it is a challenge to exploit the internal and external information of documents when assessing similarity between them. A variety of approaches have been...
In many real-world domains, a link graph is one of the most effective ways to model the relationships between objects. Measuring the similarity of objects in a link graph has been studied by many researchers, but an effective...
Nearly all text classification methods classify texts into predefined categories according to the terms that appear in the texts. State-of-the-art text classification methods prefer to simply take a word as a term since it perf...