A fundamental problem in large-scale, decentralized distributed systems is the efficient discovery of information. This paper presents Squid, a peer-to-peer information discovery system that supports flexible searches and provides search guarantees. The fundamental concept underlying the approach is the definition of multi-dimensional information spaces and the maintenance of locality in these spaces. The key innovation is a dimensionality-reducing indexing scheme that effectively maps the multi-dimensional information space to physical peers while preserving lexical locality. Squid supports complex queries containing partial keywords, wildcards, and ranges. Analytical and simulation results show that Squid is scalable and efficient. (c) 2008 Elsevier Inc. All rights reserved.
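To make the idea of a dimensionality-reducing index concrete, here is a minimal sketch that maps a pair of keywords to a single integer. The abstract does not name the mapping Squid uses; the Z-order (Morton) curve below is only an illustrative stand-in, and `keyword_to_coord`, `morton_index`, and `index_for` are hypothetical names chosen for this example.

```python
def keyword_to_coord(keyword: str) -> int:
    """Map a keyword to a coordinate in [0, 675] from its first two letters.

    Illustrative assumption: lexically close keywords (same leading letters)
    get numerically close coordinates.
    """
    letters = [c for c in keyword.lower() if "a" <= c <= "z"][:2]
    letters += ["a"] * (2 - len(letters))  # pad short keywords with 'a'
    return (ord(letters[0]) - ord("a")) * 26 + (ord(letters[1]) - ord("a"))


def morton_index(x: int, y: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order index."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        index |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return index


def index_for(keyword_a: str, keyword_b: str) -> int:
    """One-dimensional index for a two-keyword description (a 2-D point)."""
    return morton_index(keyword_to_coord(keyword_a), keyword_to_coord(keyword_b))
```

In a sketch like this, the resulting index could serve as the key under which a peer on a one-dimensional overlay stores the item, so that a range or prefix query touches a limited set of index intervals rather than the whole network.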
A layered model of structured overlays has been proposed, and it has enabled the development of a routing layer independently of higher-level services such as DHT and multicast. The routing layer has to include a part other than the routing algorithm that is nevertheless essential for routing: the routing process, which is common to various routing algorithms and can be decoupled from them. We demonstrated this decomposition by implementing an overlay construction toolkit, Overlay Weaver. It facilitates the implementation of routing algorithms, and we could implement multiple well-known algorithms in just hundreds of lines of code with the toolkit. The decomposition also enables multiple implementations of the common routing process. The two implementations that the toolkit provides perform iterative and recursive routing, respectively. Additionally, to our knowledge, the toolkit is the first feasibility proof of the layered model, as it supports multiple algorithms and the higher-level services. Such a modular design contributes to our goal, which is to facilitate the rapid development of realistic routing algorithms and their applications. We demonstrate that Overlay Weaver supports this goal by conducting large-scale tests and comparisons of algorithms on a single computer. The resulting algorithm implementations work on a real TCP/IP network as they are. (C) 2007 Elsevier B.V. All rights reserved.
Load balance is a critical issue in distributed hash tables (DHTs), and previous work shows that there is an O(log n) load imbalance in Chord. This paper analyzes the load distribution of Chord, Pastry, and the virtual servers (VS) balancing scheme, and deduces closed-form expressions for the probability density function (PDF) and cumulative distribution function (CDF) of the load in these DHTs. The analysis and simulation show that the load in all of these DHTs follows a gamma distribution with parameters of similar form.
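The imbalance mentioned above can be observed with a small simulation. The sketch below is not the paper's analysis; it assumes nodes pick uniformly random identifiers on a ring and that a node's load is proportional to the arc of the identifier space it owns. The names `arc_loads` and `imbalance` are hypothetical.

```python
import random


def arc_loads(num_nodes: int, ring_bits: int = 32, seed: int = 0) -> list[int]:
    """Size of the identifier arc owned by each node on a Chord-like ring.

    Assumption: a node owns the keys between its predecessor (exclusive)
    and itself (inclusive), and node IDs are uniformly random.
    """
    rng = random.Random(seed)
    ring = 1 << ring_bits
    ids = sorted(rng.sample(range(ring), num_nodes))
    # ids[i - 1] wraps to the last node for i == 0; a single node owns the whole ring.
    return [((ids[i] - ids[i - 1]) % ring) or ring for i in range(num_nodes)]


def imbalance(num_nodes: int, ring_bits: int = 32, seed: int = 0) -> float:
    """Ratio of the largest arc to the mean arc."""
    loads = arc_loads(num_nodes, ring_bits, seed)
    return max(loads) / (sum(loads) / len(loads))


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, round(imbalance(n), 2))
```

Running it for increasing `n` gives a rough feel for how the largest share grows relative to the mean; the exact values depend on the seed.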
Many peer-to-peer overlay operations are inherently parallel and this parallelism can be exploited by using multi-destination multicast routing, resulting in significant message reduction in the underlying network. We propose criteria for assessing when multicast routing can effectively be used, and compare multi-destination multicast and host group multicast using these criteria. We show that the assumptions underlying the Chuang-Sirbu multicast scaling law are valid in large-scale peer-to-peer overlays, and thus Chuang-Sirbu is suitable for estimating the message reduction when replacing unicast overlay messages with multicast messages. Using simulation, we evaluate message savings in two overlay algorithms when multi-destination multicast routing is used in place of unicast messages. We further describe parallelism in a range of overlay algorithms including multi-hop, variable-hop, load-balancing, random walk, and measurement overlay. (c) 2007 Elsevier B.V. All rights reserved.
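As a rough illustration of how the Chuang-Sirbu law could be used to estimate message reduction, the sketch below assumes the commonly quoted form of the law, in which a multicast tree to m receivers uses about m^0.8 times the links of an average unicast path. The exponent default and the function names are assumptions for this example, not values taken from the abstract.

```python
def multicast_links(unicast_path_len: float, group_size: int,
                    exponent: float = 0.8) -> float:
    """Estimated number of links in a multicast tree reaching `group_size` receivers."""
    return unicast_path_len * group_size ** exponent


def message_reduction(group_size: int, exponent: float = 0.8) -> float:
    """Estimated fraction of link traversals saved versus `group_size` unicasts.

    Unicast cost is group_size * L_u, multicast cost is L_u * group_size**exponent,
    so the saving is 1 - group_size**(exponent - 1).
    """
    return 1.0 - group_size ** (exponent - 1.0)
```

Under this assumption, a single-destination message saves nothing, and the saving grows slowly with the number of destinations that share one multicast message.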
Users in a peer-to-peer (P2P) system join and leave the network continuously, so understanding the resilience properties of P2P systems under a high rate of node churn is important. In this work, we first find that a lifetime-based dynamic churn model for a P2P network that has reached stationarity is reducible to a uniform node failure model. This is a simple yet powerful result that bridges the gap between complex dynamic churn models and the more tractable uniform failure model. We further develop the reachable component method and derive the routing performance of a wide range of structured P2P systems under varying rates of churn. We find that de Bruijn graph based routing systems offer excellent resilience under extremely high rates of node turnover, followed by a group of routing systems that includes CAN, Kademlia, Chord, and randomized-Chord. We show that our theoretical predictions agree well with large-scale simulation results. We finish by suggesting methods to further improve the routing performance of dynamic P2P systems in the presence of churn and failures. (C) 2008 Elsevier B.V. All rights reserved.
Application-layer peer-to-peer (P2P) network technology is widely regarded as the most important development for next-generation Internet infrastructure. For these systems to be effective, load balancing among the peers is critical. Early structured P2P systems rely on the randomness of object IDs generated with a dynamic hash function to avoid the load imbalance issue. This is known to result in an imbalance factor of O(log N) in the number of items stored at a node. This paper makes two contributions. First, based on previous work, we propose a simple yet extremely effective extension. We demonstrate the superior performance of our proposal and also explore other issues vital to the performance of the virtual server framework, such as the effect of the number of directories employed in the system and the performance ramifications of user registration strategies. Second, and more significantly, we systematically characterize the effect of heterogeneity on the performance of load balancing algorithms, and the conditions under which heterogeneity may be easy or hard to deal with. We show how previous results may be valid only in the simpler settings.
The past few years have seen tremendous advances in distributed storage infrastructure. Unstructured and structured overlay networks have been successfully used in a variety of applications, ranging from file-sharing to scientific data repositories. While unstructured networks benefit from low maintenance overhead, the associated search costs are high. On the other hand, structured networks have higher maintenance overheads, but facilitate bounded-time search of installed keywords. When dealing with typical data sets, though, it is infeasible to install every possible search term as a keyword into the structured overlay. State-of-the-art semantic indexing techniques have been successfully integrated into peer-to-peer (P2P) systems using semantic overlays. However, existing approaches are based on the premise that the fundamental ingredient of semantic indexing, a semantic basis for the underlying data, is globally available, which is not likely to be the case in practice. Therefore, development of techniques to efficiently compute basis vectors for data distributed across peers is important for large-scale deployment of semantic indexing in P2P systems. In this paper, we present a novel structured overlay that integrates aspects of semantic indexing using non-orthogonal matrix decompositions with the hash structure of the overlay. We adopt PROXIMUS, a recursive decomposition method for computing concise representations for binary data sets, to locally identify latent patterns in data distributed across peers. To enable efficient consolidation of patterns, we rely on distributed hash tables (DHTs), commonly used in various applications in P2P networks. The discrete nature of non-orthogonal matrix decomposition is well suited to the binary key structure of DHTs, resulting in an indexing method, PMINER, that enables the network to deliver efficient and accurate responses to semantic queries. We present the algorithmic underpinnings of PMINER and demonstrate its excelle
Distributed hash table (DHT) networks based on consistent hashing functions have an inherent problem of uneven load distribution. The objective of DHT load balancing is to balance the workload of the network nodes in proportion to their capacity so as to eliminate traffic bottlenecks. This is challenging because of the dynamism, proximity, and heterogeneity of DHT networks and their time-varying load characteristics. In this paper, we present a hash-based proximity clustering approach for load balancing in heterogeneous DHTs. In the approach, DHT nodes are classified as regular nodes and supernodes according to their computing and networking capacities. Regular nodes are grouped and associated with supernodes via consistent hashing of their physical proximity information on the Internet. The supernodes form a self-organized and churn-resilient auxiliary network for load balancing. The hierarchical structure facilitates the design and implementation of a locality-aware randomized (LAR) load balancing algorithm. The algorithm introduces a factor of randomness into the load balancing process within a neighborhood range so as to deal with both proximity and dynamism. Simulation results show the superiority of the clustering approach with LAR in comparison with a number of other DHT load balancing algorithms. The approach performs no worse than existing proximity-aware algorithms and exhibits strong resilience to the effect of churn. It also greatly reduces the overhead of resilient randomized load balancing due to the use of proximity information. (C) 2007 Elsevier Inc. All rights reserved.
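The stated objective, load in proportion to capacity, can be written down directly. The sketch below is not the LAR algorithm; it only computes each node's capacity-proportional target and flags nodes above it. The names `target_loads` and `overloaded` are hypothetical.

```python
def target_loads(capacities: list[float], total_load: float) -> list[float]:
    """Share of the total load each node should carry, proportional to capacity."""
    total_capacity = sum(capacities)
    return [total_load * c / total_capacity for c in capacities]


def overloaded(loads: list[float], capacities: list[float]) -> list[int]:
    """Indices of nodes whose current load exceeds their proportional target."""
    targets = target_loads(capacities, sum(loads))
    return [i for i, (load, target) in enumerate(zip(loads, targets)) if load > target]
```

A balancing scheme would then move load from the flagged nodes to nodes below their targets; how candidates are chosen (for example, by proximity) is where the approaches in the literature differ.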
In order to provide high data availability in peer-to-peer (P2P) DHTs, proper data redundancy schemes are required. This paper compares two popular schemes: replication and erasure coding. Unlike previous comparisons, we take user download behavior into account. Furthermore, we propose a hybrid redundancy scheme, which shares user-downloaded files for subsequent accesses and utilizes erasure coding to adjust file availability. Comparison experiments on the three schemes show that replication saves more bandwidth than erasure coding, although it requires more storage space, when average node availability is higher than 47%; moreover, our hybrid scheme saves more maintenance bandwidth with an acceptable redundancy factor.
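For readers comparing the two schemes, the standard availability formulas are sketched below under the simplifying assumption that nodes are available independently with the same probability `p`. This is textbook arithmetic, not the paper's experiment, and it does not model download behavior or maintenance bandwidth; the function names are hypothetical.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """File is available if at least one of `replicas` full copies is online."""
    return 1.0 - (1.0 - p) ** replicas


def erasure_availability(p: float, m: int, n: int) -> float:
    """File is available if at least m of the n coded fragments are online."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(m, n + 1))


def redundancy_factor(m: int, n: int) -> float:
    """Storage overhead of an (m, n) erasure code; replication with r copies is r."""
    return n / m
```

For instance, `replication_availability(0.5, 2)` and `erasure_availability(0.5, 1, 2)` both evaluate to 0.75, since a 1-of-2 code is equivalent to keeping two replicas.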