ISBN: 9781538639146 (print)
Elastic distributed storage systems have been increasingly studied in recent years because power consumption has become a major problem in data centers. Much progress has been made in improving the agility of resizing small- and large-scale distributed storage systems. However, most of these studies focus on metadata-based distributed storage systems. Emerging distributed storage systems based on consistent hashing, on the other hand, are considered to scale better and are highly attractive. We identify challenges in achieving elasticity in consistent-hashing-based distributed storage. These challenges cannot be easily solved by the techniques used in current studies. In this paper, we propose an elastic consistent-hashing-based distributed storage system that solves two problems. First, to allow a distributed storage system to resize quickly, we modify the data placement algorithm using a primary server design and achieve an equal-work data layout. Second, we propose a selective data re-integration technique to reduce the performance impact of resizing a cluster. Our experimental and trace analysis results confirm that the proposed elastic consistent hashing works effectively and allows significantly better elasticity.
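For readers unfamiliar with the baseline that this abstract builds on, the following is a minimal sketch of plain consistent hashing with virtual nodes. It does not implement the paper's primary-server placement or selective re-integration. The class name `HashRing`, the use of MD5, and the `vnodes` parameter are assumptions made only for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (assumed choice: first 32 bits of MD5)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:8], 16)


class HashRing:
    """Plain consistent hashing ring with virtual nodes (illustrative only)."""

    def __init__(self, servers, vnodes=64):
        # Each server contributes `vnodes` points to the ring.
        self._points = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._keys = [position for position, _ in self._points]

    def lookup(self, key, replicas=1):
        """Return the first `replicas` distinct servers clockwise from the key."""
        start = bisect.bisect_right(self._keys, _hash(key))
        found = []
        for offset in range(len(self._points)):
            server = self._points[(start + offset) % len(self._points)][1]
            if server not in found:
                found.append(server)
                if len(found) == replicas:
                    break
        return found


# Shrinking the cluster from four servers to three: only objects whose
# first replica lived on the removed server change their first replica.
objects = [f"obj-{i}" for i in range(1000)]
full = HashRing(["s1", "s2", "s3", "s4"])
small = HashRing(["s1", "s2", "s3"])
moved = [o for o in objects if full.lookup(o)[0] != small.lookup(o)[0]]
print(len(moved), "of", len(objects), "objects change their first replica")
```

In this plain scheme every server holds a similar share of the data, so none can be switched off without moving data first. That is the kind of obstacle to fast resizing the abstract refers to.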
ISBN: 9780769546766 (print)
Modern computer systems are becoming increasingly heterogeneous, comprising multi-core CPUs, GPUs, and other accelerators. Current programming approaches for such systems usually require the application developer to combine several programming models (e.g., MPI with OpenCL or CUDA) in order to exploit the full compute capability of a system. In this paper, we present dOpenCL (distributed OpenCL), a uniform approach to programming distributed heterogeneous systems with accelerators. dOpenCL extends the OpenCL standard such that arbitrary computing devices installed on any node of a distributed system can be used together within a single application. dOpenCL allows moving data and program code to these devices in a transparent, portable manner. Since dOpenCL is designed as a fully fledged implementation of the OpenCL API, it allows running existing OpenCL applications in a heterogeneous distributed environment without any modifications. We describe in detail the mechanisms that are required to implement OpenCL for distributed systems, including a device management mechanism for running multiple applications concurrently. Using three application studies, we compare the performance of dOpenCL with MPI+OpenCL and a standard OpenCL implementation.
ISBN: 9781424418893 (print)
Association rule mining is one of the most important techniques in data mining. It extracts significant patterns from transaction databases and generates rules used in many decision support applications. Many organizations, such as industrial, commercial, or even scientific sites, may produce large numbers of transactions and attributes. Mining effective rules from such large volumes of data requires much time and computing resources. In this paper, we propose a parallel FI-growth association rule mining algorithm for rapid extraction of frequent itemsets from large dense databases. We also show that this algorithm can be parallelized efficiently in a cluster computing environment. The preliminary experiments provide quite promising results, with nearly ideal scaling on small clusters and about half of ideal (15-fold speedup) on a thirty-two processor cluster.
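The "about half of ideal" figure follows from the usual definition of parallel efficiency, speedup divided by processor count. The short sketch below only checks that arithmetic; the function name `parallel_efficiency` is an assumption for illustration and is not taken from the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup / processors


# Figures quoted in the abstract: 15-fold speedup on 32 processors.
efficiency = parallel_efficiency(15, 32)
print(efficiency)  # 0.46875, i.e. roughly half of ideal
```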
ISBN: 9781509041527 (print)
More and more massively parallel codes running on several hundreds of thousands of cores are entering the computational science and engineering domain, allowing high-fidelity computations with up to trillions of unknowns for very detailed analyses of the underlying problems. During such runs, typically gigabytes of data are produced, hindering both efficient storage and (interactive) data exploration. Here, advanced approaches based on inherently distributed data formats such as HDF5 become necessary in order to avoid long latencies when storing the data and to support fast (random) access when retrieving the data for visual processing. By avoiding file locking and using collective buffering, we achieved write bandwidths to a single file close to the theoretical peak on a modern supercomputing cluster. The structure of our output file supports very fast interactive visualisation and introduces additional steering functionality.
ISBN: 9781479986484 (print)
Runtime verification is a lightweight automated formal method for specification-based runtime monitoring as well as testing of large real-world systems. While numerous techniques exist for runtime verification of sequential programs, there has been very little work on specification-based monitoring of distributed systems. In this paper, we propose the first sound and complete method for runtime verification of asynchronous distributed programs for the 3-valued semantics of LTL specifications defined over the global state of the program. Our technique for evaluating LTL properties is inspired by distributed computation slicing, an approach for abstracting distributed computations with respect to a given predicate. Our monitoring technique is fully decentralized in that each process in the distributed program under inspection maintains a replica of the monitor automaton. Each monitor may maintain a set of possible verification verdicts based on the existence of concurrent events. Our experiments on runtime monitoring of a simulated swarm of flying drones show that, due to the design of our algorithm, monitoring overhead grows only linearly in the number of processes and events that need to be monitored.
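To make the 3-valued semantics concrete, here is a minimal sketch of how a monitor assigns verdicts to a finite prefix for the two simplest LTL properties, "always p" and "eventually p". It is a centralized, sequential illustration only and is not the decentralized algorithm of the paper. The names `Verdict`, `monitor_always`, and `monitor_eventually` are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    TRUE = "true"          # every extension of the prefix satisfies the property
    FALSE = "false"        # no extension of the prefix satisfies the property
    INCONCLUSIVE = "?"     # the prefix alone does not decide the property


def monitor_always(prefix, p):
    """G p: permanently violated once p fails; otherwise still undecided."""
    for state in prefix:
        if not p(state):
            return Verdict.FALSE
    return Verdict.INCONCLUSIVE


def monitor_eventually(prefix, p):
    """F p: permanently satisfied once p holds; otherwise still undecided."""
    for state in prefix:
        if p(state):
            return Verdict.TRUE
    return Verdict.INCONCLUSIVE


# Hypothetical trace: altitude readings of a single drone.
altitudes = [12, 15, 9, 14]


def above_10(altitude):
    return altitude > 10


print(monitor_always(altitudes[:2], above_10))  # Verdict.INCONCLUSIVE
print(monitor_always(altitudes, above_10))      # Verdict.FALSE
```

In a distributed setting there is no single trace to scan: concurrent events admit several possible global states, which is why the abstract describes each monitor as keeping a set of possible verdicts.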
Authors: Wang, Yan; Wang, Xin
Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China
ISBN: 9780769546766 (print)
Distributed storage systems (DSS) play an important role in data storage applications, since they provide high reliability for huge data storage requirements. As node failures are frequent in a large distributed storage system, the performance of node failure repair has attracted the interest of many researchers. In this paper, we propose a distributed storage code that minimizes the coding complexity of the repair process at the cost of larger redundancy. Our code construction is based on regular graphs and exploits simple look-up repair. We analyze the performance of the proposed code and compare it with existing distributed storage codes. Analytical results show that the proposed code outperforms the others in terms of repair complexity and disk I/O overhead.
ISBN: 9780769552071 (print)
Linearizability is a well-known consistency condition for shared objects in concurrent systems. We focus on the problem of implementing linearizable objects of arbitrary data types in message-passing systems with bounded, but uncertain, message delay and bounded, but non-zero, clock skew. We present an algorithm that exploits axiomatic properties of different operations to reduce the running time of each operation below that obtainable with previously known algorithms. We also prove lower bounds on the time complexity of various kinds of operations, specified by the axioms they satisfy, resulting in reduced gaps in some cases and tight bounds in others.
ISBN: 9781665481069 (print)
Many data-intensive applications, such as distributed deep learning and data analytics, require moving vast amounts of data between compute servers in a distributed system. To meet the demands of these applications, datacenters are adopting Remote Direct Memory Access (RDMA), which has higher bandwidth and lower latency than traditional kernel-based networking. To ensure high performance of RDMA networks, congestion control manages queue depth on switches, and it has historically focused on moderating queue depth to ensure that small flows complete quickly. Unfortunately, one side effect of many common decisions is that large flows are starved of bandwidth. This negatively impacts the flow completion time (FCT) of large, bandwidth-bound flows, which are integral to the performance of data-intensive applications. The FCT is particularly impacted at the tail, which is increasingly critical for predictable application performance. We identify the root causes of the poor performance for long flows and measure the impact. We then design mechanisms that improve long flow FCT without compromising small flow performance. Our evaluations show that these improvements reduce the 99.9% tail FCT of long flows by over 2x.
The problem of combined communication network design and file allocation for distributed databases is addressed. It consists of finding the allocation of database files over a set of computer sites and designing the communication network, i.e., designing the network topology and allocating the communication channel capacities. The objective is to minimize the total cost of storing the database files and of leasing the communication channels, subject to constraints on network reliability, file availability, and communication delay. The network topologies are restricted to be of maximal connectivity and minimal diameter. A heuristic algorithm to solve the problem is described and some results are presented.
ISBN: 9781538637906 (print)
With the explosive growth of energy consumption in current heterogeneous distributed systems, the energy consumption constraint has become one of the primary design issues. Minimizing the schedule length of parallel applications while satisfying the energy consumption constraint is one of the most important problems studied recently. Previous studies have proposed a preassignment approach that presupposes the minimum energy consumption assignment for unassigned tasks and solves the problem using the dynamic voltage and frequency scaling (DVFS) technique. However, preassigning unassigned tasks with the minimum energy consumption does not necessarily minimize the schedule length. In this study, we propose an efficient scheduling algorithm that uses a relative average assignment for tasks. The results of experiments on two real parallel applications validate that the proposed algorithm obtains a shorter schedule length than the state-of-the-art methods while satisfying the energy consumption constraint in various situations.