We consider the problem of designing efficient parallel algorithms for summing and prefix summing. In this paper, we present optimal algorithms for summing on a latency-dependent distributed-memory model and show that any optimal summing algorithm must have an inherent structure. Moreover, we present optimal or near-optimal algorithms for prefix summing for both non-commutative and commutative binary operators. Furthermore, we show that the optimal algorithms for prefix summing for these two types of operators are not equivalent.
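To make the distinction between commutative and non-commutative operators concrete, here is a minimal sketch of a textbook doubling-step inclusive scan, simulated sequentially. It is not the paper's algorithm and does not model latency; the name `inclusive_scan` is hypothetical. It only shows that a prefix computation can keep operand order intact, so it needs associativity but not commutativity.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def inclusive_scan(values: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Doubling-step inclusive scan (illustrative, simulated sequentially).

    Each round combines element i with the element `step` positions to its
    left, always as op(left, right), so a non-commutative but associative
    operator such as string concatenation still gives the right prefixes.
    """
    result = list(values)
    step = 1
    while step < len(result):
        prev = list(result)  # snapshot: every "processor" reads the old values
        for i in range(step, len(result)):
            result[i] = op(prev[i - step], prev[i])
        step *= 2
    return result


# Commutative operator: integer addition.
sums = inclusive_scan([1, 2, 3, 4], lambda a, b: a + b)

# Non-commutative operator: string concatenation.
prefixes = inclusive_scan(["a", "b", "c"], lambda a, b: a + b)
```

With a commutative operator, an algorithm is free to combine partial results in any order, which is one intuition for why the two cases can have different optimal algorithms.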
ISBN: 354067442X (print)
The paper presents a method to integrate parallelism into the DIPLIB sequential image processing library. The library contains several framework functions for different types of operations. We parallelize the filter framework function, which contains the neighborhood image processing operators. We validate our method by testing it with the geometric mean filter. Experiments on a cluster of workstations show linear speedup.
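As an illustration of why a neighborhood operator decomposes naturally, the sketch below applies a geometric mean filter to independent row strips. It does not use DIPLIB's API; every function name here is hypothetical, and the clamped border handling, the 3x3 default window, and the requirement of strictly positive pixel values are all assumptions. The thread pool only demonstrates the decomposition and would not be expected to give a speedup for pure-Python work in CPython.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def geometric_mean_at(image, r, c, radius=1):
    """Geometric mean of the (2*radius+1)^2 window centred on (r, c).

    Border pixels are handled by clamping coordinates (an assumption).
    Pixel values must be strictly positive because logarithms are used.
    """
    rows, cols = len(image), len(image[0])
    log_sum = 0.0
    count = 0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            rr = min(max(r + dr, 0), rows - 1)
            cc = min(max(c + dc, 0), cols - 1)
            log_sum += math.log(image[rr][cc])
            count += 1
    return math.exp(log_sum / count)


def filter_strip(image, row_start, row_end, radius=1):
    """Filter rows [row_start, row_end); reads neighbours from the full image."""
    cols = len(image[0])
    return [
        [geometric_mean_at(image, r, c, radius) for c in range(cols)]
        for r in range(row_start, row_end)
    ]


def geometric_mean_filter(image, workers=2, radius=1):
    """Split the image into row strips and filter each strip independently."""
    rows = len(image)
    strip = max(1, -(-rows // workers))  # ceiling division
    bounds = [(s, min(s + strip, rows)) for s in range(0, rows, strip)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: filter_strip(image, b[0], b[1], radius), bounds)
        return [row for part in parts for row in part]
```

Each output pixel depends only on a small input window, so strips can be computed without coordination; on a distributed-memory cluster each worker would additionally need the rows bordering its strip.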
ISBN: 0780312813 (print)
Approximation of partial differential equations of hyperbolic type by a set of ordinary differential equations is presented. The method of weighted residuals is applied. The Galerkin method and the finite element method are presented as examples.
We describe the implementation of virtual processing for several combinatorial algorithms using the MPL language on a 16,384-processor MasPar MP-1. In coding these routines, we tried different underlying (deterministic and randomized) algorithms. We present the performance data for our different implementations. We also present general code rewriting rules for converting code that uses no virtual processors into code with virtual processing.
We empirically analyze and compare two distributed, low-overhead policies for scheduling dynamic tree-structured computations on rings of identical PEs. Our experiments show that both policies give significant parallel speedup on large classes of computations, and that one yields almost optimal speedup on moderate-size rings. We believe that our methodology of experiment design and analysis will prove useful in other such studies.
ISBN: 9781479941162 (print)
Multicore processors are nowadays widespread across desktop, laptop, server, and even smartphone and tablet devices. The rise of such powerful execution environments calls for new parallel and distributed Description Logics (DLs) reasoning algorithms. Many sophisticated optimizations have been explored and have considerably enhanced DL reasoning with light ontologies. Non-determinism remains a main source of complexity for implemented systems handling ontologies relying on more expressive logics. In this work, we explore handling non-determinism with DL languages enabling qualified cardinality restrictions. We implement a fork/join parallel framework into our hybrid algebraic reasoner, which handles qualified cardinality restrictions and nominals using in-equation solving. Preliminary evaluation shows encouraging results.
ISBN: 9781665497473 (print)
The presentation of Peachy Parallel Assignments in several workshops on parallel and distributed computing education aims to promote the reuse of high-quality assignments, both saving precious faculty time and improving the quality of course assignments. Presented assignments are selected competitively: they must have been successfully used in a real classroom, be easy for other instructors to adopt, and be "cool and inspirational" to encourage students to spend time on them and talk about them with others. Winning assignments are also archived on the Peachy Parallel Assignments website. In this installment of Peachy Parallel Assignments, we present three new assignments. The first assignment is to simulate an Abelian Sandpile, with grains of sand moving from tall piles to shorter ones. This is a discrete simulation that creates colorful and intricate images. The second assignment is a Big Data problem in which students use the MapReduce paradigm to recreate "Warming Stripes", a visualization of climate data that highlights climate change. The third assignment introduces climate-oriented optimization by asking students to schedule distributed workflows to minimize their carbon footprint.
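For readers unfamiliar with the first assignment, the following is a minimal sequential sketch of the standard Abelian sandpile rule, not the assignment's own code. It assumes the usual formulation: a cell holding four or more grains topples, sending one grain to each of its four neighbors, and grains that leave the grid are lost. The function name `stabilize` is hypothetical.

```python
def stabilize(grid):
    """Topple cells until every cell holds fewer than 4 grains.

    Assumed rule (the standard Abelian sandpile): a cell with >= 4 grains
    loses 4 and gives 1 to each of its up/down/left/right neighbours;
    grains pushed past the edge of the grid disappear.
    """
    rows, cols = len(grid), len(grid[0])
    grid = [row[:] for row in grid]  # work on a copy
    unstable = True
    while unstable:
        unstable = False
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] >= 4:
                    unstable = True
                    grid[r][c] -= 4
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                        if 0 <= nr < rows and 0 <= nc < cols:
                            grid[nr][nc] += 1
    return grid


# A single pile of four grains in the centre of a 3x3 grid topples once.
result = stabilize([[0, 0, 0], [0, 4, 0], [0, 0, 0]])
```

The "Abelian" property means the final stable configuration does not depend on the order in which unstable cells topple, which is what makes the simulation a reasonable target for parallelization.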
This paper proposes a design methodology for building highly available systems. In addition, we describe a set of operating system services that can be used to achieve this goal. The techniques described are intended for a parallel environment and can be generalized for any distributed system. We describe a methodology for providing basic services for high availability, specific services for restart and an implementation of these services.
ISBN: 0769521320 (print)
MRNet is an infrastructure that provides scalable multicast and data aggregation functionality for distributed tools. While evaluating MRNet's performance and scalability, we learned several important lessons about benchmarking large-scale, distributed tools and middleware. First, automation is essential for a successful benchmarking effort, and should be leveraged whenever possible during the benchmarking process. Second, microbenchmarking is invaluable not only for establishing the performance of low-level functionality, but also for design verification and debugging. Third, resource management systems need substantial improvements in their support for running tools and applications together. Finally, the most demanding experiments should be attempted early and often during a benchmarking effort to increase the chances of detecting problems with the tool and experimental methodology.