The authors explore the complexity of deciding whether an execution of a shared-memory multiprocessor is sequentially consistent. They present the first results showing the NP-completeness of this problem, even for sh...
详细信息
The authors explore the complexity of deciding whether an execution of a shared-memory multiprocessor is sequentially consistent. They present the first results showing the NP-completeness of this problem, even for short programs or small machines. They also explore possible augmentations to the memory system; a fast decision algorithm is presented for such an augmented shared memory. The results obtained demonstrate the difficulty in detecting when an execution of a memory system fails to be sequentially consistent, and supporting all possible sequentially consistent executions in hardware.< >
A modular reconfigurable serial architecture is presented for the analog/digital implementation of constrained optimization algorithms with digital programmability of the problem weights. Area overhead due to programm...
详细信息
A modular reconfigurable serial architecture is presented for the analog/digital implementation of constrained optimization algorithms with digital programmability of the problem weights. Area overhead due to programmability is reduced by using time multiplexing methodology. It allows all the weights of each multiple inputs processing unit to be digitally controlled by just using one weighted component array. The proposed architecture is very well suited for MOS VLSI realization using switched-capacitor (SC) techniques. SC schematics for the different building blocks are presented and demonstrated via empirical results.< >
The authors give efficient parallel algorithms to compute shortest-paths in planar layered digraphs. They show that these digraphs admit special kinds of separators, called one-way separators, which allow paths in the...
详细信息
The authors give efficient parallel algorithms to compute shortest-paths in planar layered digraphs. They show that these digraphs admit special kinds of separators, called one-way separators, which allow paths in the graph to cross them only once. They use these separators to give divide-and-conquer solutions to the problem of finding the shortest paths. They first give a simple algorithm that works on the CREW (concurrent-read exclusive-write) PRAM (parallel random-across machine) model and computes the shortest path between any two vertices of an n-node planar layered diagraph in time O(log/sup 3/ n) using n/log n processors. A CRCW (concurrent-read concurrent-write) version of this algorithm runs in O(log/sup 2/ n log log n) time and uses O(n/log log n) processors. The authors then improve the time bound to O(log/sup 2/ n) on the CREW model and O(log n log log n) on the CRCW model. The processor bounds still remain n log n for the CREW model and n/log log n for the CRCW model.< >
The OREGAMI project involves the design, implementation, and testing of algorithms for mapping parallel computations to message-passing parallelarchitectures. OREGAMI addresses the mapping problem by exploiting regul...
详细信息
The OREGAMI project involves the design, implementation, and testing of algorithms for mapping parallel computations to message-passing parallelarchitectures. OREGAMI addresses the mapping problem by exploiting regularity and by allowing the user to guide and evaluate mapping decisions made by OREGAMI's efficient combinatorial mapping algorithms. OREGAMI's approach to mapping is based on a new graph theoretic model of parallel computation called the Temporal Communication Graph. The OREGAMI software tools include three components: (I ) LaRCS is a graph description language which allows the user to describe regularity in the communication topology as well as the temporal communication behavior (the pattern of message-passing over time). (2) MAPPER is our library of mapping algorithms which utilize information provided by LaRCS to perform contraction, embedding, and routing. (3) METRICS is an interactive graphics tool for display and analysis of mappings. This paper gives an overview of the OREGAMI project, the software tools, and OREGAMI's mapping algorithms.
The split and merge model is a reasonable method for architecture-independent programming of global image processing operations on parallelarchitectures. We consider image connected components from the point of view ...
详细信息
A multidimensional binary partition (MBP) is a data structure determined by a set of points in n-dimensional space. On certain parallelarchitectures, this data structure can be easily distributed across the processin...
详细信息
A multidimensional binary partition (MBP) is a data structure determined by a set of points in n-dimensional space. On certain parallelarchitectures, this data structure can be easily distributed across the processing nodes of the machine and can provide a natural technique for load balancing and partitioning of application problems that depend on a distribution of dynamically changing points in multidimensional space. This paper describes parallel algorithms for generating and using MBPs on a hypercube parallel machine. It is also shown how these distributed data structures allow efficient parallel searches of the data set. The performance of an implementation of these algorithms on an NCUBE hypercube is presented.
In this paper, we introduce a formal approach for synthesis of parallelarchitectures. Four different forms are used to express the given algorithms: simultaneous recursion, recursion with respect to different variabl...
详细信息
parallelprogramming has to date remained inaccessible to the average scientific programmer. parallelprogramming languages are generally foreign to most scientific applications programmers who only speak Fortran. Aut...
详细信息
This paper deals with the problem of parallel construction of trees with optimal weighted path length. We study both the unordered case, known as the Huffman coding problem and the ordered case known as the optimal al...
详细信息
We present several algorithms for sorting efficiently with parallel two-level and multilevel memories. Our main result is an elegant, easy-to-implement, optimal, deterministic algorithm for external sorting with P dis...
详细信息
暂无评论