A major step towards practical use of parallel computers is the integration of cost into transformational or derivational software development methods. The difficulties with doing this come from the wide variety of pa...
详细信息
A major step towards practical use of parallel computers is the integration of cost into transformational or derivational software development methods. The difficulties with doing this come from the wide variety of parallel architectures possible and the effects of memory access and congestion phenomena. This paper presents a model of costs for uniform architectures that is compatible with refinement-based methods of development, that is much simpler than those previously suggested, but which accurately assesses the costs of an implemented computation. Decisions about architecture type and machine size can be made at any stage in the development, even at the end.
We examine a very simple asynchronous model of parallelcomputation that assumes the time to compute a task is random, following some probability distribution. The goal of this model is to capture the effects of unpre...
详细信息
We examine a very simple asynchronous model of parallelcomputation that assumes the time to compute a task is random, following some probability distribution. The goal of this model is to capture the effects of unpredictable delays on processors, due to communication delays or cache misses, for example. Using techniques from queueing theory and occupancy problems, we use this model to analyze two parallel dynamic programming algorithms. We show that this model is simple to analyze and correctly predicts which algorithm will perform better in practice. The algorithms we consider are a pipeline algorithm, where each processor i computes in order the entries of rows i, i + p, and so on, where p is the number of processors;and a diagonal algorithm, where entries along each diagonal extending from the left to the top of the table are computed in turn. It is likely that the techniques used here can be useful in the analysis of other algorithms that use barriers or pipelining techniques.
In this paper, we study a class of VLSI organizations with optical interconnects for fast solutions to several image processing tasks. The organization and operation of these architectures are based on a generic model...
详细信息
In this paper, we study a class of VLSI organizations with optical interconnects for fast solutions to several image processing tasks. The organization and operation of these architectures are based on a generic model called OMC, which is proposed to understand the computational limits in using free space optics in VLSI parallel processing systems. The relationships between OMC and shared memory models are discussed in this paper. Also, three physical implementations of OMC are presented. Using OMC, we present several parallel algorithms for fine grain image computing. We categorize our results in the following order. First, we present a set of processor efficient optimal O(log N) algorithms and a set of constant time algorithms for finding geometric properties of digitized images. Finally, we focus on special purpose designs tailored to meet both the computation and communication needs of problems such as those involving irregular sparse matrices.
The problem of computing optimal solutions to Constraint Satisfaction Problem (CSP) instances parameterized by the size of the objective function is considered, and fixed parameter polynomial-time algorithms are propo...
详细信息
The problem of computing optimal solutions to Constraint Satisfaction Problem (CSP) instances parameterized by the size of the objective function is considered, and fixed parameter polynomial-time algorithms are proposed within the structure-based framework of tree projections. The algorithms compute the desired optimal (or best k) solutions whenever there exists a tree projection for the given instance;otherwise, the algorithms report that there is no tree-projection. For the case where a tree projection is available, parallel algorithms are also proposed and analyzed. Structural decomposition methods based on acyclic, bounded treewidth, and bounded generalized hypertree-width hypergraphs, extensively considered in the CSP setting, as well as in conjunctive database query evaluation and optimization, are all covered as special cases of the tree projection framework. (C) 2017 Elsevier Inc. All rights reserved.
The Broadcasting with Selective Reduction (BSR) model of parallelcomputation is applied to the problem of finding Maximal Sum Subsegments in one-dimensional sequences of n elements, and to the related problem of Maxi...
详细信息
The Broadcasting with Selective Reduction (BSR) model of parallelcomputation is applied to the problem of finding Maximal Sum Subsegments in one-dimensional sequences of n elements, and to the related problem of Maximal Sum Subrectangles in two-dimensional arrays. The one-dimensional case is shown to be solvable in constant time using O(n) BSR processors. The two-dimensional case is solved in constant time with O(n 3 ) BSR processors (for an n×n array).
暂无评论