Nowadays genetic algorithms stand as a trend to solve NPcomplete and NP-hard problems. In this paper, we present a new hybrid metaheuristic which uses parallel Genetic algorithms and Scatter Search coupled with a deco...
详细信息
this paper presents an integrated design system called SPARCS (Synthesis and Partitioning for Adaptive Reconfigurable Computing Systems) for automatically partitioning and synthesizing designs for reconfigurable board...
详细信息
ISBN:
(纸本)3540643591
this paper presents an integrated design system called SPARCS (Synthesis and Partitioning for Adaptive Reconfigurable Computing Systems) for automatically partitioning and synthesizing designs for reconfigurable boards with multiple field-programmable devices (FPGAs). the SPARCS system accepts design specifications at the behavior level, in the form of task graphs. the system contains a temporal partitioning tool to temporally divide and schedule the tasks on the reconfigurable architecture, a spatial partitioning tool to map the tasks to individual FPGAs, and a high-level synthesis tool to synthesize efficient register-transfer level designs for each set of tasks destined to be downloaded on each FPGA. Commercial logic and layout synthesis tools are used to complete logic synthesis, placement, and routing for each FPGA design segment. A distinguishing feature of the SPARCS system is the tight integration of the partitioning and synthesis tools to accurately predict and control design performance and resource utilizations. this paper presents an overview of SPARCS and the various algorithms used in the system, along with a brief description of how a JPEG-like image compression algorithm is mapped to a multi-FPGA board using SPARCS.
the nano-threads programming model was proposed to effectively integrate multiprogramming on shared-memory multiprocessors, withthe exploitation of fine-grain parallelism from standard applications. A prerequisite fo...
详细信息
In this paper we are concerned with developing a practical parallel algorithm for Delaunay triangulation that works well on general distributions, particularly those that arise in Scientific Computation. Although ther...
详细信息
ISBN:
(纸本)9780897918046
In this paper we are concerned with developing a practical parallel algorithm for Delaunay triangulation that works well on general distributions, particularly those that arise in Scientific Computation. Although there have been many theoretical algorithms for the problem, and some implementations based on bucketing that work well for uniform distributions, there has been little work on implementations for general distributions. We use the well known reduction of 2D Delaunay triangulation to 3D convex hull of points on a sphere or paraboloid. A variant of the Edelsbrunner and Shi 3D convex hull is used, but for the special case when the point set lies on either a sphere or a paraboloid. Our variant greatly reduces the constant costs from the 3D convex hull algorithm and seems to be a more promising for a practical implementation than other parallel approaches. We have run experiments on the algorithm using a variety of distributions that are motivated by various problems that use Delaunay triangulations. Our experiments show that for these distributions we are within a factor of approximately two in work from the best sequential algorithm.
Given a set S of n points in the plane, an Ε-strongly convex δ-hull of S is defined as a convex polygon P withthe vertices taken from S such that no point of S lies farther than δ outside P and such that even if t...
详细信息
ISBN:
(纸本)9780897918046
Given a set S of n points in the plane, an Ε-strongly convex δ-hull of S is defined as a convex polygon P withthe vertices taken from S such that no point of S lies farther than δ outside P and such that even if the vertices of P are perturbed by as much as Ε, P remains convex (e.g., the convex hull of S is a 0-strongly convex 0-hull of S). this paper presents the first parallel robust method for the generalized convex hull problem. We show that an Ε-strongly convex O(Ε + β)-hull of S can be constructed in O(log3 n) time using n processors with imprecise computations, where β is the error unit of the primitive operations. the result implies an improved sequential algorithm for the problem. Our method consists of three algorithms: finding an approximate common tangent of two sets of n points each in O(log2 n) time using n processors, computing a convex O(Ε + β)-hull of n points in O(log3 n) time using n processors, and constructing an Ε-strongly convex O(Ε + β)-hull of a convex n-gon in O(log n) time with n processors using imprecise computations. We use the CREW PRAM computational model in the paper.
In this paper we focus on the problem of designing very fast parallelalgorithms for the convex hull problem in two and three dimensions in the arbitrary CRCW model whose running times are output-size sensitive. We pr...
详细信息
In this paper we focus on the problem of designing very fast parallelalgorithms for the convex hull problem in two and three dimensions in the arbitrary CRCW model whose running times are output-size sensitive. We present a fast randomized algorithm for planar hulls that runs in expected time O(log H · log log n) and does optimal O(n log H) work where n and H are the input and output sizes respectively. For log H = Ω(log log n), we can achieve the optimal running time of O(log H) for planar hulls while simultaneously keeping the work optimal. In three dimensions, our algorithm runs in expected time O(log log2 n·log H) with optimal O(n log H) work for all H. Hence, for O(logO(1) n) size outputs, our algorithms in two and three dimensions achieve poly(log log n) running time and optimal O(n log log n) work. the previously known output-sensitive work-optimal algorithms for convex hulls have running times Ω(log n) (expected) and Ω(log3 n) in two and three dimensions respectively. Our algorithms assume no input distribution and the running times hold with high probability. We also describe a very simple O(log n log H) time optimal deterministic algorithm for planar hulls which is an improvement for small outputs. For larger output-sizes, a running time of O(log n log log n)) can be achieved.
We present algorithms for sorting and routing on two-dimensional mesh-connected parallelarchitecturesthat are optimal on average. If one processor has many packets then we asymptotically halve the up to now best run...
详细信息
the design and use of scalable parallel computers is discussed. Much of the knowledge gained in the study of distributd systems is relevant, although it needs to be used in new manners. the author argues that the chal...
详细信息
ISBN:
(纸本)0897916131
the design and use of scalable parallel computers is discussed. Much of the knowledge gained in the study of distributd systems is relevant, although it needs to be used in new manners. the author argues that the challenge is to achieve maximum reuse of existing technologies, microprocessor architectures, I/O systems, programming languages, operating systems, application codes, without compromising too much parallel performance and scalability. Such skillful compromises are essential to the success of the parallel processing industry, and pose even more intellectual challenges than the 'start from scratch' approach.
the symposium materials contain 24 papers. the topics covered include local area networks as distributed systems;operating systems research;scalable parallel computing;atomic snapshots;radio networks;fast deflection r...
详细信息
ISBN:
(纸本)0897916131
the symposium materials contain 24 papers. the topics covered include local area networks as distributed systems;operating systems research;scalable parallel computing;atomic snapshots;radio networks;fast deflection routing for packets and worms;parallel computing models;wait-free clock synchronization;optimal clock synchronization;partially synchronized clocks;resource bounds and combinations of consensus objects;scalable synchronization with minimal hardware support;adaptive solutions to the mutual exclusion problem;distributed fingerprints and secure information dispersal;replicated files;space complexity of randomized synchronization;lower bound on wait-free counting;and knowledge-oriented programming.
In this paper we study the practical viability of the BSP model of parallel computation as proposed by Valiant. this model is intended for simulating the often considered PRAM model on more realistic parallel computer...
详细信息
ISBN:
(纸本)0897916131
In this paper we study the practical viability of the BSP model of parallel computation as proposed by Valiant. this model is intended for simulating the often considered PRAM model on more realistic parallel computers with a fixed interconnection network. One of the main attributes of the BSP model is randomized routing. From experimentation on an existing parallel architecture, analytic models are derived which characterize the efficiency of this routing scheme. this characterization leads to the identification of the bottlenecks involved in building a parallel architecture in which the BSP model can efficiently be embedded.
暂无评论