We propose a parallel GRASP heuristic with path-relinking for the 2-path network design problem. A parallel strategy for its implementation is described. Computational results illustrating the effectiveness of the new...
The objective of parallelism-independent (PI) scheduling is to minimize the completion time of a parallel application for any number of processing elements in the computing system. We propose several paralleli...
The current paper presents a new algorithm and two architectures for the power-sum operation (AB^2 + C) over GF(2^m) using a standard basis. The proposed algorithm is based on the MSB-first scheme and the proposed archi...
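As a rough illustration of the power-sum operation named in this abstract, the sketch below computes AB^2 + C in GF(2^m) with a standard (polynomial) basis, scanning the multiplier from its most significant bit. It is a plain software sketch, not the paper's algorithm or architectures: the function names `gf_mul` and `power_sum`, the integer bit-vector encoding, and the example field GF(2^4) with modulus x^4 + x + 1 are all assumptions made for illustration.

```python
def gf_mul(a: int, b: int, m: int, poly: int) -> int:
    """Multiply a and b in GF(2^m), scanning b from its most significant bit.

    Elements are integers whose bit i is the coefficient of x^i.
    `poly` is the irreducible modulus including its x^m term
    (hypothetical example: 0b10011 for x^4 + x + 1).
    """
    result = 0
    for i in range(m - 1, -1, -1):
        result <<= 1                  # multiply the partial result by x
        if (result >> m) & 1:         # degree reached m, so reduce
            result ^= poly
        if (b >> i) & 1:              # add a if the current bit of b is set
            result ^= a
    return result


def power_sum(a: int, b: int, c: int, m: int, poly: int) -> int:
    """Return A*B^2 + C in GF(2^m); addition in GF(2^m) is bitwise XOR."""
    b_squared = gf_mul(b, b, m, poly)
    return gf_mul(a, b_squared, m, poly) ^ c


# Example in GF(2^4): A = x, B = x + 1, C = 1.
value = power_sum(0b0010, 0b0011, 0b0001, 4, 0b10011)
```

Here the square and the multiplication are done as two separate passes for clarity; a dedicated power-sum design would presumably fuse them.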
Parallel multigrid methods are very prominent tools for solving huge systems of (non-)linear equations arising from the discretisation of PDEs, as for instance in Computational Fluid Dynamics (CFD). The superiority of...
Texture is a fundamental feature for image analysis, classification, and segmentation. Therefore, the reduction of the time needed for its description in a real application environment is an important objective. In th...
We study two classical connectivity-preserving parallel shrinking algorithms proposed to recognize and label two-dimensional connected components of binary images. The algorithms we consider were developed independently by Beyer [Recognition of topological invariants by iterative arrays, Ph.D. thesis, MIT, 1969, p. 144] and Levialdi [Commun. ACM 15 (1) (1972) 7] for the purpose of shrinking 4-connected and 8-connected components of binary images, respectively, in linear time. It is shown that the two algorithms are closely related and, in a sense, dual to each other: for any given binary image and its inverted image, one algorithm produces, step by step, an image that is the dual of the one produced by the other. (C) 2002 Elsevier Science B.V. All rights reserved.
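To make the idea of parallel shrinking concrete, here is a minimal sequential sketch of an 8-connected shrinking step in the style commonly attributed to Levialdi. The local rule used (a 1-pixel survives if its left, lower, or lower-left neighbour is 1; a 0-pixel becomes 1 if both its left and lower neighbours are 1; an isolated 1-pixel is counted as one component before it vanishes) is reconstructed from the commonly cited formulation and is not stated in the abstract, so the rule, its orientation, and the function name `count_components_by_shrinking` should be read as assumptions.

```python
def count_components_by_shrinking(image):
    """Count 8-connected components of a binary image by repeated shrinking.

    `image` is a list of equal-length rows of 0/1 values. Every pixel is
    updated simultaneously from the previous image, which mimics the
    parallel step of a cellular array.
    """
    if not image or not image[0]:
        return 0
    rows, cols = len(image), len(image[0])
    grid = [row[:] for row in image]

    def at(r, c):
        # Pixels outside the image are treated as background (0).
        return grid[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    count = 0
    while any(any(row) for row in grid):
        nxt = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                left, below, diag = at(r, c - 1), at(r + 1, c), at(r + 1, c - 1)
                if grid[r][c]:
                    isolated = not any(
                        at(r + dr, c + dc)
                        for dr in (-1, 0, 1)
                        for dc in (-1, 0, 1)
                        if (dr, dc) != (0, 0)
                    )
                    if isolated:
                        count += 1  # a component has shrunk to one pixel
                    nxt[r][c] = 1 if (left or below or diag) else 0
                else:
                    nxt[r][c] = 1 if (left and below) else 0
        grid = nxt
    return count


example = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
n_components = count_components_by_shrinking(example)
```

Under this rule each surviving pixel depends only on neighbours to its left and below, so every component drifts toward the top-right corner and the loop ends after at most rows + cols - 1 steps.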
We describe the construction of parallel iterative solvers for finite element approximations of the Navier-Stokes equations on unstructured grids using domain decomposition methods. The iterative method used is FGMRES...
ISBN (print): 3540440496
In this paper we study the use of idle cycles in a network of desktop workstations under unfavourable conditions: we aim to use idle cycles to improve the responsiveness of interactive applications through parallelism. Unlike much prior work in the area, our focus is on response time, not throughput, and on short jobs, of the order of a few seconds. We therefore assume a high level of primary activity by the desktop workstations' users, and aim to keep interference with their work within reasonable limits. We present a fault-tolerant, low-administration service for identifying idle machines, which can usually assign a group of processors to a task in less than 200 ms. Unusually, the system has no job queue: each job is started immediately with the resources that are predicted to be available. Using trace-driven simulation we study allocation policy for a stream of parallel jobs. Results show that even under heavy load it is possible to accommodate multiple concurrent guest jobs and obtain good speedup with very small disruption of host applications.
The SCOOPP (Scalable Object Oriented Parallel Programming) system efficiently adapts, at run-time, an object oriented parallel application to any distributed memory system. It extracts as much parallelism as possible ...
ISBN (digital): 9783540445036
ISBN (print): 9783540414568
This paper presents a general methodology for the efficient parallelization of existing data cube construction algorithms. We describe two different partitioning strategies, one for top-down and one for bottom-up cube algorithms. Both partitioning strategies assign subcubes to individual processors in such a way that the loads assigned to the processors are balanced. Our methods reduce interprocessor communication overhead by partitioning the load in advance instead of computing each individual group-by in parallel. Our partitioning strategies create a small number of coarse tasks. This allows for sharing of prefixes and sort orders between different group-by computations. Our methods enable code reuse by permitting the use of existing sequential (external memory) data cube algorithms for the subcube computations on each processor. This supports the transfer of optimized sequential data cube code to a parallel setting. The bottom-up partitioning strategy balances the number of single-attribute external memory sorts made by each processor. The top-down strategy partitions a weighted tree in which weights reflect algorithm-specific cost measures like estimated group-by sizes. Both partitioning approaches can be implemented on any shared-disk parallel machine composed of p processors connected via an interconnection fabric and with access to a shared parallel disk array. We have implemented our parallel top-down data cube construction method in C++ with the MPI message passing library for communication and the LEDA library for the required graph algorithms. We tested our code on an eight-processor cluster, using a variety of different data sets with a range of sizes, dimensions, density, and skew. Comparison tests were performed on a SunFire 6800. The tests show that our partitioning strategies generate a close to optimal load balance between processors. The actual run times observed show an optimal speedup of p.
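The idea of assigning a small number of coarse, weighted tasks to p processors so that loads are balanced can be sketched with a generic greedy heuristic. The code below uses longest-processing-time-first (LPT) assignment, which is a swapped-in textbook technique and not the paper's tree partitioning strategy; the function name `balance_tasks` and the example weights (standing in for estimated subcube costs) are hypothetical.

```python
import heapq


def balance_tasks(weights, p):
    """Greedily assign weighted tasks to p processors (LPT heuristic).

    Tasks are taken in order of decreasing weight and each one goes to
    the processor with the smallest load so far. Returns the list of
    task indices per processor and the resulting load per processor.
    """
    heap = [(0, proc) for proc in range(p)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(p)]
    for task, weight in sorted(enumerate(weights), key=lambda t: -t[1]):
        load, proc = heapq.heappop(heap)     # least-loaded processor
        assignment[proc].append(task)
        heapq.heappush(heap, (load + weight, proc))
    loads = [sum(weights[t] for t in tasks) for tasks in assignment]
    return assignment, loads


# Hypothetical cost estimates for five coarse tasks on two processors.
assignment, loads = balance_tasks([7, 5, 4, 3, 1], 2)
```

Because the tasks are independent once assigned, each processor could then run an unmodified sequential routine on its own share, which is the kind of code reuse the abstract describes.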