The usual concern when scaling an algorithm on a parallel model of computation is preserving efficiency while increasing or decreasing the number of processors. Many algorithms for reconfigurable models, however, attain constant time at the expense of an inefficient algorithm. For these algorithms, scaling down the number of processors while preserving inefficiency is no benefit once constant-time execution is lost. In fact, one can often improve the efficiency of these algorithms while reducing the number of processors. To quantify this improvement in efficiency, this paper introduces the measure of degree of scalability to complement the insight obtained from efficiency for such algorithms. Demonstrating the utility of this measure, we present new reconfigurable mesh (R-Mesh) algorithms for multiple addition and matrix-vector multiplication, improving both the number of processors and the degree of scalability compared to previous algorithms. We also extend these results to floating-point operands, which have previously received little attention on the R-Mesh.
The list-ranking problem is considered for parallel computers that communicate through an interconnection network. Each PU holds k nodes of a set of linked lists. A novel randomized algorithm gives a considerable improvement over earlier ones: for a large class of networks and sufficiently large k, it takes only twice the number of steps required by a k-k routing. For hypercubes the condition is k = ω(log^2 N). Even better results are achieved for d-dimensional meshes: we show that the ranking time exceeds the routing time only by lower-order terms for all k = ω(d^2). We also show that list ranking requires at least the time required for k-k routing. Thus, the results are within a factor of two of optimal, and those for meshes even match the lower bound up to lower-order terms. (C) 2002 Elsevier Science (USA). All rights reserved.
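For readers unfamiliar with the problem itself, the sketch below shows list ranking by classic pointer jumping, simulated sequentially. It is not the randomized network algorithm of the abstract. The function name `list_rank` and the convention that a tail node points to itself are assumptions made for illustration only.

```python
def list_rank(succ):
    """Return, for every node, its distance to the tail of its list.

    succ[i] is the successor of node i; a tail is marked by succ[i] == i
    (an assumed convention). Each round doubles the distance covered by
    every pointer, so ceil(log2 n) synchronous rounds suffice.
    """
    n = len(succ)
    nxt = list(succ)
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    for _ in range((n - 1).bit_length()):
        # Build new arrays from the old ones to mimic a synchronous step.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank


# Two lists stored in one array: 3 -> 0 -> 2 -> 4 and 5 -> 1.
ranks = list_rank([2, 1, 4, 0, 4, 1])
```

On a real network, each jump `nxt[nxt[i]]` may address a node held by a different PU, which is why the cost of list ranking is naturally compared with that of routing.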
This paper presents a fault-tolerant technique based on the modulus replication residue number system (MRRNS), which allows for modular arithmetic computations over identical channels. In this system, fault tolerance is provided by adding extra computational channels that can be used to redundantly compute the mapped output. An algebraic technique is used to determine the error position in the mapped outputs and provide corrections. We also show that, by taking advantage of some elementary polynomial properties, we obtain the same level of fault tolerance with about a 30% decrease in the number of channels. This new system is referred to as the symmetric MRRNS (SMRRNS).
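The general idea of gaining fault tolerance from extra channels can be illustrated with a conventional redundant residue number system. This is a simpler, different scheme from the MRRNS: it uses distinct moduli and only detects an error, with no polynomial mapping or algebraic correction. The moduli and all function names below are hypothetical choices for the sketch.

```python
from math import prod

MODULI = (7, 11, 13, 17)      # assumed moduli; 17 is the redundant channel
LEGIT_RANGE = 7 * 11 * 13     # valid results must be below 1001


def to_residues(x):
    """Map an integer to one residue per channel."""
    return tuple(x % m for m in MODULI)


def multiply(a, b):
    """Multiply channel by channel; channels never exchange data."""
    return tuple((ra * rb) % m for ra, rb, m in zip(a, b, MODULI))


def crt(residues):
    """Reconstruct the integer from all channels (Chinese remainder theorem)."""
    total = prod(MODULI)
    x = 0
    for r, m in zip(residues, MODULI):
        partial = total // m
        x += r * partial * pow(partial, -1, m)  # modular inverse, Python 3.8+
    return x % total


def is_consistent(residues):
    """A single faulty channel pushes the reconstruction outside the legitimate range."""
    return crt(residues) < LEGIT_RANGE


product = multiply(to_residues(25), to_residues(30))
corrupted = ((product[0] + 1) % MODULI[0],) + product[1:]
```

Here `crt(product)` recovers 750, while `is_consistent(corrupted)` is false: two values below 1001 that agree on any three of the four channels must be equal, so a single faulty channel cannot go unnoticed.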
Transactions within a mobile database management system face many restrictions. They cannot afford unlimited delays or participate in multiple retry attempts for execution. The proposed embedded concurrency control (ECC) techniques provide support on three counts, namely to enhance concurrency, to overcome problems due to heterogeneity, and to allocate priority to transactions that originate from mobile hosts. These proposed ECC techniques can be used to enhance the server capabilities within a mobile database management system. Adoption of the techniques can be beneficial in general, and for other special cases of transaction management in distributed real-time database management systems. The proposed model can be applied to other similar problems related to synchronization, such as the generation of a backup copy of an operational database system. (C) 2003 Elsevier Science B.V. All rights reserved.
Although the evolutionary algorithm is a powerful optimization tool, its computation cost in terms of time and hardware resources increases as the size or complexity of the problem increases. One promising approach to overcome this limitation is to exploit the inherent parallelism of evolutionary algorithms by creating the infrastructure necessary to support distributed evolutionary computing using existing Internet and hardware resources. This paper presents a Java-based distributed evolutionary computing software (Paladin-DEC), which enhances the concurrent processing and performance of evolutionary algorithms by allowing inter-communication of subpopulations among various computers over the Internet. Such a distributed system enables individuals to migrate periodically among multiple subpopulations according to some patterns to induce diversity of elite individuals, in a way that simulates how species evolve in a natural environment. The Paladin-DEC software is capable of keeping data integrity throughout the computation, and incorporates the features of robustness, security, fault tolerance, and work balancing. The effectiveness and advantages of Paladin-DEC are illustrated with two case studies: drug scheduling in cancer chemotherapy and searching probe sets of the yeast genome.
In this work we describe and analyze algorithms for advanced video coding on distributed memory MIMD architectures. In particular, we consider a wavelet packet based codec using the concept of zerotree encoding. The main contribution of this work is the design of a parallel motion-compensated video coder composed of a wavelet packet decomposition in conjunction with the best basis algorithm followed by zerotree coding. Whereas two sensible parallelization techniques can be employed for the wavelet packet decomposition (subband based partitioning and stripe partitioning), the zerotree coding and motion compensation stages only allow one reasonable parallelization method (stripe partitioning). We investigate the advantages and drawbacks of the resulting different overall data distribution strategies and show experimental results obtained on a Siemens hpcLine cluster and a Cray T3E. (C) 2003 Elsevier B.V. All rights reserved.
ISBN: 3540200541 (print)
Based on Luo's parallel algorithm [4] for certain Toeplitz cyclic tridiagonal systems on distributed-memory multicomputers, we present an improved algorithm. Its communication mechanism is simple, and its redundant computation is small when solving massive systems. The numerical experiments show that the parallel efficiency of the improved algorithm is higher than that of Luo's algorithm [4].
We present some remarks on the numerical evaluation of recurrence relations. Rounding error bounds for the numerical scheme are presented and some numerical examples are given. In particular, we analyse conversion recurrences from different families of orthogonal polynomials, the limit case of Jacobi-Sobolev polynomials, random recurrences and perturbed Gegenbauer polynomials. In all these examples the theoretical bounds give sharp relative rounding error estimates. The parallel evaluation of recurrences is also considered, and numerical tests on a Cray T3D are presented. (C) 2002 Elsevier Science B.V. All rights reserved.
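As a small illustration of what evaluating a recurrence numerically involves, the sketch below runs the three-term recurrence of the Chebyshev polynomials in floating point and compares the result with the closed form cos(n arccos x). Chebyshev polynomials are chosen here only as a familiar orthogonal family and are not one of the cases named in the abstract. The function names are hypothetical, and the sketch computes an observed error, not the paper's bounds.

```python
import math


def chebyshev_by_recurrence(n, x):
    """Evaluate T_n(x) with T_0 = 1, T_1 = x, T_{k+1} = 2 x T_k - T_{k-1}."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, x
    for _ in range(n - 1):
        prev, curr = curr, 2.0 * x * curr - prev
    return curr


def observed_error(n, x):
    """Absolute difference from the closed form, valid for |x| <= 1."""
    exact = math.cos(n * math.acos(x))
    return abs(chebyshev_by_recurrence(n, x) - exact)


value = chebyshev_by_recurrence(5, 0.5)
```

Each step of the loop introduces a rounding error that later steps propagate, which is the effect that rounding error bounds for recurrences aim to quantify.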
The "fractional tree" algorithm for broadcasting and reduction is introduced. Its communication pattern interpolates between two well-known patterns: the sequential pipeline and the pipelined binary tree. The speedup over the best of these simple methods can approach two for large systems and messages of intermediate size. For networks that are not very densely connected, the new algorithm seems to be the best known method for the important case in which each processor has only a single (possibly bidirectional) channel into the communication network. (C) 2002 Elsevier Science B.V. All rights reserved.
Recent advances in volume scanning techniques have made the task of acquiring volume data of 3-D objects easier and more accurate. Since the quantity of such acquired data is generally very large, a volume model capab...