Given two strings X and Y of lengths m and n, respectively, the all-substrings longest common subsequence (ALCS) problem obtains the lengths of the subsequences common to X and any substring of Y. The sequential algor...
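To make the problem statement concrete, here is a minimal brute-force sketch of what ALCS computes: the longest-common-subsequence length between X and every substring of Y. It is not the algorithm from the paper, only an illustration of the definition; the function names `lcs_length` and `alcs_brute_force` are hypothetical.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of x and y (row-by-row DP)."""
    prev = [0] * (len(y) + 1)
    for cx in x:
        cur = [0]
        for j, cy in enumerate(y, 1):
            if cx == cy:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def alcs_brute_force(x, y):
    """Map (i, j) to the LCS length of x and the substring y[i:j], for all i <= j."""
    n = len(y)
    return {
        (i, j): lcs_length(x, y[i:j])
        for i in range(n + 1)
        for j in range(i, n + 1)
    }
```

This recomputes a full dynamic-programming table for each of the O(n^2) substrings, which is exactly the redundancy that dedicated ALCS algorithms avoid.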
ISBN: 1892512416 (print)
A codesign is the simultaneous design of hardware and software subsystems. In our codesign, we exploit the highly parallel nature of matrix multiplication, which our purely software implementation cannot exploit. The hardware part of the codesign system performs the arithmetic operations. It includes the matrix multiplier, which carries out the multiplications and additions of a matrix product concurrently. The matrix multiplier is modeled in VHDL and runs on an ARC-PCI FPGA board. The software part of the system provides I/O to the hardware. It is implemented on a PC as a C program and a device driver that communicate with the board. We present a performance comparison of our codesign and our purely software implementation, as well as a performance comparison of existing parallel implementations. Applications that require large, fast matrix multiplication include bipartite graph determination (non-existence of odd cycles), economics (the Leontief input-output model), power-invariant transformations (power systems), cryptography, and genetics modeling (Markov chains).
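The parallelism mentioned above comes from the fact that every entry of a matrix product is an independent multiply-accumulate chain. The following is a minimal software sketch of that structure, not the VHDL design from the paper; the function name `matmul` and the list-of-rows representation are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    # Each output entry c[i][j] depends only on row i of a and column j of b,
    # so all entries could be computed at the same time by separate units.
    return [
        [sum(row[p] * b[p][j] for p in range(inner)) for j in range(cols)]
        for row in a
    ]
```

In a sequential program these dot products run one after another; a hardware multiplier can instead evaluate many of them, and the multiplications inside each one, concurrently.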
Deadlock prevention for routing messages has a central role in communication networks, since it directly influences the correctness of parallel and distributed systems. In this paper, we extend some of the computation...
A self-organized approach to manage a distributed proxy system called Adaptive Distributed Caching (ADC) has been proposed previously [8]. We model each proxy as an autonomous agent that is equipped to decide how to d...
ISBN: 3540405437 (print)
We present a family of periodic comparator networks that transform the input so that it consists of a few sorted subsequences. The depths of the networks range from 4 to 2 log n, while the number of sorted subsequences ranges from 2 log n to 2. They work in time c log^2 n + O(log n) with 4 ≤ c ≤ 12, and the remaining constants are also suitable for practical applications. Previously known periodic sorting networks of constant depth that run in time O(log^2 n) (a periodic version of the AKS network [7]) are impractical because of their complex structure and the very large constant factor hidden by the big-O notation.
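As a rough illustration of what "periodic" means here, the sketch below applies one fixed, constant-depth block of comparators over and over. It uses the well-known odd-even transposition network (depth 2 per period), not the networks from the paper, and it needs a linear number of periods rather than O(log^2 n) time; the function names are hypothetical.

```python
def period_block(n):
    """One period: a layer of comparators on even pairs, then one on odd pairs."""
    even = [(i, i + 1) for i in range(0, n - 1, 2)]
    odd = [(i, i + 1) for i in range(1, n - 1, 2)]
    return [even, odd]


def apply_block(values, block):
    """Run the values through one period of the network."""
    out = list(values)
    for layer in block:
        # Comparators within a layer touch disjoint positions,
        # so they could all fire in parallel.
        for i, j in layer:
            if out[i] > out[j]:
                out[i], out[j] = out[j], out[i]
    return out


def periodic_sort(values):
    """Apply the same block ceil(n/2) times, i.e. at least n comparator layers."""
    n = len(values)
    block = period_block(n)
    out = list(values)
    for _ in range((n + 1) // 2):
        out = apply_block(out, block)
    return out
```

The appeal of a periodic design is that only the single block has to be built; the data is simply fed through it repeatedly.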
ISBN: 3540210903 (print)
The large-scale linear problems studied here arise from the Crouzeix-Raviart non-conforming FEM approximation of second-order elliptic boundary value problems. A two-level preconditioner for the case of coefficient anisotropy is analyzed. Special attention is given to the potential of the method for parallel implementation.
The accuracies of three equations for determining the population size of serial and parallel genetic algorithms are evaluated when applied to a parallel genetic algorithm that schedules tasks on a cluster of computers connected via a shared bus. This NP-complete problem is representative of a variety of optimisation problems for which genetic algorithms (GAs) have been shown to approximate the optimal solution effectively. However, empirical determination of the parameters needed by both serial and parallel GAs is time-consuming, often impractically so in production environments. The ability to predetermine parameter values mathematically eliminates this difficulty. The parameter that exerts the most influence over the solution quality of a parallel genetic algorithm is the population size of the demes. Comparisons here show that the most accurate equation for the scheduling application is the Cantú-Paz serial population sizing calculation based on the gambler's ruin model [1]. The study presented below is part of an ongoing analysis of the effectiveness of parallel genetic algorithm parameter value computations based on schema theory. The study demonstrates that the correct deme size can be predetermined quantitatively for the scheduling problem presented here, and suggests that this may also be true for similar optimisation problems.
We present efficient (parallel) algorithms for two hierarchical clustering heuristics. We point out that these heuristics can also be applied to solving some algorithmic problems in graphs, including split decomposition. We show that efficient parallel split decomposition induces an efficient parallel parity graph recognition algorithm. This is a consequence of the result of S. Cicerone and D. Di Stefano [7] that parity graphs are exactly those graphs that can be split decomposed into cliques and bipartite graphs. (C) 2000 Academic Press.
This article proposes a new parallel "ring" algorithm for solving a spatially one-dimensional initial-boundary-value problem (IBVP) for a parabolic equation using an explicit difference method. The parallel algorithm has been verified by implementation on a workstation cluster running Parallel Virtual Machine (PVM). The speed-up function is defined as the ratio of the time needed to run the algorithm sequentially to the time needed to run it in parallel. Theoretical estimates of the speed-up function show a significant speed-up of the parallel algorithm over the serial one.
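A minimal sketch of why an explicit difference scheme parallelizes well, and of the speed-up ratio defined above. It assumes the heat equation u_t = a u_xx with fixed end values and the standard forward-time, centred-space update; the paper's actual equation, its "ring" communication pattern, and the function names below are not taken from the source.

```python
def explicit_step(u, r):
    """One explicit time step; r = a*dt/dx**2 (stable for r <= 0.5)."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def chunked_step(u, r, workers):
    """The same step, computed chunk by chunk as separate workers would.

    Every new value depends only on old values, so a worker needs just one
    neighbouring value on each side of its chunk from the adjacent workers.
    """
    new = list(u)
    interior = list(range(1, len(u) - 1))
    size = -(-len(interior) // workers)  # ceiling division
    for w in range(workers):
        for i in interior[w * size:(w + 1) * size]:
            new[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def speed_up(t_sequential, t_parallel):
    """Ratio of sequential run time to parallel run time."""
    return t_sequential / t_parallel
```

Here the chunks are processed in a loop only to show that the decomposition reproduces the sequential result; in a real cluster run each chunk would live on its own process and exchange boundary values by message passing.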
In this paper we present a parallel algorithm, implemented using MPICH, for isosurface extraction from volumetric data sets. The main contribution of this paper is in the analysis and performance improvements of the d...