In this paper MPI is used on PC Cluster to compute all the eigenvalues of Hermitian Toeplitz Matrices. the parallelalgorithms presented were implemented in C++ with MPI functions inserted and run on a cluster of Leno...
详细信息
ISBN:
(纸本)9783642131189
In this paper MPI is used on PC Cluster to compute all the eigenvalues of Hermitian Toeplitz Matrices. the parallelalgorithms presented were implemented in C++ with MPI functions inserted and run on a cluster of Lenovo thinkCentre machines running RedHat Linux. the two methods, MAHT-P one embarrassingly parallel and the other MPEAHT using master/ slave scheme are compared for performance and results presented. It is seen that computation time is reduced and speedup factor increases withthe number of computers used for the two parallel schemes presented. Load balancing becomes an issue as number of computers in a cluster are increased. A solution is provided to overcome such a case.
the aim of this paper is to show that a kind of boundary value problem for second-order ordinary differential equations which reduces to the problem of solving tridiagonal system of linear equations with almost Toepli...
详细信息
ISBN:
(纸本)9783642143892
the aim of this paper is to show that a kind of boundary value problem for second-order ordinary differential equations which reduces to the problem of solving tridiagonal system of linear equations with almost Toeplitz structure can be efficiently solved on modern multicore architectures using a parallel tiled algorithm based on the divide and conquer approach for solving linear recurrence systems with constant coefficients and novel data formats for dense matrices.
this work presents an improved genetic algorithm (IGA) for minimizing periodic preventive maintenance costs in series-parallel systems. the intrinsic properties of a repairable system, including the structure of relia...
详细信息
ISBN:
(纸本)9783642131189
this work presents an improved genetic algorithm (IGA) for minimizing periodic preventive maintenance costs in series-parallel systems. the intrinsic properties of a repairable system, including the structure of reliability block diagrams and component maintenance priorities are considered by the proposed IGA. the proposed component importance measure considers these properties, identifies key components, and determines their maintenance priorities. the optimal maintenance periods of these important components are then determined to minimize total maintenance cost given the allowable worst reliability of a repairable system. An adjustment mechanism is established to solve the problem of chromosomes falling into infeasible areas. A response surface methodology is further used to systematically determine crossover probability and mutation probability in the GA instead of using the conventional trial-and-error process. A case study demonstrates the effectiveness and practicality of the proposed IGA for optimizing the periodic preventive maintenance model in series-parallel systems.
Multishift QR, algorithms are efficient for solving the symmetric tridiagonal eigenvalue problem on a parallel computer. In this paper, we focus on three variants of the multishift QR. algorithm, namely, the conventio...
详细信息
ISBN:
(纸本)9783642131356
Multishift QR, algorithms are efficient for solving the symmetric tridiagonal eigenvalue problem on a parallel computer. In this paper, we focus on three variants of the multishift QR. algorithm, namely, the conventional multishift QR algorithm, the deferred shift QR, algorithm and the fully pipelined multishift QR, algorithm, and construct performance models for them. Our models are designed for shared-memory parallel machines, and given the basic performance characteristics of the target;machine and the problem size, predict the execution time of these algorithms. Experimental results show that our models can predict the relative performance of these algorithms to the accuracy of 10% in many cases. thus our models are useful for choosing the best algorithm to solve a given problem in a specified computational enviromnent, as well as for finding the best value of the performance parameters.
Given two sorted arrays A = (a(1), a(2), ..., a(n)) and B = (b(1), b(2), ..., b(n)) of records such that (1) the n records are sorted according to one field which is called the key, and (2) the values of the keys are ...
详细信息
ISBN:
(纸本)9783642131356
Given two sorted arrays A = (a(1), a(2), ..., a(n)) and B = (b(1), b(2), ..., b(n)) of records such that (1) the n records are sorted according to one field which is called the key, and (2) the values of the keys are serial numbers. Merging data records has many applications in computer science especially in database. We develop an algorithm that runs in O(log n) time on EREW PRAM to merge two sorted arrays of records using n/log n processors even the keys of the data records are repeated. the algorithm is cost-optimal, deterministic, stable and uses linear number of space.
Withthe increasing "data deluge" scientists face today, the analysis and processing of large datasets of structured data is a daring task. Among such data, large graphs are gaining particular importance wit...
详细信息
ISBN:
(纸本)9781450304535
Withthe increasing "data deluge" scientists face today, the analysis and processing of large datasets of structured data is a daring task. Among such data, large graphs are gaining particular importance withthe growing interest on social networks and other complex networks. Given the dimensions considered, parallelprocessing is essential. However, users are generally not experts in writing parallel code to handle such structures. In this work we present Rendero, a middleware that makes it possible to easily describe graph algorithms in a form adequate for parallel execution. the system is based on the Bulk-Synchronous programming model and offers a vertex-based abstraction. Our current implementation offers good speed-up and scale-up results for large graphs ranging from tens of thousands to millions of vertices and edges in some cases. Copyright 2010 ACM.
Robust and efficient parallel numerical algorithms and their implementation in easy-to-use portable software components are crucial for computational science and engineering applications. they are strongly influenced ...
详细信息
In this paper, we present a general architecture of hybrid prefix/carry-select adder. Based on this architecture, we formalize the hybrid adder's algorithm using the first-order recursive equations and develop a p...
详细信息
ISBN:
(纸本)9783642131189
In this paper, we present a general architecture of hybrid prefix/carry-select adder. Based on this architecture, we formalize the hybrid adder's algorithm using the first-order recursive equations and develop a proof framework to prove its correctness. Since several previous adders in the literature are special cases of this general architecture, our methodology can be used to prove the correctness of different hybrid prefix/carry-select adders. the formal proof for a special hybrid prefix/carry-select adder shows the effectiveness of the algebraic structures built in this paper.
A parallel scheme for distributed memory hierarchy system is presented to solve the large-scale three-dimensional heat equation. Since managing interprocess communications and coordination is the main difficulty with ...
详细信息
ISBN:
(纸本)9783642131356
A parallel scheme for distributed memory hierarchy system is presented to solve the large-scale three-dimensional heat equation. Since managing interprocess communications and coordination is the main difficulty withthe system, the local physics/global algebraic object paradigm is introduced. Domain decomposition method is used to partition the modeling area, as well as the intensive computational effort and large memory requirement. Efficient storage and assembly of sparse matrix and parallel iterative solution of linear system are considered and developed. the efficiency and scalability of the parallel program are demonstrated by completing two experiments on Linux cluster, in which different preconditioning methods are tested and analyzed. And the results demonstrate this method could achieve desirable parallel performance.
In this paper we report on the recent progress in computing bivariate polynomial resultants on Graphics processing Units (GPU). Given two polynomials in Z[x, y], our algorithm first maps the polynomials to a prime fie...
详细信息
ISBN:
(纸本)9783642131189
In this paper we report on the recent progress in computing bivariate polynomial resultants on Graphics processing Units (GPU). Given two polynomials in Z[x, y], our algorithm first maps the polynomials to a prime field. then, each modular image is processed individually. the GPU evaluates the polynomials at a number of points and computes univariate modular resultants in parallel. the remaining "combine" stage of the algorithm is executed sequentially on the host machine. Porting this stage to the graphics hardware is an object of ongoing research. Our algorithm is based on an efficient modular arithmetic from [1]. Withthe theory of displacement structure we have been able to parallelize the resultant algorithm up to a very fine scale suitable for realization on the GPU. Our benchmarks show a substantial speed-up over a host-based resultant algorithm [2] from CGAL (***).
暂无评论