ISBN:
(Print) 0769523129
In this paper we present an efficient algorithm for compile-time scheduling and clustering of parallel programs onto parallel-processing systems with distributed memory, called Dynamic Critical Path Scheduling (DCPS). DCPS is superior to several other algorithms from the literature in terms of computational complexity, processor consumption, and solution quality. DCPS has a time complexity of O(e + v log v), as opposed to O((e + v) log v) for the DSC algorithm, the best previously known algorithm.
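To make the general technique concrete, here is a minimal sketch of critical-path-driven list scheduling on a task DAG. This is an illustration of the idea, not the authors' DCPS: the data layout (`succ`, `cost` dictionaries), the priority rule, and the omission of communication delays are all simplifying assumptions.

```python
# Illustrative critical-path list scheduling on a DAG (not the paper's DCPS).
# Communication delays between processors are omitted for brevity.

def critical_path_lengths(tasks, succ, cost):
    """Longest computation-cost path from each task to a DAG exit node."""
    cp = {}
    def length(t):
        if t not in cp:
            cp[t] = cost[t] + max((length(s) for s in succ.get(t, [])), default=0)
        return cp[t]
    for t in tasks:
        length(t)
    return cp

def schedule(tasks, succ, cost, n_procs):
    """Greedy list scheduling: ready task with the longest critical path first."""
    parents = {t: set() for t in tasks}
    for t, children in succ.items():
        for c in children:
            parents[c].add(t)
    remaining = {t: set(ps) for t, ps in parents.items()}
    cp = critical_path_lengths(tasks, succ, cost)
    proc_free = [0.0] * n_procs           # earliest free time per processor
    finish, order = {}, []
    ready = [t for t in tasks if not remaining[t]]
    while ready:
        ready.sort(key=lambda u: -cp[u])  # highest critical-path priority first
        t = ready.pop(0)
        p = min(range(n_procs), key=proc_free.__getitem__)
        start = max([proc_free[p]] + [finish[q] for q in parents[t]])
        finish[t] = start + cost[t]
        proc_free[p] = finish[t]
        order.append((t, p, start))
        for s in succ.get(t, []):         # release successors whose parents are done
            remaining[s].discard(t)
            if not remaining[s]:
                ready.append(s)
    return order, max(finish.values())
```

On a diamond-shaped DAG this prefers the branch with the longer downstream path, which is exactly what a critical-path priority is meant to achieve.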
We consider a distributed asynchronous system in which processes can communicate only by message passing and need a coherent view of the load (e.g., workload, memory) of the others in order to make dynamic (scheduling) decisions. We present several mechanisms for obtaining a distributed view of such information, based either on continuously maintaining that view or on building it on demand with a snapshot algorithm. We perform an experimental study in the context of a real application: an asynchronous parallel solver for large sparse systems of linear equations.
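The two strategies the abstract contrasts can be sketched in a toy message-passing model. The `Process` class, the message format, and the use of a local queue as a stand-in for a channel are illustrative assumptions; a real demand-driven implementation would run a consistent-snapshot protocol over asynchronous channels rather than reading state directly.

```python
# Toy model: maintained view (push every load change) vs. demand-driven
# collection. Purely illustrative; not the paper's protocol.

import queue

class Process:
    def __init__(self, pid, load):
        self.pid, self.load = pid, load
        self.inbox = queue.Queue()       # stands in for a message channel
        self.view = {pid: load}          # maintained view of all loads

    def broadcast_load(self, processes):
        """Maintained-view strategy: push the current load to every peer."""
        for p in processes:
            if p is not self:
                p.inbox.put(("load", self.pid, self.load))

    def drain_inbox(self):
        """Apply received load updates to the local view."""
        while not self.inbox.empty():
            _tag, pid, load = self.inbox.get()
            self.view[pid] = load

def demand_snapshot(processes):
    """Demand-driven strategy: query each process for its load when needed."""
    return {p.pid: p.load for p in processes}
```

The trade-off the paper studies is visible even here: the maintained view pays a message per load change, while the demand-driven view pays the full collection cost at each decision point.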
While previous work has shown MPI to provide capabilities for system software, actual adoption has not occurred widely. We discuss process-management shortcomings in MPI implementations and their impact on MPI's usability for system software and management tasks. We introduce MPISH, a parallel shell designed to address these issues.
We construct parallel algorithms, with implementations, to solve the clique problem in practice, and study their computing times compared with sequential algorithms. The parallel algorithms are implemented in Java using threads. The best efficiency is achieved by solving the task-scheduling problem with task pools.
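A task-pool parallelization of maximum-clique search can be sketched as follows, in Python threads rather than the authors' Java implementation; the seeding strategy (one branch per starting vertex) and the simple size bound are assumptions, not the paper's design.

```python
# Task-pool sketch of parallel maximum-clique search (illustrative only).

import threading, queue

def max_clique(adj):
    """Return the size of the largest clique in an adjacency-matrix graph."""
    n = len(adj)
    tasks = queue.Queue()
    best = [0]
    lock = threading.Lock()

    # Seed the pool: one branch-and-bound task per starting vertex.
    for v in range(n):
        tasks.put(([v], [u for u in range(v + 1, n) if adj[v][u]]))

    def expand(clique, cands):
        with lock:
            if len(clique) > best[0]:
                best[0] = len(clique)
        for i, v in enumerate(cands):
            if len(clique) + len(cands) - i <= best[0]:
                break  # bound: too few candidates left to beat the best
            expand(clique + [v], [u for u in cands[i + 1:] if adj[v][u]])

    def worker():
        while True:
            try:
                clique, cands = tasks.get_nowait()
            except queue.Empty:
                return  # pool is static here, so empty means done
            expand(clique, cands)

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return best[0]
```

The pool balances load because threads pull the next branch as soon as they finish one, which is the scheduling benefit the abstract attributes to task pools.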
We present a new approach for a function allocation manager supporting multiple reconfigurable devices. It is a rule-based approach that considers resource usage and power consumption. The central component, and the focus of this contribution, is a retrieval unit for function requests. This unit adopts methodologies from the domain of knowledge-based systems by accounting for quality-of-service-related parameters in the calling application's function request.
We present fast and scalable parallel computations for a number of important and fundamental matrix problems on distributed memory systems (DMS). These problems include computing the powers, the inverse, the characteristic polynomial, the determinant, the rank, the Krylov matrix, and an LU- and a QR-factorization of a matrix, and solving linear systems of equations. These parallel computations are based on efficient implementations of the fastest sequential matrix multiplication algorithm on DMS. We show that compared with the best known time complexities on PRAM, our parallel matrix computations achieve the same speeds on distributed memory parallel computers (DMPC), and have an extra polylog factor in the time complexities on DMS with hypercubic networks. Furthermore, our parallel matrix computations are fully scalable on DMPC and highly scalable over a wide range of system size on DMS with hypercubic networks. Such fast and scalable parallel matrix computations were not seen before on any distributed memory systems.
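The constructions above rest on an efficient implementation of a fast sequential matrix multiplication routine. As a reminder of what such a routine looks like, here is a plain-Python sketch of Strassen's recursion for square matrices whose size is a power of two; the paper's actual DMS implementation is not described in the abstract, so this is purely illustrative background.

```python
# Strassen's matrix multiplication (7 recursive products instead of 8).
# Requires n-by-n inputs with n a power of two; illustrative, not optimized.

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quad(M, r, c):
        return [row[c:c + h] for row in M[r:r + h]]
    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    def sub(X, Y):
        return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quad(A, 0, 0), quad(A, 0, h), quad(A, h, 0), quad(A, h, h)
    B11, B12, B21, B22 = quad(B, 0, 0), quad(B, 0, h), quad(B, h, 0), quad(B, h, h)
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    return [C11[i] + C12[i] for i in range(h)] + [C21[i] + C22[i] for i in range(h)]
```

Seven recursive multiplications instead of eight is what drops the exponent below 3, which is the property the paper's parallel constructions exploit.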
This paper describes the design of the OpenSTARS real-time analysis tool. The paper focuses on criteria for a good analysis tool including correctness, performance/scalability, flexibility, and extensibility. Several leading real-time analysis tools are surveyed and several problems with the tools under these criteria are identified. The paper then presents the basic components and operation of OpenSTARS and how its design addresses these problems.
In a few short years, computers capable of over one petaflops of performance will become a reality. The most likely approach for first reaching this performance level will involve several thousand parallel-processing elements. What are the key considerations for building such systems? What are the software requirements and demands? How will applications scale? How reliable are they likely to be? What will they be good for? We address these questions and more based on early experience with the BlueGene system.
Scheduling is a fundamental issue in achieving high performance on metacomputers and computational grids. For the first time, the job scheduling problem for grid computing on metacomputers is studied as a combinatorial optimization problem. It is proven that the list scheduling algorithm achieves a reasonable worst-case performance bound in grid environments supporting distributed supercomputing with large applications. It is also observed that communication heterogeneity has a significant impact on schedule lengths.
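For background on the kind of worst-case bound at issue, Graham's classical list-scheduling rule on identical machines can be sketched as follows. The paper's grid model (heterogeneous processors and communication) is richer, so this uniform-machine version is only an illustration of the baseline technique.

```python
# Graham-style list scheduling on m identical machines (illustrative baseline).
# Its classical guarantee: makespan <= (2 - 1/m) * optimum.

import heapq

def list_schedule(jobs, m):
    """Assign each job, in list order, to the machine that frees up earliest.
    Returns the resulting makespan."""
    machines = [0.0] * m      # current finish time of each machine
    heapq.heapify(machines)
    for t in jobs:
        earliest = heapq.heappop(machines)
        heapq.heappush(machines, earliest + t)
    return max(machines)
```

For example, jobs of lengths [2, 2, 2, 3] on two machines yield a makespan of 5, within the (2 - 1/2) factor of the optimum.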