ISBN: 0769521320 (print)
The memory usage of sparse direct solvers can be the bottleneck in solving large-scale problems. This paper describes dynamic scheduling strategies that aim to reduce the memory usage of a parallel direct solver. Experiments show that, combined with static modifications of the task dependency graph, such techniques have good potential to reduce the memory usage of a parallel multifrontal solver, MUMPS.
ISBN: 0769521320 (print)
Automatic data distribution for composite grid applications requires that meshes be mapped to a parallel machine in a manner that balances the load while minimizing communication. To provide initial support for mapping to computational grids, the Automatic Data Distribution Toolkit (ADDT) supports heterogeneous environments with varying communication and computational capabilities. This paper describes the applications supported by ADDT, the mapping techniques used, the web interface for the toolkit, and planned future developments.
ISBN: 0769522106 (print)
We consider the problem of parallel execution of the Join operation on a J2EE cluster. J2EE clusters are intended for coarse-grain distributed processing of multiple queries and business transactions over the Web. Thus, the possibility of using a J2EE cluster for fine-grain parallel computations (parallel Joins in our case) is intriguing and of practical interest. We have developed a new variant of the SFR algorithm for parallel computation of the Cartesian Product in Join operations and proved its optimality in terms of communication/execution-time tradeoffs via a simple lower bound. Our experimental results show that, although J2EE is considered a platform that uses complex interfaces and software entities, such as various types of Java beans, J2EE clusters can be used efficiently to execute Join operations in parallel.
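As a rough illustration of how a Cartesian-product join can be split across workers, the sketch below assumes that SFR stands for symmetric fragment-and-replicate, which the abstract does not spell out. It is not the paper's variant and involves no J2EE components; the names `split`, `sfr_join`, `rows`, and `cols` are hypothetical, and the workers are simulated by a plain loop.

```python
from itertools import product

def split(items, parts):
    """Round-robin split of a list into `parts` fragments."""
    return [items[i::parts] for i in range(parts)]

def sfr_join(r, s, predicate, rows, cols):
    """Join r and s on a rows x cols grid of simulated workers (illustrative only)."""
    r_frags = split(r, rows)  # fragment i is replicated along grid row i
    s_frags = split(s, cols)  # fragment j is replicated along grid column j
    result = []
    for i, j in product(range(rows), range(cols)):
        # Worker (i, j) handles one block of the Cartesian product R_i x S_j.
        local = [(a, b) for a in r_frags[i] for b in s_frags[j] if predicate(a, b)]
        result.extend(local)
    return result
```

Every pair (a, b) falls in exactly one block, so the union of the local results equals the full join. Each worker receives roughly |R|/rows + |S|/cols tuples, so the grid shape trades communication volume against per-worker work.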
ISBN: 0769521320 (print)
The scheduling problem deals with the optimal assignment of a set of tasks to processing elements in a distributed system such that the total execution time is minimized. One approach to solving the scheduling problem is task clustering, which assigns tasks to clusters, where each cluster runs on a single processor. This paper aims to show the feasibility of using Genetic Algorithms, which are robust optimization and search techniques, for task clustering to solve the scheduling problem. The proposed approach shows great promise for solving the clustering problem across a wide range of clustering instances.
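To make the idea of genetic task clustering concrete, here is a minimal sketch under stated assumptions; it is not the paper's algorithm. The chromosome encoding (one cluster id per task), the makespan evaluation, the operators, and all names and parameter values (`makespan`, `ga_cluster`, `pop_size`, the mutation rate) are illustrative choices. Tasks are assumed to be numbered in topological order.

```python
import random

def makespan(assign, weights, edges):
    """Finish time when tasks run in index order, one processor per cluster."""
    finish = [0.0] * len(weights)
    ready = {}  # cluster id -> time at which its processor becomes free
    preds = {t: [] for t in range(len(weights))}
    for (u, v), c in edges.items():
        preds[v].append((u, c))
    for t, w in enumerate(weights):
        start = ready.get(assign[t], 0.0)
        for u, c in preds[t]:
            # Communication cost is paid only between different clusters.
            start = max(start, finish[u] + (0 if assign[u] == assign[t] else c))
        finish[t] = start + w
        ready[assign[t]] = finish[t]
    return max(finish)

def ga_cluster(weights, edges, n_clusters, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    n = len(weights)
    cost = lambda a: makespan(a, weights, edges)
    # Seed the population with the fully serial clustering plus random ones.
    pop = [[0] * n] + [[rng.randrange(n_clusters) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=cost)
        nxt = pop[:2]  # elitism: the two best survive unchanged
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 3), key=cost)  # tournament selection
            p2 = min(rng.sample(pop, 3), key=cost)
            cut = rng.randrange(1, n)               # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.3:                  # point mutation
                child[rng.randrange(n)] = rng.randrange(n_clusters)
            nxt.append(child)
        pop = nxt
    best = min(pop, key=cost)
    return best, cost(best)
```

Because the serial clustering is in the initial population and the best individuals are always kept, the returned makespan never exceeds the sum of the task weights.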
ISBN: 3540241280 (print)
A new ZKp identity protocol is proposed in this paper. It is more appropriate than the traditional identity protocol in a distributed environment without a common trusted third party. The security of this protocol relies on the discrete logarithm problem on conics over finite fields. It can be designed and implemented more easily than protocols based on elliptic curves. A simple solution is proposed to prevent a potential leak in our protocol.
ISBN: 0769521320 (print)
We study two parallel computing models, the Bulk Synchronous Parallel (BSP) model and the Queued Shared Memory (QSM) model, as alternatives to the PRAM model that provide more accurate performance predictions and analyses, and compare the two models in detail. As a case study, we consider a simple hashing problem, design two versions of the algorithm (a message-passing version and a shared-memory version), and compare their running times analytically. The message-passing version of the algorithm is implemented, and experiments are performed to show the accuracy and the limitations of the predicted performance.
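For readers unfamiliar with the two models, the sketch below evaluates their usual textbook cost formulas; the abstract itself gives no formulas, so these are the standard definitions rather than the paper's analysis. In BSP a superstep costs w + g·h + L, and in QSM a phase costs max(m_op, g·m_rw, κ). The function names and the input layout are assumptions made for illustration.

```python
def bsp_cost(supersteps, g, L):
    """Total BSP cost.

    supersteps: list of (w, h), where w is the maximum local work and h the
    maximum number of words sent or received by any processor.
    g: communication gap per word; L: barrier synchronization latency.
    """
    return sum(w + g * h + L for w, h in supersteps)

def qsm_cost(phases, g):
    """Total QSM cost.

    phases: list of (m_op, m_rw, kappa), where m_op is the maximum local
    operations, m_rw the maximum shared-memory reads/writes by any processor,
    and kappa the maximum contention at any memory location.
    """
    return sum(max(m_op, g * m_rw, kappa) for m_op, m_rw, kappa in phases)
```

Comparing the two predictions for the same algorithm shows where the models differ: BSP charges an explicit synchronization term L per superstep, while QSM charges for contention through κ.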
ISBN: 0769521320 (print)
A symmetric algorithm is proposed for detecting distributed termination in a dynamic system with asynchronous communication networks. Correctness of the algorithm is proven. In the system, active processes may create new processes or accept outside processes to join the basic computation. No process can be destroyed or leave the system until the computation terminates. The network model used by the algorithm is a combination of a logical ring and computation trees. It is more general and especially suitable for applications on the Internet. The algorithm is more efficient than those in previous work in terms of the control messages used in the detection protocol.
ISBN: 0769521320 (print)
This paper presents a self-adapting distributed memory package for computing the Walsh-Hadamard transform (WHT), a prototypical fast signal transform similar to the fast Fourier transform. A family of distributed memory algorithms is derived from different factorizations of the WHT matrix. Different factorizations correspond to different data distributions and communication patterns. Thus, searching over the space of factorizations leads to the best data distribution and communication pattern for a given platform. The distributed memory WHT package provides a framework for converting factorizations of the WHT matrix into MPI programs and exploring their performance by searching the space of factorizations.
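The link between factorizations and access patterns can be seen in a small single-process sketch. It relies on the standard identity that the WHT of size 2^n is a product of n commuting butterfly stages, one per index bit, so applying the stages in any order is a different factorization of the same matrix. This is not the package's code and uses no MPI; the function name `wht_stages` and the `bit_order` parameter are hypothetical.

```python
def wht_stages(x, bit_order):
    """Unnormalized WHT of x (length 2**n), one butterfly stage per bit.

    bit_order must be a permutation of range(n). Each order gives the same
    result but touches memory with a different sequence of strides.
    """
    a = list(x)
    n = len(a)
    for bit in bit_order:
        h = 1 << bit  # distance between the two elements of each butterfly
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
    return a
```

For the input [1, 0, 1, 0, 0, 1, 1, 0], the orders [0, 1, 2] and [2, 0, 1] both return [4, 2, 0, -2, 0, 2, 0, 2]. In a distributed setting, stages whose stride crosses a processor boundary require communication, which is why the choice of factorization matters for performance.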
ISBN: 0769520944 (print)
Low-level language constructs used for expressing explicit communication, concurrency, synchronization, and parallelism make systems difficult to maintain. For example, many programming languages allow programmers to create parallel processes by using the fork/join statement and provide locking mechanisms to synchronize the resulting parallel computation. However, since fork/join may appear anywhere in a program, programs that make unstructured use of these constructs may be difficult to understand and debug. We present a middleware-based approach to distributed coordinated parallel programming. A familiar programming model will be provided to support implicit communication, concurrency, synchronization, and parallelism through an implicit coordination-oriented approach. In other words, programmers do not have to express communication, concurrency, synchronization, and parallelism explicitly when developing distributed systems for parallel processing. In addition, a 4-layered interconnection architecture will be implemented to support the programming model in an integrated manner. The implicit coordination-oriented approach to supporting parallel programming provides a number of benefits. Avoiding the unstructured insertion of low-level language constructs makes programs modular, and modularity improves their maintainability. Our approach also supports portability by allowing programs written in different programming languages to be executed in any general programming environment without modification.
ISBN: 0769521320 (print)
This article addresses stepping commands in parallel debuggers. It reviews the main existing schemes (synchronous and asynchronous step implementations) and introduces a new kind of synchronous scheme that has several advantages over existing ones. In this scheme, the debugger performs dynamic analysis of the program state, thus simplifying the presentation and control of that state. A possible implementation of the suggested scheme in MPI debuggers is discussed, and the existing implementation in the mpC Workshop parallel debugger is presented.