This paper describes an experimental message-driven programming system for fine-grain multicomputers. The initial target architecture is the J-machine designed at MIT. This machine combines a unique collection of architectural features that include fine-grain processes, on-chip associative memory, and hardware support for process synchronization. The programming system uses these mechanisms via a simple message-driven process model that blurs the distinction between processes and messages: messages correspond to processes that are executed elsewhere in the network. This model allows code and data to be distributed across the computers in the machine, and is supported at every stage of the program development cycle. The prototype system we have developed includes a basic set of programming tools to support the model; these include a compiler, linker, archiver, loader and microkernel. Although the concepts are language independent, our prototype system is based on GNU-C.
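The message-driven model in the abstract above, where sending a message amounts to starting a process on another node, can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the `Machine` class, the `store` and `add_and_forward` handlers, and the single FIFO queue are hypothetical names and simplifications introduced here, not part of the J-machine tools or the prototype system.

```python
from collections import deque


class Machine:
    """Toy multicomputer: every node has private memory; one FIFO holds all messages."""

    def __init__(self, num_nodes):
        self.memory = [dict() for _ in range(num_nodes)]
        self.queue = deque()  # pending messages: (destination node, handler, args)

    def send(self, node, handler, *args):
        # Sending a message is equivalent to creating a process on `node`.
        self.queue.append((node, handler, args))

    def run(self):
        # Each delivered message runs to completion as a short-lived process.
        while self.queue:
            node, handler, args = self.queue.popleft()
            handler(self, node, *args)


def store(machine, node, key, value):
    # Handler (hypothetical): write a value into the destination node's memory.
    machine.memory[node][key] = value


def add_and_forward(machine, node, key, total, reply_node):
    # Handler (hypothetical): add the local value, then continue on the next node.
    total += machine.memory[node][key]
    next_node = node + 1
    if next_node < len(machine.memory):
        machine.send(next_node, add_and_forward, key, total, reply_node)
    else:
        machine.send(reply_node, store, "sum", total)


def distributed_sum(values):
    # Distribute one value per node, then let a message walk the nodes to sum them.
    machine = Machine(len(values))
    for node, value in enumerate(values):
        machine.send(node, store, "x", value)
    machine.send(0, add_and_forward, "x", 0, 0)
    machine.run()
    return machine.memory[0]["sum"]
```

In this sketch the computation has no long-lived process: the running total travels inside the message, and each node only contributes its local data when the message arrives.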
This paper describes the Scalable Concurrent Programming Library (SCPlib), a basic technology that supports irregular applications on scalable concurrent hardware and heterogeneous computing environments. The library is optimized to take advantage of the best available underlying communication and synchronization on a variety of high-performance multicomputers, shared-memory multiprocessors, and networked PCs and workstations. It also provides a framework for heterogeneous communication and file I/O, load balancing, and dynamic granularity control. The effectiveness of the library has been demonstrated on a variety of industrial-strength applications.
We address the problem of performing a pipelined broadcast on a mesh architecture. Meshes require a different approach than other topologies, and their very nature puts a tighter bound on the performance that one can hope to achieve. By using the appropriate techniques, however, one can obtain excellent performance for sufficiently long messages. The resulting algorithm will work on meshes of any dimension with any number of nodes. Our model assumes that the mesh is a torus and/or that it has bidirectional links and uses wormhole routing. Performance data from the Cray T3D are included.
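To illustrate why pipelining pays off for sufficiently long messages, the sketch below uses a common textbook cost model rather than the mesh algorithm of the paper, which the abstract does not spell out. The assumptions are: a linear chain of `p` nodes, a per-packet startup cost `alpha`, a per-byte cost `beta`, and a message of `n` bytes split into `k` equal packets. All function and parameter names are hypothetical.

```python
def pipelined_time(p, n, k, alpha, beta):
    """Estimated time to broadcast n bytes down a chain of p nodes using k packets.

    Assumed model: the first packet needs p - 1 hops to reach the last node and
    the remaining k - 1 packets follow one step apart, giving p - 2 + k steps,
    each costing alpha + beta * (n / k).
    """
    steps = p - 2 + k
    return steps * (alpha + beta * n / k)


def best_packet_count(p, n, alpha, beta, max_k=1000):
    """Brute-force search for the packet count that minimizes the modeled time."""
    return min(range(1, max_k + 1),
               key=lambda k: pipelined_time(p, n, k, alpha, beta))
```

Under this model, `k = 1` is the unpipelined case. For short messages the startup term dominates and splitting does not help, while for long messages the best `k` grows roughly as the square root of `(p - 2) * n * beta / alpha`, so the total time approaches the cost of sending the message over a single link.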
We describe the implementation of the cell multipole method (CMM) in a complete molecular dynamics (MD) simulation program (MPSim) for massively parallel supercomputers. Tests are made of how the program scales with s...
In the paper we give a straightforward, highly efficient, scalable implementation of common matrix multiplication operations. The algorithms are much simpler than previously published methods, yield better performance...
This paper describes computational techniques for concurrent Direct Simulation Monte Carlo (DSMC) of neutral flow inside three-dimensional plasma reactors. These techniques are designed to reduce the overall time to o...