ISBN: 0818678763 (print)
Data-parallel applications are usually programmed in the SPMD paradigm using a message-passing system such as MPI or PVM. However, programming with message-passing primitives is still tedious and error-prone. This paper presents an abstraction of message-passing programming in C++ to relieve programmers of low-level considerations. The runtime overhead introduced by the abstraction is shown to be negligible.
Symbolic applications such as expert systems, theorem provers, and computer algebra exhibit dynamic, tree-structured behavior with respect to control and data structures. This makes it difficult to parallelize such a program and run it efficiently on a parallel computer, especially one with distributed memory. This paper introduces a semi-automatic mapping environment providing a set of support tools, intended for application to large, real-life programs. Mapping can perform adaptive granularity control, dynamic load balancing, and scheduling on parallel programs with dynamic data and control behavior, providing a set of strategies for all components. A set of mapping rules is extracted, describing which strategy is appropriate in which situation. The approach systematically selects and configures its strategies to suit the characteristics of the application and is thus superior to a universal heuristic. (C) 1996 Academic Press Limited
The sustained performance of superscalar microprocessors amounts to only a fraction of their peak performance rating. In parallel computers built from them, this discrepancy is even more dramatic. Reaching a satisfactory sustained performance for the single processor is mainly a compiler problem. The sustained performance of parallel computers also depends on other components of the architecture, such as the interconnect and the operating system. It is shown how, through a combination of innovative architectural solutions, the sustained performance of a distributed memory parallel computer can be significantly improved. The key to effective latency hiding by overlapping communication and computation is the operating system. The programmability of such architectures can be enhanced by providing the programmer with parallelizing compilers and/or a global address space provided by virtual shared memory. All these measures have been incorporated in the MANNA computer described in the paper. Benchmark performance figures obtained with it are reported.
Automatic fault diagnosis in power systems presents real challenges to computing technologies. As an alternative approach to expert systems, several neural network solutions have been proposed recently. This paper describes a modular, neural network-based solution to power system alarm handling and fault diagnosis that overcomes the limitations of 'toy' alternatives constrained to small, fixed-topology electrical networks. In contrast to monolithic diagnosis systems, the neural network-based approach presented here fulfills the scalability and dynamic adaptability requirements of the application. Mapping the power grid onto a set of interconnected modules that model the functional behaviour of electrical equipment provides the flexibility and speed demanded by the problem. The way in which the neural system is conceived allows full scalability to real-size power systems.
This paper describes the analysis tool DSPNexpress, which has been developed at the Technische Universität Berlin since 1991. The development of DSPNexpress has been motivated by the lack of a powerful software package for the numerical solution of deterministic and stochastic Petri nets (DSPNs) and by the complexity requirements imposed by evaluating memory consistency models for multicomputer systems. The development of DSPNexpress has benefited from the author's experience with version 1.4 of the software package GreatSPN. However, in contrast to GreatSPN, the software architecture of DSPNexpress is particularly tailored to the numerical evaluation of DSPNs. Furthermore, DSPNexpress contains a graphical interface running under the X11 window system. To the best of the author's knowledge, DSPNexpress is the first software package that contains an efficient numerical algorithm for computing steady-state solutions of DSPNs.
This paper describes a generalization of the "labelling" search strategy and its application to scheduling problems. The assignment of a value to the selected variable is replaced by reduction of the domain ...
This article outlines a graphical software package for performance and dependability modeling with deterministic and stochastic Petri nets (DSPNs). The package is called DSPNexpress because its main scientific contribution lies in its efficient numerical solution component. Its development has been motivated by the lack of a software package for efficient numerical analysis of DSPNs and by the complexity requirements imposed by evaluating design alternatives for hardware and software components of multicomputer systems. Version 1.3 of DSPNexpress solves complex DSPNs with four orders of magnitude less CPU time than the previously known numerical method. As a consequence, DSPNexpress can calculate steady-state solutions of complex DSPNs with reasonable computational effort on a modern workstation (e.g., a DSPN with 100,000 tangible markings and 5,000,000 state transitions with mild stiffness can be solved in 1 hour of CPU time on a Sun SPARCstation 2). The article summarizes the main innovative features of the software package DSPNexpress.
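To make the term "steady-state solution" concrete, the sketch below computes the stationary distribution of a small discrete-time Markov chain by power iteration. This is a generic textbook technique and not the DSPNexpress algorithm, which the abstract does not describe. The three-state transition matrix `P` and the function name `steady_state` are invented purely for illustration.

```python
def steady_state(P, tol=1e-12, max_iter=10_000):
    """Power iteration: repeat pi <- pi * P until pi stops changing.

    P is a row-stochastic matrix given as a list of lists.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Hypothetical 3-state chain (illustration only, not taken from the article).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

if __name__ == "__main__":
    print(steady_state(P))
```

For a model of the size quoted in the abstract, a dense matrix like this one would be impractical; a sparse representation would be the natural choice, since 5,000,000 transitions over 100,000 markings is on average 50 entries per row.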
SUPRENUM is a highly parallel supercomputer for numerical applications. The 5-GFLOPS peak performance of the 256-node system made it the most powerful MIMD architecture of the 'first generation.' Each node is a complete, single-board vector machine with 20 MFLOPS peak performance (IEEE double precision). SUPRENUM is a distributed memory architecture, resulting in a highly scalable system that can be made fault-tolerant. Message passing is accelerated by dedicated communication hardware in each node. Array access is performed by an 'intelligent' DMA address generator. The SUPRENUM architecture was the first to be based on a two-level interconnection structure, consisting of a number of clusters, each of which consists of a number of nodes. At the cluster level, the nodes are interconnected by two very fast parallel buses. At the system level, the clusters are interconnected by a torus structure consisting of serial ring buses. The nodes run under the proprietary, distributed PEACE operating system. Significant efforts were undertaken to make the system programmable by providing a host of software tools, libraries, and application software packages. The paper discusses the rationale for the SUPRENUM architecture, the goals achieved, and the lessons learned.
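The quoted peak figures can be cross-checked with a few lines of arithmetic: 256 nodes at 20 MFLOPS each give 5.12 GFLOPS, consistent with the rounded "5-GFLOPS" figure. The sketch below also illustrates how a two-level structure lets a flat node index be split into a cluster number and a position within that cluster. The cluster size of 16 is an assumption made for illustration, since the abstract does not state it, and all function names are hypothetical.

```python
# Figures taken from the abstract.
NODES = 256
PEAK_MFLOPS_PER_NODE = 20

# Assumption for illustration only: the abstract does not give the cluster size.
NODES_PER_CLUSTER = 16


def system_peak_gflops(nodes: int, mflops_per_node: int) -> float:
    """Aggregate peak performance in GFLOPS (1 GFLOPS = 1000 MFLOPS)."""
    return nodes * mflops_per_node / 1000


def locate(node_index: int, nodes_per_cluster: int = NODES_PER_CLUSTER) -> tuple[int, int]:
    """Split a flat node index into (cluster, position within cluster)."""
    return divmod(node_index, nodes_per_cluster)


def same_cluster(a: int, b: int) -> bool:
    """True if both nodes sit in the same cluster (cluster-level buses suffice)."""
    return locate(a)[0] == locate(b)[0]


if __name__ == "__main__":
    print(system_peak_gflops(NODES, PEAK_MFLOPS_PER_NODE))  # 5.12
    print(locate(37))                                       # (2, 5)
    print(same_cluster(3, 12), same_cluster(3, 40))         # True False
```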
The parLisp system consists of a parallel run-time system and a set of tools for parallel symbolic processing with Lisp on the distributed memory parallel machine MANNA. The parallel run-time system provides and imple...
The paper demonstrates the advantages of having two processors in the node of a distributed memory architecture, one for computation and one for communication. The architecture of such a dual-processor node is discussed. To fully exploit the potential for parallel execution of computation threads and communication threads, a novel, compiler-optimized IPC mechanism allows for an unbuffered no-wait send and a prefetched receive without the danger of semantics violation. It is shown how an optimized parallel operating system can be constructed such that the application processor's involvement in communication is kept to a minimum while the utilization of both processors is maximized. The MANNA implementation results in an effective message start-up latency of only 1 to 4 microseconds. It is also shown how the dual-processor node is utilized to efficiently realize virtual shared memory.
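The division of labour between a computation thread and a communication thread can be illustrated with a minimal sketch. It is not the MANNA IPC mechanism: the abstract describes an unbuffered send, whereas this sketch uses an ordinary buffered queue, and Python threads only stand in for the two processors. The names `communication_worker` and `run_node` are hypothetical.

```python
import queue
import threading


def communication_worker(outbox: queue.Queue, delivered: list) -> None:
    """Stands in for the communication processor: drains the outbox."""
    while True:
        message = outbox.get()
        if message is None:        # sentinel: no more messages
            break
        delivered.append(message)  # a real system would hand this to the network


def run_node(work_items: list) -> tuple:
    """Compute on the calling thread while a second thread handles sends."""
    outbox: queue.Queue = queue.Queue()  # unbounded, so put() never blocks
    delivered: list = []
    comm = threading.Thread(target=communication_worker, args=(outbox, delivered))
    comm.start()

    results = []
    for item in work_items:
        result = item * item       # computation on the "application processor"
        outbox.put(result)         # no-wait send: returns immediately
        results.append(result)     # computation continues without blocking

    outbox.put(None)               # tell the communication thread to finish
    comm.join()
    return results, delivered


if __name__ == "__main__":
    print(run_node([1, 2, 3]))
```

The point of the structure is that the computing thread only pays the cost of handing a message over, while delivery proceeds concurrently on the other thread.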