Ultralightweight Thread (uThread) is a library package designed and optimized for user-level management of parallelism in a single application program running on distributed-memory computers. Existing process management systems incur an unnecessarily high cost when used for the type of parallelism exploited within an application. By reducing the overhead of ownership protection and frequent context switches, uThread encourages both simplicity and performance. In addition, uThread provides scheduling support for balancing the system load. The uThread package reduces the cost of parallelism management to nearly the lower bound. The package has run successfully on most distributed-memory computers, such as the Intel iPSC/860, Touchstone Delta, NCUBE, and TMC CM-5.
The ParaScope parallel programming environment, developed to support scientific programming of shared-memory multiprocessors, includes a collection of tools that use global program analysis to help users develop and debug parallel programs. This paper focuses on ParaScope's compilation system, its parallel program editor, and its parallel debugging system. The compilation system extends the traditional single-procedure compiler by providing a mechanism for managing the compilation of complete programs. Thus, ParaScope can support both traditional single-procedure optimization and optimization across procedure boundaries. The ParaScope editor brings both compiler analysis and user expertise to bear on program parallelization. It assists the knowledgeable user by displaying and managing analysis and by providing a variety of interactive program transformations that are effective in exposing parallelism. The debugging system detects and reports timing-dependent errors, called data races, in the execution of parallel programs. The system combines static analysis, program instrumentation, and run-time reporting to provide a mechanical system for isolating errors in parallel program executions. Finally, we describe a new project to extend ParaScope to support programming in Fortran D, a machine-independent parallel programming language intended for use with both distributed-memory and shared-memory parallel computers.
In this paper we consider parallel numerical integration algorithms for multi-dimensional integrals. A new hyper-rectangle selection strategy is proposed for the implementation of globally adaptive parallel quadrature algorithms. The well-known master-slave parallel algorithm prototype is used to realize the algorithm. Numerical results on the SP2 computer and on a cluster of workstations are reported. A test problem in which the integrand has a strong corner singularity is investigated. A modified parallel integration algorithm is proposed in which a list of subproblems is distributed among the slave processors.
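For orientation, a minimal sequential sketch of globally adaptive quadrature over hyper-rectangles is given here. It uses the textbook "refine the box with the largest error estimate" rule, which is an assumption on our part and not the new selection strategy proposed in the paper. The midpoint rule, the coarse-versus-refined error heuristic, and all function names (`midpoint_estimate`, `split`, `evaluate`, `adaptive_integrate`) are likewise illustrative choices, and the master-slave distribution is not modeled.

```python
import heapq
from itertools import count

def midpoint_estimate(f, lo, hi):
    """Midpoint-rule estimate of the integral of f over the box [lo, hi]."""
    volume = 1.0
    for a, b in zip(lo, hi):
        volume *= (b - a)
    center = [(a + b) / 2.0 for a, b in zip(lo, hi)]
    return volume * f(center)

def split(lo, hi):
    """Bisect the box along its longest edge and return the two halves."""
    widths = [b - a for a, b in zip(lo, hi)]
    k = widths.index(max(widths))
    mid = (lo[k] + hi[k]) / 2.0
    left_hi = list(hi)
    left_hi[k] = mid
    right_lo = list(lo)
    right_lo[k] = mid
    return (list(lo), left_hi), (right_lo, list(hi))

def evaluate(f, lo, hi):
    """Return (estimate, error): refined estimate and |refined - coarse|."""
    coarse = midpoint_estimate(f, lo, hi)
    (l1, h1), (l2, h2) = split(lo, hi)
    fine = midpoint_estimate(f, l1, h1) + midpoint_estimate(f, l2, h2)
    return fine, abs(fine - coarse)

def adaptive_integrate(f, lo, hi, tol=1e-6, max_boxes=20000):
    """Globally adaptive loop: always refine the box with the largest error."""
    tie = count()  # unique tie-breaker so the heap never compares lists
    est, err = evaluate(f, lo, hi)
    heap = [(-err, next(tie), est, list(lo), list(hi))]
    total_err = err
    while total_err > tol and len(heap) < max_boxes:
        neg_err, _, _, blo, bhi = heapq.heappop(heap)
        total_err += neg_err  # remove the parent's error contribution
        for clo, chi in split(blo, bhi):
            c_est, c_err = evaluate(f, clo, chi)
            total_err += c_err
            heapq.heappush(heap, (-c_err, next(tie), c_est, clo, chi))
    # Recompute the sums from the final box list to avoid accumulated drift.
    total = sum(item[2] for item in heap)
    error = sum(-item[0] for item in heap)
    return total, error
```

In a master-slave setting, the heap would typically live on the master, with the `evaluate` calls farmed out to the slaves; how boxes are selected and batched for distribution is exactly where the strategies discussed in the abstract would differ from this sketch.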
In this paper we describe the implementation of adaptive numerical algorithms for multi-dimensional quadrature on distributed-memory parallel systems. The algorithms are targeted at clusters of workstations with standard message-passing interfaces, e.g., PVM or MPI. The most important issues are communication and load balancing. Both static and dynamic partitioning of the region are considered. Numerical results on various workstation clusters are reported.
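To illustrate why load balancing matters when comparing static and dynamic partitioning, the following toy model may help. It is a sketch under stated assumptions, not the paper's implementation: subregion costs are given as plain numbers, communication is ignored, no PVM or MPI calls are made, and the function names `static_makespan` and `dynamic_makespan` are hypothetical.

```python
import heapq

def static_makespan(costs, workers):
    """Static partitioning: contiguous, equal-count blocks fixed in advance."""
    n = len(costs)
    loads = []
    for w in range(workers):
        start = w * n // workers
        stop = (w + 1) * n // workers
        loads.append(sum(costs[start:stop]))
    return max(loads)

def dynamic_makespan(costs, workers):
    """Dynamic partitioning: the first idle worker takes the next subregion."""
    finish = [0.0] * workers  # time at which each worker becomes idle
    heapq.heapify(finish)
    for c in costs:
        t = heapq.heappop(finish)  # earliest-idle worker
        heapq.heappush(finish, t + c)
    return max(finish)

# Two expensive subregions (e.g. near a singularity) followed by cheap ones.
costs = [8, 8, 1, 1, 1, 1, 1, 1]
print(static_makespan(costs, 2), dynamic_makespan(costs, 2))
```

With these example costs, the static split assigns both expensive subregions to the same worker, while the dynamic scheme spreads them out. In a real message-passing setting the dynamic scheme pays extra communication for each task request, which this model deliberately omits.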