Derivatives of almost arbitrary functions can be evaluated efficiently by automatic differentiation whenever the functions are given in the form of computer programs in a high-level programming language such as Fortran, C, or C++. In contrast to numerical differentiation, where derivatives are only approximated, automatic differentiation generates derivatives that are accurate up to machine precision. Sophisticated software tools implementing the technology of automatic differentiation are capable of automatically generating code for the product of the Jacobian matrix and a so-called seed matrix. It is shown how these tools can benefit from concepts of shared memory programming to parallelize, in a completely mechanical fashion, the gradient operations associated with each statement of the given code. The feasibility of our approach is demonstrated by numerical experiments. They were performed with a code that was generated automatically by the Adifor system and augmented with OpenMP directives.
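The product of the Jacobian with a seed matrix can be illustrated with a small forward-mode sketch. This is not Adifor output and not the authors' code; it is a minimal Python illustration in which the names `Dual`, `jacobian_times_seed`, and the test function `f` are all hypothetical. Each value carries one derivative entry per seed column, and the per-statement loops over those entries are the kind of "gradient operations" the abstract proposes to parallelize (here they run sequentially).

```python
import math
from dataclasses import dataclass


@dataclass
class Dual:
    """A value together with its directional derivatives, one per seed column."""
    val: float
    grad: list

    def __add__(self, other):
        other = _lift(other, len(self.grad))
        # Gradient operation for "+": an elementwise loop over the seed columns.
        return Dual(self.val + other.val,
                    [a + b for a, b in zip(self.grad, other.grad)])

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other, len(self.grad))
        # Product rule, again applied independently to every seed column.
        return Dual(self.val * other.val,
                    [self.val * b + other.val * a
                     for a, b in zip(self.grad, other.grad)])

    __rmul__ = __mul__


def _lift(x, p):
    """Turn a plain number into a Dual with a zero gradient of length p."""
    return x if isinstance(x, Dual) else Dual(float(x), [0.0] * p)


def sin(x):
    """Sine with the chain rule applied to each gradient entry."""
    return Dual(math.sin(x.val), [math.cos(x.val) * g for g in x.grad])


def jacobian_times_seed(f, x, seed):
    """Return J(x) * S, where S is an n-by-p seed matrix given as a list of rows."""
    inputs = [Dual(xi, list(row)) for xi, row in zip(x, seed)]
    return [out.grad for out in f(inputs)]


def f(v):
    """Hypothetical test function R^2 -> R^2."""
    x, y = v
    return [x * y + sin(x), x * x * y]


# With the identity as seed matrix, J * S is the full Jacobian at (1, 2).
J = jacobian_times_seed(f, [1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
```

Because every gradient entry is computed independently of the others, each list comprehension above corresponds to a loop that could be distributed across threads, which is the role the OpenMP directives play in the generated code described in the abstract.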
Thread-level parallelism (TLP) is a key technology for next-generation high-performance processors. Although it provides higher processing capability, the loss of compatibility with existing processors is a crucial issue. This research is motivated by the following two points: (1) TLP requires either multithreaded programming, which is rather difficult for ordinary programmers, or complex compilation technologies that can exploit multithread parallelism, and (2) existing binary codes should be executed efficiently on multithreaded processors. In this paper, we first propose a binary translation system that translates existing binary codes to multithreaded ones and optimizes them dynamically during execution. The system takes the original binary codes as input and translates them to an internal RTL representation. It analyzes the structure of the program and applies multithreading to loop bodies in a thread pipelining manner. A pilot binary translator, which is part of the proposed system, was built for preliminary evaluation. Evaluation results illustrate the effectiveness of the system.
For too long computer programming has been treated as an art or a craft rather than as a science or an engineering discipline. The Kernel Language (KL) approach provides a precise and concise basis for programming in all paradigms (imperative, logical, functional and object-oriented) as well as for parallel, concurrent and distributed multi-thread programming. The Kernel Language is implemented as a subset of Oz, a powerful, multi-paradigm programming language that is similar to Java. This allows us to apply the theory to enhance the art of practical problem solving. KL allows us to introduce multi-thread programming and the major programming paradigms in introductory programming courses. With the rapidly expanding acceptance of the multi-language programming capabilities of dotNET, a revision of traditional introductory programming courses becomes more and more important.
ISBN: (print) 1880446359
The CO2P3S parallel programming system uses design patterns and object-oriented programming to reduce the complexities of parallel programming. The system generates correct frameworks from pattern template specifications and provides a layered programming model to address both the problems of correctness and openness. This paper describes the highest level of abstraction in CO2P3S, using two example programs to demonstrate the programming model and the supported patterns. Further, we introduce phased parallel design patterns, a new class of patterns that allow temporal phase relationships in a parallel program to be specified, and provide two patterns in this class. Our results show that the frameworks can be used to quickly implement parallel programs, reusing sequential code where possible. The resulting parallel programs provide substantial performance gains over their sequential counterparts.
ISBN: (print) 3540663630
We present an integrated environment for the systematic development of parallel and distributed programs. Our approach allows the user to construct complex applications by composing and transforming skeletons, i.e., recurring patterns of task and data parallelism. First academic and commercial experience with skeleton-based systems has demonstrated the benefits of the approach but also the lack of a dedicated set of methods for algorithm design and performance prediction. We take a first step towards such a set of methods by proposing an environment which integrates a framework for algorithm transformation, called FAN, with two existing skeleton-based programming systems: the academic system P3L and its commercial counterpart SkIE.
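The idea of composing and transforming skeletons can be sketched in a few lines. The following is not P3L, SkIE, or FAN syntax; it is a minimal Python illustration under stated assumptions, with hypothetical names `seq`, `farm`, `pipe`, and `fuse_farms`. A skeleton is modeled as a function from a list of inputs to a list of outputs, and the rewrite shown (merging two consecutive farms into one) is only one well-known kind of transformation, not a claim about FAN's actual rule set.

```python
from concurrent.futures import ThreadPoolExecutor


def seq(fn):
    """Sequential skeleton: apply fn to every item, one after another."""
    return lambda items: [fn(x) for x in items]


def farm(fn, workers=4):
    """Task-parallel skeleton: apply fn to independent items using a worker pool."""
    def run(items):
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map preserves the input order of the results.
            return list(pool.map(fn, items))
    return run


def pipe(*stages):
    """Composition skeleton: feed the output of each stage into the next.

    This sketch runs the stages one after another; a real pipeline skeleton
    would overlap them on a stream of items.
    """
    def run(items):
        for stage in stages:
            items = stage(items)
        return items
    return run


def fuse_farms(f, g, workers=4):
    """Hypothetical rewrite: pipe(farm(f), farm(g)) becomes a single farm of g after f."""
    return farm(lambda x: g(f(x)), workers=workers)


def inc(x):
    return x + 1


def square(x):
    return x * x


original = pipe(farm(inc, workers=2), farm(square, workers=2))
transformed = fuse_farms(inc, square, workers=2)

result_original = original([1, 2, 3])
result_transformed = transformed([1, 2, 3])
```

Both programs compute the same result, but they differ in how work is scheduled, which is why a transformation framework needs performance prediction alongside the rewrite rules to choose between equivalent forms.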
None of the current attempts to provide an Internet-wide global computing infrastructure presents well-defined programming constructs such as object distribution, dispatching, migration and concurrency with maximum po...
This paper describes the implementation of the Condensed Graph (CG) Computing Model using the PVM system. This model enables the programmer to write solutions to problems to run on a PVM System without the programmer ...
Parallel programming continues to be difficult, despite substantial and ongoing research aimed at making it tractable. Especially dismaying is the gulf between theory and practical programming. We propose a struct...