The Itanium processor, an implementation of an Explicitly Parallel Instruction Computing (EPIC) architecture, is an in-order processor that fetches, executes, and forwards results to functional units in order. The arc...
The molecular dynamics code CHARMM is a popular research tool for computational biology. An increasing number of researchers are currently looking for affordable and adequate platforms to execute CHARMM or similar cod...
Weighted reference counting is a very simple and efficient memory management system for multiprocessor architectures. This paper extends the weighted reference counting algorithm to work efficiently with cyclic data structures.
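As a rough illustration of the base technique only, the following minimal sketch shows weighted reference counting in its standard acyclic form. It is not the cyclic extension described in the abstract, which is not reproduced here. The names `Obj`, `Ref`, `new_object`, and the initial weight `INITIAL_WEIGHT` are hypothetical choices made for this example. The invariant is that the weights of all live references sum to the weight stored in the object, so copying a reference needs no message to the object and only dropping one does.

```python
INITIAL_WEIGHT = 1 << 16  # assumed starting weight; any power of two works


class Obj:
    """Heap object that stores the total weight of all references to it."""

    def __init__(self, payload):
        self.payload = payload
        self.weight = INITIAL_WEIGHT
        self.freed = False

    def decrement(self, amount):
        # Models the message sent to the object when a reference is dropped.
        self.weight -= amount
        if self.weight == 0:
            self.freed = True  # no references remain, storage can be reclaimed


class Ref:
    """Reference carrying its own share of the object's weight."""

    def __init__(self, target, weight):
        self.target = target
        self.weight = weight

    def copy(self):
        if self.weight == 1:
            # Weight cannot be split further. One common remedy, assumed here,
            # is to request extra weight from the object (this costs a message).
            self.target.weight += INITIAL_WEIGHT
            self.weight += INITIAL_WEIGHT
        half = self.weight // 2
        self.weight -= half  # split locally, no communication with the object
        return Ref(self.target, half)

    def drop(self):
        self.target.decrement(self.weight)
        self.weight = 0


def new_object(payload):
    obj = Obj(payload)
    return obj, Ref(obj, obj.weight)


# Usage: three references to one object, dropped one after another.
obj, r1 = new_object("data")
r2 = r1.copy()
r3 = r2.copy()
r1.drop()
r2.drop()
still_alive = not obj.freed
r3.drop()
```

In this sketch the object is reclaimed only after the last reference is dropped. A cycle of such objects would never reach weight zero, which is the limitation the paper sets out to address.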
The goal of this research is to develop performance profiles of parallel and distributed applications in order to predict their execution time under different network conditions. This paper measures the resource requi...
In this paper, we examine some of the challenges present in providing support for OpenMP applications on a Software Distributed Shared Memory (DSM)-based cluster system. We present detailed measurements of the performa...
Derivatives of almost arbitrary functions can be evaluated efficiently by automatic differentiation whenever the functions are given in the form of computer programs in a high-level programming language such as Fortran, C, or C++. In contrast to numerical differentiation, where derivatives are only approximated, automatic differentiation generates derivatives that are accurate up to machine precision. Sophisticated software tools implementing the technology of automatic differentiation are capable of automatically generating code for the product of the Jacobian matrix and a so-called seed matrix. It is shown how these tools can benefit from concepts of shared memory programming to parallelize, in a completely mechanical fashion, the gradient operations associated with each statement of the given code. The feasibility of our approach is demonstrated by numerical experiments performed with a code that was generated automatically by the Adifor system and augmented with OpenMP directives.
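To make the Jacobian-times-seed-matrix idea concrete, here is a minimal forward-mode sketch using operator overloading. It is not the source-transformation approach of Adifor and it uses no OpenMP. The names `Dual`, `_lift`, `jacobian_times_seed`, and the test function `f` are hypothetical. Each arithmetic statement updates a gradient row with one entry per seed column, and the loops over those entries (the list comprehensions below) are independent of one another. That is the kind of per-statement gradient operation the abstract proposes to parallelize.

```python
class Dual:
    """A value together with one derivative entry per column of the seed matrix."""

    def __init__(self, value, grad):
        self.value = value
        self.grad = list(grad)

    def __add__(self, other):
        other = _lift(other, len(self.grad))
        # Gradient operation for z = a + b: entries are independent of each other.
        return Dual(self.value + other.value,
                    [ga + gb for ga, gb in zip(self.grad, other.grad)])

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other, len(self.grad))
        # Gradient operation for z = a * b (product rule), again entry by entry.
        return Dual(self.value * other.value,
                    [self.value * gb + other.value * ga
                     for ga, gb in zip(self.grad, other.grad)])

    __rmul__ = __mul__


def _lift(x, p):
    """Treat a plain number as a constant with a zero gradient row of length p."""
    return x if isinstance(x, Dual) else Dual(x, [0.0] * p)


def jacobian_times_seed(f, x, seed):
    """Return J(x) * seed for f: R^n -> R^m, where seed is an n-by-p list of rows."""
    inputs = [Dual(xi, row) for xi, row in zip(x, seed)]
    return [out.grad for out in f(inputs)]


def f(v):
    # Example function chosen for illustration: f(x, y) = (x*y + x, y*y).
    x, y = v
    return [x * y + x, y * y]


# With the identity as seed matrix, the result is the full Jacobian at (3, 2).
identity = [[1.0, 0.0], [0.0, 1.0]]
jac = jacobian_times_seed(f, [3.0, 2.0], identity)
```

For this `f` the analytic Jacobian is [[y + 1, x], [0, 2y]], so the result can be checked by hand. Choosing a seed matrix with fewer columns than inputs yields only the corresponding directional derivatives at proportionally lower cost.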