ISBN (print): 0818678704
Data-parallel languages were proposed to address, at the language level, the difficulty of programming distributed-memory machines. Among them, High Performance Fortran (HPF) is a standard data-parallel language across a variety of high-performance architectures. Most HPF compilers are source-to-source translators because such translators are easy to implement. However, these compilers produce a significant amount of inefficient code. In particular, a FORALL construct is converted into several DO loops, which increases loop overhead. We therefore propose techniques for converting a FORALL construct into an optimized DO loop. To this end, we define and use the relation distance vector, which represents both data dependence information and flow information. We then evaluate and analyze the execution time of code converted by our method and by the PARADIGM method.
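To make the FORALL overhead concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the paper's relation distance vector method, and all function names are hypothetical). It assumes the HPF statement FORALL (i = 2:n) a(i) = a(i-1) + b(i), whose semantics require every right-hand side to be evaluated before any assignment takes place. A straightforward translation therefore needs a temporary array and two loops, whereas for this particular dependence a single loop run in reverse order gives the same result.

```python
def forall_two_loops(a, b):
    """Naive translation: evaluate all right-hand sides, then assign."""
    n = len(a)
    tmp = [0] * n
    for i in range(1, n):          # loop 1: compute every RHS from the old a
        tmp[i] = a[i - 1] + b[i]
    out = list(a)
    for i in range(1, n):          # loop 2: store the results
        out[i] = tmp[i]
    return out


def forall_single_reversed_loop(a, b):
    """Single loop: iterating downward reads a[i-1] before it is overwritten."""
    out = list(a)
    for i in range(len(out) - 1, 0, -1):
        out[i] = out[i - 1] + b[i]
    return out


def forward_loop_not_equivalent(a, b):
    """A plain forward DO loop reads already-updated values, so it is NOT
    equivalent to the FORALL; shown only for contrast."""
    out = list(a)
    for i in range(1, len(out)):
        out[i] = out[i - 1] + b[i]
    return out


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [10, 20, 30, 40]
    print(forall_two_loops(a, b))
    print(forall_single_reversed_loop(a, b))
    print(forward_loop_not_equivalent(a, b))
```

The reversed loop avoids both the temporary array and the second loop. Choosing a safe loop direction in general requires dependence information of the kind the abstract describes.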
The role and evolution of software engineering environments (SEE) and computer-aided software engineering (CASE) environments in large-scale software development and maintenance are discussed. The drawbacks of the current state of the art are emphasized. Improved structural and behavioral modeling is advocated, as is the introduction of team support to meet software engineering requirements. The different trends in SEE are addressed, with emphasis on the evolution towards more powerful data modeling and the integration of process models.
We compare automatically and manually parallelized NAS Benchmarks in order to identify code sections that differ. We discuss opportunities for advancing automatic parallelizers. We find ten patterns that pose challeng...
We describe the design of an extensible kernel called Paramecium. This kernel uses an object-based software architecture which, together with instance naming, late binding, and explicit overrides, enables easy reconfiguration. Determining which components reside in the kernel protection domain is up to the user. A certification authority or one of its delegates certifies which components are trustworthy and therefore permitted to run in the kernel protection domain. These delegates may include validation programs, correctness provers, and system administrators. The main advantage of certification is that it can handle trust and sharing in a non-cooperative environment.
The goals of the Computation Oriented Display Environment (CODE) are to provide representational power sufficient for facile expression of a wide class of parallel algorithms, while permitting compilation to reasonably efficient programs on a wide spectrum of parallel execution environments, and to provide a hierarchical approach to the development of parallel programs. CODE is based on a formally specified model of parallel computation which covers most conventional MIMD models of parallel computation. The model is formulated at a higher level of abstraction than conventional MIMD shared-name-space and partitioned-name-space models of parallel computation. The conceptual foundation of CODE, in particular basing the language on an abstract model of parallel computation, has led to two significant capabilities which had not been anticipated: a calculus of composition, which may be exploitable for automated or semiautomated program construction, and a natural basis for highly effective component reuse.
RELACS is a systolic programming language that simplifies the programmer's task by making the data flow of systolic algorithms explicit and by exposing the data delivery mechanism. The underlying architecture model differs from other SIMD architectures in that it physically separates computation and data management. The authors introduce the RELACS language as a syntactic and semantic extension of the C language. It is shown that the RELACS programming model provides a simple programming method for systolic algorithms, which is applicable to a variety of parallel machines.
The authors present Data Parallel Fortran (DPF), a set of extensions to Fortran aimed at programming scientific applications on a variety of parallel machines. DPF presents a global name space to programmers and allows programs to be written in a clear, data-parallel style. DPF's model is based on the idea of a single control thread that spawns parallel virtual threads with arbitrary nesting, which merge back into a single global state on completion. It also provides explicit control over which subset of the global name space is accessed by each virtual processor at different points in a program. This powerful mechanism makes it possible to write programs in which communication points are handled explicitly, but without the use of message-passing code. DPF also offers primitives for communication patterns often encountered in parallel numerical and scientific applications. DPF semantics does not depend on any particular feature of the architecture, thus providing a reasonably high-level programming methodology.
Management of the communications among a set of concurrent processes arises in many applications and is a central concern in parallel computing. The authors introduce Manifold, a language whose sole purpose is to describe and manage complex interconnections among independent, concurrent processes. In the underlying paradigm of this language, the primary concern is not what functionality the individual processes in a parallel system provide; the emphasis is on how these processes are interconnected and how their interaction patterns change during the execution life of the system. As an example of the application of Manifold, the authors describe a simple window system and show how the communications between clients running on different windows and a window server can be described in this language.
In this paper, we present a parallel algorithm for timing-driven standard cell layout that runs on a shared-memory multiprocessor workstation. The proposed algorithm is based on POPINS2.0 and consists of three phases. In phase 1, we obtain an initial placement with a hierarchical timing-driven mincut placement algorithm. At the top level of the partitioning hierarchy, several processors cooperate on one step of bi-partitioning; at the lower levels, the regions at each level are partitioned in parallel. In phase 2, the sub-circuits that contain critical paths are iteratively improved by nonlinear programming. Parallel processing is realized by applying the nonlinear programming method to each sub-circuit in parallel. Finally, in phase 3, the placement is transformed into a row-based layout style by a timing-driven row assignment method. We have implemented the proposed method on a 4-CPU multiprocessor workstation, and experimental results show that the method is promising.
The traditional DSP design and development environment is generally based on the use of a single processor and suffers from a number of limitations, including the following: a time-consuming design cycle from specification to final product development; dependence on hardware and hence lack of portability to different DSP platforms; and lack of exploitation of algorithmic and architectural parallelism for DSP design. The authors present a multiprocessor environment, called Taurus, for the design and development of DSP algorithms and applications. The important features of this environment are introduced, and typical experimental results using an ADPCM system are presented and discussed.