To improve the performance of applications on OpenMP/JIAJIA, we present a new abstraction, the Array Relation Vector (ARV), which describes the relation between the data elements of two consistent shared arrays accessed in one computation phase. Based on ARV, we use array grouping to eliminate the pseudo data distribution of small shared data and to improve page locality. Experimental results show that ARV-based array grouping can greatly improve the performance of applications with non-continuous data access and strict access affinity on an OpenMP/JIAJIA cluster. For applications with small shared arrays, array grouping noticeably improves performance when the number of processors is small.
The purpose of this research was to construct a computerized adaptive test. Adaptive testing is a new evaluation strategy for computer-assisted learning and e-learning. It provides more efficient test administration and intelligent learning evaluation, and it is expected to increase the accuracy of estimating a learner's true ability while administering fewer questions, each selected to suit the individual. Item response theory (IRT) is the main theoretical basis for making tests adaptive and feasible. Adaptive testing requires high-speed calculation to evaluate the complicated IRT functions, a task well suited to computers.
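The abstract does not say which IRT model the test uses. As an illustration only, the sketch below assumes the two-parameter logistic (2PL) model and picks the next question by maximum Fisher information at the current ability estimate. The item bank, the parameter names `a` (discrimination) and `b` (difficulty), and the function names are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct answer
    for ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, items, administered):
    """Return the index of the unused item that is most informative at theta."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))

# Hypothetical item bank: (discrimination a, difficulty b) pairs.
ITEM_BANK = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.5)]

if __name__ == "__main__":
    # For a learner currently estimated at theta = 0, pick the first question.
    print(select_next_item(0.0, ITEM_BANK, administered=set()))
```

In a full adaptive test, the ability estimate would be updated after each response and the selection repeated until a stopping rule is met; those steps are omitted here.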
This paper presents TOPAS, a new parallel programming environment for distributed systems. TOPAS automatically analyzes data dependences among tasks and synchronizes data, which reduces the time needed for parallel program development. TOPAS also provides support for scheduling, dynamic load balancing, and fault tolerance. Experiments demonstrate the simplicity and efficiency of parallel programming in the TOPAS environment with integrated fault tolerance, which provides graceful performance degradation and quick reconfiguration for application recovery.
The OpenMP application programming interface is an emerging standard for parallel programming on shared-memory multiprocessors. Recently, OpenMP has attracted widespread interest because of its easy-to-use, portable parallel programming model. In this paper, we give a brief introduction to the OpenMP API and its parallel programming model. We present our Omni OpenMP compiler and the performance of some applications on a shared-memory multiprocessor. Finally, we discuss the role of OpenMP for modern on-chip multiprocessors.
In 2002, Japan announced the Earth Simulator, a supercomputer based on low-volume vector processors and a custom network, and reported that computational scientists had used it to achieve 14.9 TFLOPS with the IMPACT-3D code, which is written in High Performance Fortran (HPF). Of particular interest was that they had achieved this level of performance using a high-level parallel programming model. There has been considerable concern in the U.S. about the appropriateness of its hardware and software investments in supercomputing technology. To help assess the U.S. strategy of building systems from commodity-off-the-shelf (COTS) components, we explored using a combination of HPF and scalar compiler technology to tailor IMPACT-3D to microprocessor-based supercomputers, and we evaluated its performance and scalability on the AlphaServer-based Lemieux cluster at the Pittsburgh Supercomputing Center (PSC). On the Earth Simulator, IMPACT-3D achieved 45% of peak performance on 4096 processors; on 1024 processors of PSC's Lemieux, we achieved 17.29% of peak performance.
Developing parallel software is far more complex than developing traditional sequential software. An effective approach to dealing with this complexity is domain-specific programming at a level of abstraction higher than that of general-purpose programming languages. In this paper, we focus on the domain of applications based on partial differential equations (PDEs) and provide a formal framework and methods for PDE compilers to generate parallel iterative codes for the domain. We also provide a PDE compiler optimization that minimizes the number of messages between parallel processors. Our framework and methods can be used to build PDE compilers that automatically generate efficient parallel software for PDE-based applications.
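The abstract does not show the code its compiler generates. As a rough illustration of the kind of iterative kernel involved, the sketch below runs Jacobi sweeps for the one-dimensional Laplace equation with fixed boundary values. It is sequential, the function name `jacobi_1d` is hypothetical, and it does not reproduce the paper's framework or its message-minimizing optimization.

```python
def jacobi_1d(u, iterations):
    """Run Jacobi sweeps for u'' = 0 on a 1D grid.

    u[0] and u[-1] are fixed boundary values; every interior point is
    replaced by the average of its two neighbours from the previous sweep.
    """
    u = list(u)
    for _ in range(iterations):
        new = u[:]  # boundary values are carried over unchanged
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u

if __name__ == "__main__":
    # Boundary values 0 and 1; the interior relaxes toward a straight line.
    print(jacobi_1d([0.0, 0.0, 0.0, 0.0, 1.0], iterations=200))
```

In a distributed version where each processor owns a contiguous block of the grid, only the values at block boundaries are needed from neighbouring processors in each sweep, which is where inter-processor messages arise.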
ISBN: (print) 0769523145
The objective of this paper is to present a cost-effective fault diagnosis methodology for flash memory. Flash memory is enjoying rapid market growth. Research on flash memory testing aims mainly to reduce test cost and improve production yield. In this paper, we propose a fault diagnosis flow for flash memory. We also propose a flexible built-in self-diagnosis (BISD) design with enhanced test mode control, which reduces the test time and the number of diagnostic data shift-out cycles by using parallel programming and erasure and by employing a parallel shift-out mechanism. The area overhead of our BISD circuit is only about 0.5% for a 256Mb commodity flash memory chip. Experimental results from industrial chips show that the proposed diagnosis methodology distinguishes fault types with high accuracy.
Reference counting is the most widely used memory management technique today. This paper presents a new multi-processor architecture for parallel cyclic reference counting. In this architecture, there is no direct mutator-collector communication, and synchronization is kept to a minimum.
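To illustrate why cyclic structures need special treatment under reference counting, the sketch below builds a two-object cycle and shows that dropping the external references alone does not free it. It relies on CPython behaviour (reference counting plus a separate cycle detector in the `gc` module) and is not the architecture proposed in the paper; the `Node` class and function name are hypothetical.

```python
import gc
import weakref

class Node:
    """Minimal object that can point at another Node."""
    def __init__(self):
        self.other = None

def cycle_survives_refcounting():
    """Return (alive after del, alive after cycle collection) for a 2-node cycle."""
    gc.disable()  # rule out an automatic cycle-collection pass
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a      # a and b now reference each other
        probe = weakref.ref(a)       # observes a without keeping it alive
        del a, b                     # counts stay above zero because of the cycle
        alive_after_del = probe() is not None
        gc.collect()                 # explicit cycle detection reclaims both nodes
        alive_after_collect = probe() is not None
    finally:
        gc.enable()
    return alive_after_del, alive_after_collect

if __name__ == "__main__":
    print(cycle_survives_refcounting())
```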
The paper discusses a problem in using semaphores to solve parallel programming problems. It proposes a new semaphore mechanism with extended features. An example application of such a semaphore to controlling a technological process is introduced and discussed. Simulation results and the advantages of applying the new mechanism are presented.
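The abstract does not describe the extended semaphore itself. For orientation, the sketch below shows only the classic counting semaphore that such mechanisms build on, using Python's standard `threading.Semaphore` to cap how many workers run a section at once. The function name `run_workers` and the bookkeeping dictionary are hypothetical.

```python
import threading

def run_workers(num_workers, max_concurrent):
    """Start num_workers threads, let at most max_concurrent into the guarded
    section at a time, and return the peak concurrency that was observed."""
    sem = threading.Semaphore(max_concurrent)
    lock = threading.Lock()          # protects the counters below
    state = {"active": 0, "peak": 0}

    def worker():
        with sem:                    # acquire a permit; released on exit
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            # The guarded work (e.g. driving one process unit) would go here.
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]

if __name__ == "__main__":
    print(run_workers(num_workers=8, max_concurrent=2))
```

The observed peak depends on thread scheduling, but it never exceeds `max_concurrent`.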
Following the knowledge provided by the theory of programming, we present an abstract syntax of membrane systems and their semantics. We define an appropriate notion of configurations, and sets of inference rules corresponding to the three stages of an evolution step in membrane systems. A notion of bisimulation is defined; bisimulation relations make it possible to compare the evolution behaviour of two membrane systems. On the practical side, the programming practice related to membrane systems is illustrated by presenting some sequential and parallel software simulators, emphasizing their specific features.