Orléans Skeleton Library (OSL) is a library of parallel algorithmic skeletons in C++ on top of MPI. It provides a structured approach towards parallel programming. Skeletons in OSL are based over the bulk synchro...
Sequential programs are often difficult to parallelize because of the complexity in their implementation and the uncertainty in their behavior. We will demonstrate behavior-oriented parallelization (BOP), which provid...
As the open standard for parallel programming of heterogeneous systems, OpenCL has been used in this study in the context of a particular and intensive computing task, namely the voxelization of tessellated objects. F...
Actor model-based design is actively researched for parallel embedded software design because the model exposes potential parallelism explicitly in an architecture-neutral form. In most actor-oriented models, actors are self-contained, data channels are the only object shared between actors, and the actors compose a system in a single flat layer. In practice, by contrast, it is common to use shared library functions and to build vertically layered software for efficiency and modularity. To close this gap between modeling and implementation, we propose a special actor, the library task, with two new types of ports: the library master port and the library slave port. A library task is a sharable and mappable object that defines a set of function interfaces. An N:1 master-slave connection allows several actors to share one library task, and the same connection can naturally express vertically layered software and client-server applications. To support the library task in our embedded software design environment, we develop an automatic mapping algorithm and an automatic code generator. The design environment with the library task is applied to two target platforms: the IBM CELL Broadband Engine and an ARM-based multicore simulator. Preliminary experiments show that the library task extends the expressive power of the previous actor model while still generating efficient code.
We present methods that can dramatically improve numerical consistency for parallel calculations across varying numbers of processors. By calculating global sums with enhanced precision techniques based on Kahan or Knuth summations, the consistency of the numerical results can be greatly improved with minimal memory and computational cost. This study assesses the value of the enhanced numerical consistency in the context of general finite difference or finite volume calculations. (C) 2011 Elsevier B.V. All rights reserved.
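As a rough illustration of the idea in this abstract, the following Python sketch compares a naive running sum with Kahan compensated summation, and simulates a "global sum" by splitting the data into a varying number of chunks. It is not the authors' implementation: the function names (`kahan_sum`, `naive_sum`, `chunked_sum`) and the chunking scheme are assumptions made for this example, and no real message passing is involved.

```python
def naive_sum(values):
    """Plain left-to-right floating-point sum."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan compensated summation: carry the rounding error forward."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation           # apply the correction to the next term
        t = total + y                  # low-order bits of y may be lost here
        compensation = (t - total) - y # recover what was lost (algebraically 0)
        total = t
    return total


def chunked_sum(values, parts, summer):
    """Hypothetical stand-in for a parallel global sum.

    Each of `parts` simulated processors sums its own slice with `summer`,
    and the partial results are then combined with the same `summer`.
    """
    n = len(values)
    bounds = [n * i // parts for i in range(parts + 1)]
    partials = [summer(values[bounds[i]:bounds[i + 1]]) for i in range(parts)]
    return summer(partials)


if __name__ == "__main__":
    data = [1.0] + [1e-16] * 10
    for parts in (1, 2, 4):
        print(parts,
              repr(chunked_sum(data, parts, naive_sum)),
              repr(chunked_sum(data, parts, kahan_sum)))
```

In this toy data set each small term is below half a unit in the last place of 1.0, so a naive sum that starts from 1.0 discards every one of them, while the compensated sum retains their combined contribution. The extra cost is one additional floating-point variable and a few extra operations per term.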
The year 2017 marks the 15th anniversary of the SCIT supercomputer project, a fitting point at which to summarize its results and draw conclusions. In this paper, we discuss the evolution of the SCIT architecture and present statistics from the supercomputing center for the years 2002-2017. These data will be useful to computer cluster developers and to researchers who design resource management algorithms for computing clusters.
Summarising distributed data is a central routine for parallel programming, lying at the core of widely used frameworks such as the map/reduce paradigm. In the IoT context it is even more crucial, being a privileged m...
Provides an abstract of the tutorial presentation and may include a brief professional biography of the presenter. The complete presentation was not made available for publication as part of the conference proceedings.
Provides an abstract of the invited presentation and may include a brief professional biography of the presenter. The complete presentation was not made available for publication as part of the conference proceedings.
ISBN: (print) 9781538655566; 9781538655559
Task parallelism is designed to simplify the task of parallel programming. When executing a task parallel program on modern NUMA architectures, it can fail to scale due to the phenomenon called work inflation, where t...