A programming platform that can harness the processing power of a network of idle workstations for distributed processing is presented. It allows users at a workstation to develop and execute distributed applications. Jo...
ISBN: (print) 0769515584
Although high-level real-time distributed computing objects are generally written in forms independent of execution platforms, input and output (I/O) activities involving peripherals are inherently platform-dependent. Yet, writing parts of real-time objects for controlling peripherals should be done in forms compatible with the adopted real-time object programming styles. Basic issues are discussed in the context of an object-oriented real-time programming scheme called the time-triggered message-triggered object (TMO) programming scheme. A desirable goal here is to facilitate both commanding and reactive control of peripherals in TMOs in general forms while enabling relatively easy analysis of the timing behavior of such TMOs. This paper presents several techniques to meet these requirements.
Advanced Encryption Standard (AES), as one of the most popular encryption algorithms, has been widely studied on single GPU and CPU. However, the research on multi-GPU platforms is not deep enough, and with the rapid ...
Recently, many parallel computing models using dynamically reconfigurable electrical buses have been proposed in the literature. The underlying characteristics are similar among these models, but they do have certain ...
We present a parallel algorithm for performing multipoint linkage analysis of genetic marker data on large family pedigrees. The algorithm effectively distributes both the computation and memory requirements of the an...
Researchers must often write their own simulation and analysis software. During this process they simultaneously confront both computational and scientific problems. Current strategies for aiding the generation of per...
In this paper, we show the potential benefits of translating OpenMP code to low-level parallel code using a data flow execution model, instead of targeting it directly to a multi-threaded program. Our goal is to impro...
In this paper, we discuss our experience of providing high performance parallel I/O for a large-scale, on-going, multi-disciplinary simulation project for solid propellant rockets. We describe the performance and data...
ISBN: (print) 9783540747413
Whether scientific applications are suitable for the Imagine architecture is a challenging question. To address it, this paper presents a novel architecture-based optimization for the key techniques of mapping scientific applications to Imagine. Our specific contributions are achieving fine kernel granularity and choosing the necessary arrays to organize appropriate streams. In particular, we develop a new stream program generation algorithm based on the architecture-based optimization. We apply our algorithm to some representative scientific applications in ISIM simulation of Imagine and compare the results with the corresponding FORTRAN programs running on Itanium 2. The experimental results show that the optimized stream programs can efficiently improve computational intensity, enhance locality of the LRF and SRF, avoid index stream overhead, and enable parallelism to utilize the ALUs. The results indicate that Imagine is efficient for many scientific applications.
ISBN: (print) 9781728133201
High Efficiency Video Coding (HEVC) creates the conditions for cost-effective video transmission and storage, but its inherent computational complexity calls for efficient parallelization techniques. This paper provides HEVC encoders with a holistic parallelization scheme that exploits parallelism at the data, thread, and process levels at the same time. The proposed scheme is implemented in the practical Kvazaar open-source HEVC encoder. It makes Kvazaar exploit parallelism at three levels: 1) Single Instruction Multiple Data (SIMD) optimized coding tools at the data level; 2) Wavefront Parallel Processing (WPP) and Overlapped Wavefront (OWF) parallelization strategies at the thread level; and 3) distributed slice encoding on multi-computer systems at the process level. Our results show that the proposed process-level parallelization scheme increases the coding speed of Kvazaar by 1.86x on two computers and up to 3.92x on five computers, with +0.19% and +0.81% coding losses, respectively. Exploiting all three parallelism levels on a five-computer setup gives almost a 25x speedup over a non-parallelized single-core implementation.
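As a rough reading aid, the process-level speedups quoted above can be converted into parallel efficiency (speedup divided by the number of computers). The sketch below is not from the paper; the helper name `parallel_efficiency` and the `reported` dictionary are illustrative assumptions, and only the two speedup figures come from the abstract.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Return speedup / workers; 1.0 would mean ideal linear scaling."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Process-level speedups quoted in the abstract: {computers: speedup}.
# The dictionary layout is an assumption made for this illustration.
reported = {2: 1.86, 5: 3.92}

efficiencies = {n: parallel_efficiency(s, n) for n, s in reported.items()}

for n, eff in efficiencies.items():
    print(f"{n} computers: speedup {reported[n]:.2f}x, efficiency {eff:.1%}")
```

Under this definition the two-computer setup runs at about 93% efficiency and the five-computer setup at about 78%, which suggests that distribution overhead grows with the number of machines.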