We describe a set of software utilities designed to facilitate the writing of parallel codes and the porting of sequential ones. Emphasis is placed on portability so that code can be developed simultaneously on a sequential ...
Nowadays, processor performance is increased by adding more cores or wider vector units, or by combining accelerators such as GPUs with traditional cores on a chip. Programming for these diverse architectures is a cha...
Parallel programming is an excellent way to speed up computation because processes execute simultaneously, with the work divided among the available threads. OpenMP, available for C, C++, and Fortra...
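The OpenMP model the abstract refers to can be illustrated with a minimal work-sharing sketch: a single pragma divides the iterations of a loop among the available threads. The vector size and the saxpy-style operation below are my own illustrative choices, not taken from the paper.

/* saxpy_omp.c - minimal OpenMP work-sharing sketch
 * (compile with an OpenMP-capable compiler, e.g. cc -fopenmp saxpy_omp.c).
 * Illustrative only; vector size and operation are arbitrary choices. */
#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static float x[N], y[N];
    const float a = 2.0f;

    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    /* One pragma divides the loop iterations among the available threads. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];

    printf("y[0] = %f, threads available = %d\n", y[0], omp_get_max_threads());
    return 0;
}

The programmer writes no explicit thread management; the OpenMP runtime chooses the thread count and assigns iteration ranges to threads.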
The classical solution of electromagnetic problems using the finite element (FE) method requires assembling, storing, and solving an Ax = b matrix system. A new technique for solving FE cases, considered much simpler than tr...
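For contrast with the new technique, here is a hedged sketch of the classical assemble/store/solve pipeline mentioned above, applied to a 1-D model problem (-u'' = 1 on (0,1), u(0) = u(1) = 0, linear elements). The model problem, mesh size, and tridiagonal solver are my own illustrative choices, not taken from the paper.

/* fe1d.c - classical assemble/store/solve workflow on a 1-D model problem.
 * A sketch of the standard pipeline, not the paper's new technique. */
#include <stdio.h>

#define N 9                 /* interior nodes; mesh width h = 1/(N+1) */

int main(void) {
    double h = 1.0 / (N + 1);
    double diag[N], off[N], rhs[N], u[N];

    /* Assembly: tridiagonal stiffness matrix and load vector. */
    for (int i = 0; i < N; i++) {
        diag[i] = 2.0 / h;      /* A[i][i]              */
        off[i]  = -1.0 / h;     /* A[i][i+1] = A[i+1][i] */
        rhs[i]  = 1.0 * h;      /* integral of f * hat function, f = 1 */
    }

    /* Solve Ax = b with the Thomas algorithm (forward sweep, back substitution). */
    for (int i = 1; i < N; i++) {
        double m = off[i - 1] / diag[i - 1];
        diag[i] -= m * off[i - 1];
        rhs[i]  -= m * rhs[i - 1];
    }
    u[N - 1] = rhs[N - 1] / diag[N - 1];
    for (int i = N - 2; i >= 0; i--)
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i];

    printf("u(0.5) ~= %f (exact value 0.125)\n", u[N / 2]);  /* node at x = 0.5 */
    return 0;
}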
In the sciences, it is common to use the so-called "big operator" notation to express the iteration of a binary operator (the reducer) over a collection of values. Such a notation typically assumes that the ...
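The "big operator" notation can be read as a fold of an associative reducer with an identity element over a collection; the sketch below fixes that reading in code. The function names and the sample reducers (sum, max) are my own illustrative choices, not the paper's formalism.

/* reduce.c - a "big operator" as code: fold a binary reducer over a
 * collection, assuming the reducer is associative with an identity element.
 * The reducers shown (sum, max) are illustrative. */
#include <stdio.h>
#include <math.h>

typedef double (*reducer_fn)(double, double);

static double add(double a, double b) { return a + b; }
static double max(double a, double b) { return a > b ? a : b; }

/* Sequential specification of the big operator applied to xs[0..n-1]. */
static double big_op(reducer_fn op, double identity, const double *xs, int n) {
    double acc = identity;
    for (int i = 0; i < n; i++)
        acc = op(acc, xs[i]);
    return acc;
}

int main(void) {
    double xs[] = { 3.0, 1.0, 4.0, 1.0, 5.0, 9.0 };
    int n = sizeof xs / sizeof xs[0];

    printf("sum = %f\n", big_op(add, 0.0, xs, n));        /* 23.0 */
    printf("max = %f\n", big_op(max, -HUGE_VAL, xs, n));  /* 9.0  */
    return 0;
}

Because the reducer is assumed associative, the same specification can be evaluated in parallel by reducing disjoint sub-collections and then combining the partial results.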
Complex developmental systems are constructed whose different parts develop simultaneously through the parallel programming of their respective developmental processes.
Array algorithms where operations are applied to disjoint parts of an array lend themselves well to parallelism, since parallel threads can operate on the parts of the array without synchronisation. However, implement...
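A minimal sketch of the pattern described above, assuming OpenMP: each thread computes a disjoint slice of the array and writes only to that slice, so no synchronisation is needed beyond the implicit barrier at the end of the parallel region. The array size and the per-element operation are illustrative.

/* chunks.c - disjoint array partitions processed in parallel with OpenMP
 * (compile with: cc -fopenmp chunks.c). Each thread writes only its own
 * slice, so no locks or atomics are required. Sizes are illustrative. */
#include <stdio.h>
#include <omp.h>

#define N 1000000

static double data[N];

int main(void) {
    #pragma omp parallel
    {
        int t  = omp_get_thread_num();
        int nt = omp_get_num_threads();

        /* This thread's disjoint [lo, hi) slice of the array. */
        int lo = (int)((long long)N * t / nt);
        int hi = (int)((long long)N * (t + 1) / nt);

        for (int i = lo; i < hi; i++)
            data[i] = (double)i * (double)i;   /* touches only this slice */
    }

    printf("data[N-1] = %f\n", data[N - 1]);
    return 0;
}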
We show that program synthesis can generate GPU algorithms as well as their optimized implementations. Using the scan kernel as a case study, we describe our evolving synthesis techniques. Relying on our synthesizer, ...
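As background for the scan case study, a sequential reference specification of inclusive scan (prefix sum) is sketched below; it fixes only the input/output behaviour that a synthesized GPU kernel would have to match. The array contents are my own example; real GPU implementations use work-efficient tree-based algorithms.

/* scan_ref.c - sequential reference specification of inclusive scan
 * (prefix sum). This sketch only defines the expected output, not an
 * optimized GPU implementation. Array contents are illustrative. */
#include <stdio.h>

#define N 8

/* out[i] = in[0] + in[1] + ... + in[i] */
static void inclusive_scan(const int *in, int *out, int n) {
    int acc = 0;
    for (int i = 0; i < n; i++) {
        acc += in[i];
        out[i] = acc;
    }
}

int main(void) {
    int in[N] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    int out[N];

    inclusive_scan(in, out, N);
    for (int i = 0; i < N; i++)
        printf("%d ", out[i]);      /* 1 3 6 10 15 21 28 36 */
    printf("\n");
    return 0;
}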
The article describes the process of computing the Z-transform neural network on the basis of the input and output signals of the analyzed object. Parallel algorithms for performing these calculations are presented, and different parallel architectures with different numbers of processors are analyzed, showing their advantages and limitations. (C) 2019 Elsevier B.V. All rights reserved.
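For context, the computation rests on the standard one-sided Z-transform of a discrete signal x[n] (a textbook definition, not the paper's specific neural-network formulation):

X(z) = \mathcal{Z}\{ x[n] \} = \sum_{n=0}^{\infty} x[n] \, z^{-n}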
Context: Writing software for the current generation of parallel systems requires significant programmer effort, and the community is seeking alternatives that reduce effort while still achieving good performance. Objective: Measure the effect of parallel programming models (message-passing vs. PRAM-like) on programmer effort. Design, setting, and subjects: One group of subjects implemented sparse-matrix dense-vector multiplication using message-passing (MPI), and a second group solved the same problem using a PRAM-like model (XMTC). The subjects were students in two graduate-level classes: one class was taught MPI and the other was taught XMTC. Main outcome measures: Development time, program correctness. Results: Mean XMTC development time was 4.8 h less than mean MPI development time (95% confidence interval, 2.0-7.7), a 46% reduction. XMTC programs were more likely to be correct, but the difference in correctness rates was not statistically significant (p = .16). Conclusions: XMTC solutions for this particular problem required less effort than MPI equivalents, but further studies are necessary which examine different types of problems and different levels of programmer experience. (C) 2008 Elsevier Inc. All rights reserved.
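The study's programming task, sparse-matrix dense-vector multiplication, can be stated as a plain sequential kernel over a compressed sparse row (CSR) matrix; the MPI and XMTC versions parallelize the outer loop over rows, which are mutually independent. The small example matrix and the CSR layout below are my own illustrative choices, not the study's code.

/* spmv_csr.c - sparse-matrix dense-vector multiplication (the task used in
 * the study above), written here as a plain sequential CSR kernel. */
#include <stdio.h>

/* y = A * x, with A stored in compressed sparse row (CSR) form. */
static void spmv_csr(int nrows, const int *rowptr, const int *colidx,
                     const double *vals, const double *x, double *y) {
    for (int i = 0; i < nrows; i++) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            sum += vals[k] * x[colidx[k]];
        y[i] = sum;
    }
}

int main(void) {
    /* 3x3 matrix [[4,0,1],[0,3,0],[2,0,5]] in CSR form. */
    int    rowptr[] = { 0, 2, 3, 5 };
    int    colidx[] = { 0, 2, 1, 0, 2 };
    double vals[]   = { 4.0, 1.0, 3.0, 2.0, 5.0 };
    double x[]      = { 1.0, 1.0, 1.0 };
    double y[3];

    spmv_csr(3, rowptr, colidx, vals, x, y);
    printf("y = [%f, %f, %f]\n", y[0], y[1], y[2]);   /* [5, 3, 7] */
    return 0;
}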