Author:
Russkova, Tatiana
RAS SB VE Zuev Inst Atmospher Opt, 1 Academician Zuev Sq, Tomsk 634055, Russia
ISBN (digital): 9781510622920
ISBN (print): 9781510622920
Parallel Monte Carlo algorithms developed for numerical simulation of polarized radiative transfer in the Earth's atmosphere are discussed. The results of testing them in an aerosol atmosphere and a cloud layer are presented, together with calculations of the Stokes vector in an environment with spatially inhomogeneous clouds. Problems of improving the efficiency of Monte Carlo simulation by moving from sequential CPU computations to parallel GPU computations are discussed. The speedup of the radiation codes achieved by parallelizing the computational algorithms on a graphics processor is given. It is shown that moving the computation from conventional PCs to the architecture of graphics processors gives a remarkable increase in performance and fully reveals the capabilities of the technology used.
ISBN (print): 9781450363877
The Architectural Patterns for Parallel Programming are a collection of patterns related to a method for developing the coordination structure of parallel software systems. These architectural patterns take as input (a) the available parallel hardware platform, (b) the parallel programming language of this platform, and (c) the analysis of the problem to solve, in terms of an algorithm and data. This paper presents the application of the architectural patterns within the Coordination stage, as part of the Pattern-based Parallel Software Design Method, with the aim of developing a coordination structure for solving the Laplace Equation. The Coordination stage here takes the information from the Problem Analysis presented in Section 2, selects an architectural pattern for the coordination in Section 3, and provides some elements about its implementation in Section 4.
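As a rough illustration of the kind of problem this abstract refers to, the sketch below runs Jacobi sweeps for the discrete Laplace Equation on a small 2D grid. It is not the paper's method or its chosen architectural pattern, which the abstract does not name; the function name `jacobi_laplace` and the list-of-lists grid layout are assumptions made for this example. It is sequential, but each interior cell reads only values from the previous sweep, which is the property that would let a coordination structure split the rows among workers.

```python
def jacobi_laplace(grid, iterations):
    """Run Jacobi sweeps on a 2D grid; boundary cells are held fixed.

    `grid` is a list of equal-length lists of floats (an assumed layout).
    """
    rows, cols = len(grid), len(grid[0])
    current = [row[:] for row in grid]
    for _ in range(iterations):
        # Write into a fresh copy so every update reads only the previous sweep.
        nxt = [row[:] for row in current]
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                # Each interior cell becomes the average of its four neighbours.
                nxt[i][j] = 0.25 * (
                    current[i - 1][j] + current[i + 1][j]
                    + current[i][j - 1] + current[i][j + 1]
                )
        current = nxt
    return current
```

Because a sweep never reads values it has just written, the rows of `nxt` could be computed by separate workers that exchange only their edge rows between sweeps.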
ISBN (digital): 9781728117461
ISBN (print): 9781728117478
This Research to Practice Full Paper proposes a teaching approach that introduces parallel programming early in the undergraduate Computer Science curriculum. Experiments were conducted with freshmen in the second course on algorithms and data structures. The strategy for evaluating the early teaching of parallel programming uses the OpenMP Application Programming Interface and sorting algorithms. The results indicate that students improved their skills by participating in parallel programming activities introduced at early stages, or even at the very beginning, of the undergraduate program. After the theoretical and practical activities, freshmen correctly answered about 92%, 63% and 44% of the easy, medium and hard questions. This represents an improvement of about 19%, 14% and 39% for each respective difficulty level in comparison to the beginning of the study, when none of the freshmen had any knowledge of parallel programming. These results help to demystify parallel programming and show that freshmen can learn it.
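To make the idea of a parallel sorting exercise concrete, here is a minimal sketch of a chunked parallel sort. The course described above uses OpenMP; this example instead uses Python's standard `concurrent.futures` thread pool, so it shows only the split, sort and merge structure and is not the authors' teaching material. The function name `parallel_sort` and the `workers` parameter are assumptions for illustration.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(values, workers=4):
    """Sort `values` by sorting chunks concurrently, then merging them."""
    if not values:
        return []
    # Ceiling division so that at most `workers` chunks are produced.
    chunk = -(-len(values) // workers)
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is sorted independently of the others.
        sorted_parts = list(pool.map(sorted, parts))
    # A k-way merge combines the sorted chunks into one sorted list.
    return list(heapq.merge(*sorted_parts))
```

In CPython, threads running pure Python code generally do not execute in parallel, so this version demonstrates the decomposition rather than a speedup.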
Correctly synchronizing multithreaded programs is challenging, and errors can lead to program failures (e.g., atomicity violations). Existing memory consistency models rule out some possible failures, but are limited by depending on subtle programmer-defined locking code and by providing unintuitive semantics for incorrectly synchronized code. Stronger memory consistency models assist programmers by providing them with easier-to-understand semantics with regard to memory access interleavings in parallel code. This dissertation proposes a new strong memory consistency model based on ordering-free regions (OFRs), which are spans of dynamic instructions between consecutive ordering constructs (e.g. barriers). Atomicity over ordering-free regions provides stronger atomicity than existing strong memory consistency models with competitive performance. Ordering-free regions also simplify programmer reasoning by limiting the potential for atomicity violations to fewer points in the program’s execution. This dissertation explores both software-only and hardware-supported systems that provide OFR serializability.
ISBN (print): 9781538649756
The ability to teach parallel programming principles and techniques is becoming fundamental to preparing a new generation of programmers able to master the pervasive parallelism made available by hardware vendors. Classical parallel programming courses leverage either low-level programming frameworks (e.g. those based on Pthreads) or higher-level frameworks such as OpenMP or MPI. We discuss our teaching experience within the Master's program in "Computer Science and Networking", where parallel programming is taught using structured parallel programming principles and frameworks. The paper summarizes the results achieved in eight years of experience and shows how adopting a structured parallel programming approach improves the efficiency of the teaching process.
Benchmarking is a way to study the performance of new architectures and parallel programming frameworks. Well-established benchmark suites such as the NAS Parallel Benchmarks (NPB) comprise legacy codes that still lac...
Today, powerful parallel computer architectures empower numerous application areas in personal computing and consumer electronics, and parallel computation is an established mainstay in personal mobile devices (PMD). During the last ten years, PMDs have been equipped with increasingly powerful parallel computation architectures (CPU+GPU), enabling rich gaming, photography and multimedia experiences and general-purpose parallel computation through application programming interfaces such as OpenGL ES and Apple Metal. Using a narrative literature review, this study examined the current status of parallel computing and parallel programming, and specifically their application to digital image processing in the domain of Mobile Systems (MS) and Personal Mobile Devices (PMD). The topic is an emerging one, and the amount of research available on it is still limited. As acknowledged in the literature and in practice, the OpenGL ES programming model for computing tasks can be a challenging environment for many programmers. With OpenGL ES, the paradigm shift from serial to parallel programming, together with the changes and challenges in the programming language and the tools supporting development, can be a barrier for many programmers. In this thesis, a Design Science Research (DSR) approach was applied to build an artefact: an image and video processing application on the Apple iOS software platform using the OpenGL ES parallel programming model. An Open Source Software (OSS) parallel computing library, GPUImage, was used to implement the artefact's filtering and effects functionality. Using the library, the process of applying the parallel programming model was efficient and productive. The library's structures and functionality effectively hid the complexity of OpenGL ES setup and management programming and provided efficient filter structures for implementing image and video filters and effects. The
Parallel programming techniques have been prominently explored in various engineering applications because they provide time-efficient solutions to complex problems without affecting accuracy. Parallel programming ...
ISBN (print): 9781538678800
OCaml is a multi-paradigm (functional, imperative, object-oriented) high-level sequential language. Types are statically inferred by the compiler and the type system is expressive and strong. These features make OCaml a very productive language for developing efficient and safe programs. In this tutorial we present three frameworks for using OCaml to program scalable parallel architectures: BSML, Multi-ML and Spoc.
ISBN (print): 9781450369800
This paper explores parallel nondeterministic programming as an extension to the C programming language; it provides constructs for specifying code containing ambiguous choice as introduced by McCarthy. A translator to plain C code was implemented as an extension to the ableC language specification. Translation involves a transformation to continuation-passing style, providing lazy choice by storing continuation closures in a separate task buffer. This exploration considers various search evaluation approaches and their impact on correctness and performance. Multiple search drivers were implemented, including single-threaded depth-first search, a combined breadth- and depth-first approach, as well as two approaches to parallelism. Several benchmark applications were created using the extension, including n-Queens, SAT, and triangle peg solitaire. The simplest parallel search driver, using independent threads, showed the best performance in most cases, providing a significant speedup over the sequential versions. Adding task sharing between threads showed similar or slightly improved performance.
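The task-buffer idea in this abstract can be sketched with the n-Queens benchmark it mentions. The example below is a single-threaded depth-first driver written in Python, not the paper's ableC extension: pending choices are stored as plain tuples of column positions rather than as continuation closures, and the name `solve_n_queens` is an assumption for illustration.

```python
def solve_n_queens(n):
    """Count n-Queens solutions with an explicit task buffer (depth-first)."""
    # Each task is a partial placement: one column index per row already filled.
    tasks = [()]
    solutions = 0
    while tasks:
        placed = tasks.pop()  # popping from the end gives depth-first order
        row = len(placed)
        if row == n:
            solutions += 1
            continue
        # The "choice" point: every safe column becomes a new pending task.
        for col in range(n):
            if all(col != c and abs(col - c) != row - r
                   for r, c in enumerate(placed)):
                tasks.append(placed + (col,))
    return solutions
```

Popping from the front of the buffer instead would give breadth-first order, and several workers could each take tasks from the same buffer, which loosely mirrors the alternative search drivers the abstract lists.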