New computational techniques for simulating large arrays of wind turbines are needed to model modern electrical grid networks. This paper proposes an implementation of a solver for a doubly fed induction generator wind turbine model. The solver runs on an NVIDIA graphics processing unit and is written using the Compute Unified Device Architecture (CUDA). The implementation integrates a linear time-invariant system represented by state-space matrices. A CUDA kernel has been implemented that simulates many wind turbines in parallel, each with a different wind profile and configuration. Strategies such as optimizing memory access and overlapping data transfers with kernel execution were used to obtain the results. The CUDA implementation reaches an occupancy of 95% while simulating 500 wind turbines, where each unit is subject to a different wind profile or uses different configuration parameters.
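To make the structure of such a solver concrete, the following is a minimal CPU sketch in Python of integrating many independent linear time-invariant systems, each with its own parameters and input profile. It is not the paper's CUDA kernel: the forward-Euler scheme, the step size, the first-order matrices, and all function names are assumptions chosen only for illustration.

```python
# Sketch only: forward-Euler integration of x' = A x + B u for many
# independent units. Matrices, step size, and profiles are made-up
# illustrative values, not the paper's turbine model.

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def step(a, b, x, u, dt):
    """One forward-Euler step: x_next = x + dt * (A x + B u)."""
    ax = mat_vec(a, x)
    bu = mat_vec(b, u)
    return [xi + dt * (p + q) for xi, p, q in zip(x, ax, bu)]

def simulate_unit(a, b, x0, inputs, dt):
    """Integrate one unit over its own input profile; return the final state."""
    x = list(x0)
    for u in inputs:
        x = step(a, b, x, u, dt)
    return x

def simulate_all(units, dt):
    """Units do not interact, so each could be mapped to its own GPU
    thread; here the loop over units is simply sequential."""
    return [simulate_unit(u["A"], u["B"], u["x0"], u["inputs"], dt) for u in units]

# Two hypothetical first-order units with different parameters and profiles.
units = [
    {"A": [[-1.0]], "B": [[1.0]], "x0": [0.0], "inputs": [[1.0]] * 1000},
    {"A": [[-2.0]], "B": [[2.0]], "x0": [0.0], "inputs": [[0.5]] * 1000},
]
final_states = simulate_all(units, dt=0.01)
```

Because no unit reads another unit's state, the outer loop is the natural place to parallelize, which is consistent with the one-kernel, many-turbines design the abstract describes.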
The application of parallel programming methods to simulating the impact of polymer dispersed systems on oil reservoirs is discussed, using a hybrid computer system that combines central processor cores with a graphics processing unit. The efficiency of the proposed approach for solving practical problems of simulating waterflooding of oil reservoirs with polymer dispersed systems on hybrid-architecture computers is demonstrated.
One of the core mechanisms involved in the control of saccade responses to selected target stimuli is disengagement from the current fixation location, so that the next saccade can be executed. To carry out everyday visual tasks, we make multiple eye movements that can be programmed in parallel. However, the role of disengagement in the parallel programming of saccades has not been examined. It is well established that the need for disengagement slows saccadic response time. This may be important in allowing the system to program accurate eye movements and may play a role in the control of multiple eye movements, but as yet this remains untested. Here, we report two experiments that examine whether fixation disengagement reduces saccade latencies when task completion demands multiple saccade responses. A saccade-contingent paradigm was employed: participants were asked to execute saccadic eye movements to a series of seven targets while we manipulated when these targets were shown. This both promotes fixation disengagement and controls the extent to which parallel programming can occur. We found that trial duration decreased as more targets were made available prior to fixation; this resulted both from a reduction in the number of saccades executed and from a reduction in their latencies. This supports the view that even when fixation disengagement is not required, parallel programming of multiple sequential saccadic eye movements is still present. By comparison with previously published data, we demonstrate a substantial speeding of response times in these conditions (a "gap effect") and that parallel programming is attenuated in these conditions.
This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals, ease of programming, debugging, deployment, portability, level of parallelism, constructs enabling parallelism and synchronization, features introduced in recent versions indicating trends, support for hybridity in parallel execution, and disadvantages. Such detailed analysis has led us to the identification of trends in high-performance computing and of the challenges to be addressed in the near future. It can help to shape future versions of programming standards, select technologies best matching programmers' needs, and avoid potential difficulties while using high-performance computing systems.
ISBN: 9781728152585 (print)
This paper proposes a CUDA-based technology for analyzing large biomedical data sets. The technology was used to analyze a large set of fundus images for the automatic diagnosis of diabetic retinopathy. A high-performance algorithm has been developed to calculate effective textural characteristics for medical image analysis. During automatic image diagnostics, the following classes were distinguished: thin vessels, thick vessels, exudates, and healthy areas. The efficiency of the algorithm was studied on images of 500×500 to 1000×1000 pixels using a 12×12 window. The relationship between the algorithm's acceleration and the data size was demonstrated. The study showed that the algorithm's effectiveness can depend on certain characteristics of the image, such as its clarity, the shape of the exudate zone, the variability of the blood vessels, and the location of the optic disc.
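The windowed analysis described above can be sketched in a few lines of Python. The abstract does not say which textural characteristics are computed, so per-window mean and variance are used here purely as stand-ins; the non-overlapping tiling, the function name, and the test image are likewise assumptions.

```python
# Sketch only: per-window mean and variance as stand-in texture features.
# The abstract does not say which characteristics the authors compute.

def window_features(image, size):
    """Return (row, col, mean, variance) for each non-overlapping
    size x size window that fits entirely inside the image."""
    height, width = len(image), len(image[0])
    features = []
    for r in range(0, height - size + 1, size):
        for c in range(0, width - size + 1, size):
            pixels = [image[r + i][c + j] for i in range(size) for j in range(size)]
            mean = sum(pixels) / len(pixels)
            var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
            features.append((r, c, mean, var))
    return features

# Hypothetical 24 x 24 test image: left half flat, right half a checkerboard.
image = [[10 if c < 12 else (0 if (r + c) % 2 == 0 else 20) for c in range(24)]
         for r in range(24)]
features = window_features(image, size=12)
```

Each window is processed independently of the others, which is the property that makes this kind of computation a good fit for a GPU. In the test image both halves have the same mean, and only the variance separates the flat region from the textured one.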
ISBN: 9781450366700 (print)
High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards orchestration rather than implementation, coupled with the growing need for parallel computing (e.g., due to big data and the end of Moore's law), necessitates rethinking how parallelism is expressed in programs. Here, we present Parsl, a parallel scripting library that augments Python with simple, scalable, and flexible constructs for encoding parallelism. These constructs allow Parsl to construct a dynamic dependency graph of components that it can then execute efficiently on one or many processors. Parsl is designed for scalability, with an extensible set of executors tailored to different use cases, such as low-latency, high-throughput, or extreme-scale execution. We show, via experiments on the Blue Waters supercomputer, that Parsl executors can allow Python scripts to execute components with as little as 5 ms of overhead, scale to more than 250 000 workers across more than 8000 nodes, and process upward of 1200 tasks per second. Other Parsl features simplify the construction and execution of composite programs by supporting elastic provisioning and scaling of infrastructure, fault-tolerant execution, and integrated wide-area data management. We show that these capabilities satisfy the needs of many-task, interactive, online, and machine learning applications in fields such as biology, cosmology, and materials science.
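The idea of building a dependency graph implicitly from function calls can be illustrated with the Python standard library alone. The sketch below is not Parsl's API: the `app` decorator and the `_resolve` helper are hypothetical names, and a thread pool stands in for Parsl's executors.

```python
# Sketch only: a stdlib imitation of futures-based dataflow.
# This is NOT Parsl's API; `app` and `_resolve` are hypothetical helpers.
from concurrent.futures import Future, ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)

def _resolve(value):
    """Wait for a future's result; pass plain values through."""
    return value.result() if isinstance(value, Future) else value

def app(func):
    """Make calls to func return a future immediately. Future arguments
    are awaited inside the worker, so passing one task's future into
    another task implicitly encodes a dependency edge."""
    def submit(*args):
        return _pool.submit(lambda: func(*[_resolve(a) for a in args]))
    return submit

@app
def square(x):
    return x * x

@app
def add(a, b):
    return a + b

# square(3) and square(4) can run concurrently; add waits for both.
total = add(square(3), square(4))
result = total.result()
_pool.shutdown()
```

In this simplification a waiting task occupies a worker thread until its inputs are ready, so the pool must be larger than the longest chain of blocked tasks. A production scheduler would instead defer launching a task until its inputs have completed.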
ISBN: 9781450362252 (print)
In this tutorial, participants learn how to build their own parallel programming language features by developing them as language extensions in the ableC [4] extensible C compiler framework. By implementing new parallel programming abstractions as language extensions, one can build on an existing host language and thus avoid re-implementing common language features such as type checking and code generation for arithmetic expressions and control flow statements. Using ableC, one can build expressive language features that fit seamlessly into the C11 host language.
ISBN: 9781450372176 (print)
Concurrent and parallel programming (CPP) skills are increasingly important in today's world of parallel hardware. However, the conceptual leap from deterministic sequential programming to CPP is notoriously challenging to make. Our educational game Parallel is designed to support the learning of core CPP concepts through a game-based learning approach, focusing on the connection between gameplay and CPP. In a 10-week user study (n = 25) in an undergraduate concurrent programming course, the first empirical study of a CPP educational game, our results show that Parallel supports both CPP knowledge and student engagement. Furthermore, we provide a new framework to describe the design space for programming games in general.
ISBN: 9781538655559 (print)
The stream processing paradigm is used in several scientific and enterprise applications to continuously compute results from data items arriving from sources such as sensors. Fully exploiting the potential parallelism offered by current heterogeneous multi-cores equipped with one or more GPUs is still a challenge in the context of stream processing applications. In this work, our main goal is to present the parallel programming challenges that the programmer faces when exploiting CPU and GPU parallelism at the same time using traditional programming models. We highlight the parallelization methodology in two use cases (the Mandelbrot Streaming benchmark and PARSEC's Dedup application) to demonstrate the issues and benefits of using heterogeneous parallel hardware. The experiments demonstrate how a high-level parallel programming model targeting stream processing, like the one offered by SPar, can reduce the programming effort while still offering good performance compared with state-of-the-art programming models.