Simulation of in vivo cellular processes with the reaction-diffusion master equation (RDME) is a computationally expensive task. Our previous software enabled simulation of inhomogeneous biochemical systems for small ...
详细信息
Simulation of in vivo cellular processes with the reaction-diffusion master equation (RDME) is a computationally expensive task. Our previous software enabled simulation of inhomogeneous biochemical systems for small bacteria over long time scales using the MPD-RDME method on a single GPU. Simulations of larger eukaryotic systems exceed the on-board memory capacity of individual GPUs, and long time simulations of modest-sized cells such as yeast are impractical on a single GPU. We present a new multi-GPU parallel implementation of the MPD-RDME method based on a spatial decomposition approach that supports dynamic load balancing for workstations containing GPUs of varying performance and memory capacity. We take advantage of high-performance features of CUDA for peer-to-peer GPU memory transfers and evaluate the performance of our algorithms on state-of-the-art GPU devices. We present parallel efficiency and performance results for simulations using multiple GPUs as system size, particle counts, and number of reactions grow. We also demonstrate multi-GPU performance in simulations of the Min protein system in E. coli. Moreover, our multi-GPU decomposition and load balancing approach can be generalized to other lattice-based problems. (C) 2014 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/3.0/).
This paper outlines two approaches to introducing parallelcomputing to the R statistical computing environment. The first, approach is based on implicitly parallelizing basic R operations, such as vectorized arithmet...
详细信息
ISBN:
(纸本)9783790820836
This paper outlines two approaches to introducing parallelcomputing to the R statistical computing environment. The first, approach is based on implicitly parallelizing basic R operations, such as vectorized arithmetic operations;this is suitable for taking advantage of multi-core processors with shared memory. The second approach is based on developing a small set of explicit parallel computation directives and is most useful in a distributedmemory framework.
The HIRLAM (high resolution limited area modelling) limited-area atmospheric model was originally developed and optimized for shared memory vector-based computers, and has been used for operational weather forecasting...
详细信息
The HIRLAM (high resolution limited area modelling) limited-area atmospheric model was originally developed and optimized for shared memory vector-based computers, and has been used for operational weather forecasting on such machines for several years. This paper describes the algorithms applied to obtain a highly parallel implementation of the model, suitable for distributedmemory machines. The performance results presented indicate that the parallelization effort has been successful, and the Norwegian Meteorological Institute will run the parallel version in production on a Gray T3E. (C) 1997 Elsevier Science B.V.
暂无评论