ISBN: 9780819491299 (print)
The trend towards computers with multiple processing units continues with no end in sight. Modern consumer computers come with 2 to 6 processing units. Programming methods have been unable to keep up with this rapid development. In this paper we present a framework that uses a dataflow model for parallel processing: the Generic Parallel Rapid Development Toolkit, GePaRDT. This intuitive programming model eases the concurrent use of many processing units without requiring specialized knowledge of parallel programming methods and their pitfalls.
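As a rough illustration of the dataflow idea in general, the sketch below runs each node of a small graph as soon as all of its inputs are available, so independent nodes execute concurrently. It does not reproduce GePaRDT's interface, which the abstract does not describe; `run_dataflow`, the graph layout, and the node names are hypothetical and chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_dataflow(graph, sources, max_workers=4):
    """Evaluate a dataflow graph (hypothetical helper, not GePaRDT's API).

    graph:   maps node name -> (function, list of input node names)
    sources: maps node name -> initial value
    Nodes whose inputs are all available are submitted together and run
    concurrently; the loop then waits for that wave before looking for
    newly ready nodes.
    """
    results = dict(sources)
    pending = dict(graph)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            ready = [
                name
                for name, (_, deps) in pending.items()
                if all(dep in results for dep in deps)
            ]
            if not ready:
                raise ValueError("cycle or missing input in graph")
            futures = {}
            for name in ready:
                func, deps = pending.pop(name)
                futures[name] = pool.submit(func, *(results[dep] for dep in deps))
            for name, future in futures.items():
                results[name] = future.result()
    return results


# "sum" and "prod" depend only on the sources, so they can run in parallel;
# "out" runs once both of them have produced a value.
example_graph = {
    "sum": (lambda a, b: a + b, ["a", "b"]),
    "prod": (lambda a, b: a * b, ["a", "b"]),
    "out": (lambda s, p: s - p, ["sum", "prod"]),
}
example_results = run_dataflow(example_graph, {"a": 3, "b": 4})
```

The user only declares what each node consumes; the scheduling order falls out of the data dependencies, which is the property that makes the model approachable without explicit thread management.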
Molecular Dynamics (MD) has been and continues to be a popular method of molecular simulation because it is easily parallelizable. Parallel programming has become less burdensome for the science community, and competition in MD algorithm development has given MD a leading position in molecular, bio-system, materials, and nano-system simulation. In contrast, inherently serial Monte Carlo (MC) methods have been largely ignored in recent advances in parallel computing technology. This trend persists even though MC methods based on statistical mechanics principles are superior for studying thermodynamic properties such as entropy and free energy. In my dissertation I present a means of parallelizing MC molecular simulation so that, in time, the popularity of MC may be restored to that of MD. The Adaptive Tempering Monte Carlo method (ATMC) employs the Metropolis MC (MMC) sampling criterion; therefore, both ATMC and MMC are inherently serial algorithms. ATMC is a multicanonical ensemble algorithm that optimizes system configuration by searching for the most ordered state. This algorithm was developed by Dong and Blaisten-Barojas in 2006. My algorithm accelerates ATMC and MMC in a novel implementation exploiting state-of-the-art parallel processing technology, namely NVIDIA® Compute Unified Device Architecture (CUDA™) Graphics Processing Units (GPUs). My implementation source code is written in CUDA C, NVIDIA's extension to the C programming language for parallel programming, and compiled by NVCC, NVIDIA's CUDA C compiler, version 4.0. My CUDA GPU-accelerated implementation is verified against a 2010 study by Dai and Blaisten-Barojas of pyrrole oligomers (specifically, 12-Py chains), a material of interest for its applications in artificial muscles, actuators, and chemical remediation, among others. This previous study put forward a partially coarse-grained model potential for reduced pyrrole oligomers at the polypyrrole experimental density. I introduced
Mathematical optimization is widely used to improve, for instance, the efficiency of energy systems. A simulation approach based on partial differential equations typically cannot be formulated as an optimization problem and therefore requires an interface to an external optimization environment. This also holds for the programming language Modelica, among others. Because of high computation times, such coupled approaches are often limited to small-scale optimization problems. As simulation models become more complex, simulation time and, in turn, the associated optimization time rise significantly. To enable proper sampling of the search space, individual optimization runs need to be solved in acceptable times. This paper addresses the search for a suitable optimization approach and tool to couple with Modelica/Dymola. The optimization is carried out on an exemplary power plant model from the ClaRa-Library using an evolutionary algorithm (SPEA2-based) with Ansys optiSLang. To verify and evaluate the results, a comparison with the standard Dymola optimization library is performed. Both parallelization and indirect optimization with surrogate models achieved a significant runtime reduction, by a factor of up to 5.4. The use of metamodels is particularly advantageous for repeated optimization runs of the same optimization problem but may lead to deviations due to the approximations involved.
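The trade-off between surrogate speed and approximation error can be seen in a toy setting. The sketch below assumes a one-dimensional problem and a quadratic surrogate fitted through three samples; `expensive_simulation` is a hypothetical stand-in for a slow model run, and none of this reflects the optiSLang or Dymola workflow used in the paper.

```python
def expensive_simulation(x):
    """Hypothetical stand-in for a slow simulation run (assumed objective)."""
    return (x - 2.0) ** 2 + 0.1 * x ** 3


def fit_quadratic(xs, ys):
    """Return (a, b, c) of the parabola a*x^2 + b*x + c through three points."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    # Lagrange interpolation weights, expanded into monomial coefficients.
    d0 = y0 / ((x0 - x1) * (x0 - x2))
    d1 = y1 / ((x1 - x0) * (x1 - x2))
    d2 = y2 / ((x2 - x0) * (x2 - x1))
    a = d0 + d1 + d2
    b = -(d0 * (x1 + x2) + d1 * (x0 + x2) + d2 * (x0 + x1))
    c = d0 * x1 * x2 + d1 * x0 * x2 + d2 * x0 * x1
    return a, b, c


def surrogate_minimum(xs):
    """Sample the expensive model at xs and minimize the fitted parabola."""
    ys = [expensive_simulation(x) for x in xs]
    a, b, _ = fit_quadratic(xs, ys)
    if a <= 0:
        raise ValueError("surrogate is not convex; choose other samples")
    return -b / (2.0 * a)  # vertex of the parabola


# Three expensive evaluations, then the optimum is read off the cheap surrogate.
x_surrogate = surrogate_minimum([0.0, 2.0, 4.0])
```

Only three calls to the expensive model are needed, but the surrogate's minimizer does not coincide exactly with the minimizer of the underlying function, which is the kind of deviation the abstract warns about.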
Authors: Liu, Zhouding; Li, Jia
Affiliations: NYU, Coll Arts & Sci, New York, NY 10003, USA; Univ Sydney, Econ Arts & Social Sci, City Rd, Camperdown, NSW 2006, Australia
A growing number of scalable and distributed methods are required to simulate complex events effectively as computing needs in the research and industrial sectors keep growing. To address this need, this study proposes a novel approach for developing and accessing mathematically modeled methods in heterogeneous computing clusters. The suggested methodology uses a DRL-based parallel computational model to evaluate heterogeneous computing clusters. The algorithm uses parallelization methods to split the processing load among several nodes, supporting the variety of topologies seen in contemporary computing clusters. By using heterogeneous hardware such as CPUs, GPUs, and acceleration devices, the architecture seeks to maximize speed and minimize resource usage. To evaluate the effectiveness of the proposed approach, a comprehensive performance assessment is conducted. The evaluation encompasses scalability analysis, benchmarking, and comparisons against traditional homogeneous computing setups. The research investigates the impact of algorithm design choices on the efficiency and speed achieved in diverse computing environments.
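To make the idea of splitting load across unequal nodes concrete, the sketch below uses a simple static rule: each node receives work in proportion to an assumed relative speed. This is a much simpler technique than the DRL-based model named in the abstract and is not the authors' algorithm; `split_work`, the node names, and the speed values are hypothetical.

```python
def split_work(n_items, speeds):
    """Split n_items among nodes in proportion to integer relative speeds.

    speeds maps node name -> positive integer throughput estimate
    (assumed values, e.g. from a prior benchmark).
    """
    total = sum(speeds.values())
    # Integer floor of each node's proportional share.
    shares = {node: n_items * speed // total for node, speed in speeds.items()}
    # Flooring can leave a few items unassigned; give one each to the
    # fastest nodes so that every item is placed.
    leftover = n_items - sum(shares.values())
    fastest_first = sorted(speeds, key=speeds.get, reverse=True)
    for node in fastest_first[:leftover]:
        shares[node] += 1
    return shares


# Hypothetical cluster: the GPU is assumed four times as fast as the CPU.
example_shares = split_work(100, {"cpu": 1, "gpu": 4, "accelerator": 3})
```

A static split like this ignores run-time effects such as communication cost or changing load, which is where a learned scheduling policy could plausibly do better.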