Developing efficient numerical programs for large distributed parallel computers is a challenging problem. Many programming languages, systems and libraries exist and evolve to help with it, yet the problem is far from solved. Studies of particular application implementations are of interest because they reveal the actual capabilities of a system in real computations. In this paper, the implementation of an indicative 3D model heat equation parallel solver using fragmented programming technology and the LuNA system is investigated. A comparative test against a conventional MPI implementation is presented. The pros and cons of the approach are analyzed for the corresponding class of applications.
ISBN: (print) 9783319629322; 9783319629315
The paper addresses the problem of reducing the complexity of developing numerical parallel programs for distributed memory computers with hybrid (CPU+GPU) computing nodes. The basic idea is to employ a high-level representation of an application algorithm so that it can be executed automatically on multicomputers with hybrid nodes, without the programmer having to do low-level programming. LuNA is a programming system for numerical algorithms that implements this idea, but only for CPUs. In the paper we propose a LuNA language extension, as well as the necessary run-time algorithms, to support GPU utilization. The user only has to provide a limited number of computational GPU procedures written in CUDA, while the system takes care of the associated low-level problems, such as job scheduling, CPU-GPU data transfer, network communications and others. The algorithms developed and implemented take advantage of the informational dependencies of an application and support automated tuning to the available hardware configuration and application input data.