ISBN: (Print) 9781665432818
Since the mid-1990s, message passing libraries have been the most widely used technology for implementing parallel and distributed scientific applications. However, they may not be efficient enough on exascale machines, where scalability issues are expected to appear as the amount of computing resources grows. Task-based programming models can be used to avoid collective communications such as reductions, broadcasts, or gathers by transforming them into multiple operations on tasks. These operations can then be scheduled by the runtime scheduler, which places data and computations so as to reduce data communications. These properties could help to address some MPI and exascale computing challenges. Oil and gas applications could also benefit from the properties of task-based programming. We developed a simplified version of Kirchhoff seismic pre-stack depth migration, a subsurface exploration application, to experiment with HPX, a task-based programming model, as well as with MPI and MPI+OpenMP. We then performed strong scaling and weak scaling experiments on Pangea, Total's supercomputer. We also studied the effect of varying the number of OpenMP threads per MPI process. We show that current task-based programming model schedulers lack the capability to fully manage the memory used and are not efficient enough at reducing data migrations.
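The idea of replacing a collective reduction with many small tasks can be illustrated with a minimal sketch. It is not the authors' code and does not use HPX; it assumes Python's standard `concurrent.futures` thread pool as a stand-in task runtime, and the name `tree_reduce` is hypothetical. Each pairwise combine is submitted as its own task, so the scheduler decides where and when it runs.

```python
from concurrent.futures import ThreadPoolExecutor
from operator import add


def tree_reduce(values, op=add, max_workers=4):
    """Reduce `values` with `op` using a tree of pairwise tasks.

    Illustrative stand-in for a task-based reduction; a real task
    runtime would chain futures instead of waiting level by level.
    """
    items = list(values)
    if not items:
        raise ValueError("tree_reduce needs at least one value")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(items) > 1:
            # Submit one independent task per adjacent pair.
            futures = [
                pool.submit(op, items[i], items[i + 1])
                for i in range(0, len(items) - 1, 2)
            ]
            # An unpaired last element is carried to the next level.
            carried = [items[-1]] if len(items) % 2 else []
            items = [f.result() for f in futures] + carried
    return items[0]
```

Because pairs are always adjacent and order is preserved, the sketch also works for associative but non-commutative operators. It synchronizes after every tree level, which a dataflow-style runtime would typically avoid.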
ISBN: (Print) 188084348X
Complex Computational Fluid Dynamics (CFD) flow simulations are extremely expensive in terms of CPU time and memory. In this study, parallel computing and grid adaptation techniques are employed to achieve high efficiency and accuracy in a new hybrid unstructured flow solver. Adaptive local grid refinement and coarsening cause an unequal distribution of workload among the processors at run time. A simple and effective dynamic load balancing scheme based on repartitioning and remapping, named RARB, has been developed and integrated into the flow solver to solve the load imbalance problem. Once a load imbalance is detected, a modified Recursive Coordinate Bisection (RCB) partition algorithm, chosen for its simplicity and efficiency, is used to repartition the computational domain. Two heuristic rules are used to remap the newly partitioned sub-domains to the processors at a lower data communication cost. Task migration from overloaded processors to underloaded processors is handled in parallel by a multi-level granularity procedure. Experiments conducted on a cluster of PCs demonstrate the high efficiency and accuracy of the flow solver on complex flow computations, and the effectiveness of RARB in handling the load imbalance caused by grid adaptation.
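For readers unfamiliar with Recursive Coordinate Bisection, the following is a minimal sketch of the textbook algorithm, not the modified variant used in RARB, whose details the abstract does not give. The function name `rcb` and the representation of cells as weighted points are assumptions made for illustration. The domain is cut along its longest coordinate axis so that each side receives a share of the total weight proportional to the number of parts it will hold, and the procedure recurses on each side.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def rcb(points: Sequence[Point], weights: Sequence[float],
        n_parts: int) -> List[List[int]]:
    """Split point indices into n_parts groups of similar total weight."""

    def split(idx: List[int], parts: int) -> List[List[int]]:
        if parts == 1 or len(idx) <= 1:
            # Nothing left to cut; pad with empty parts if needed.
            return [idx] + [[] for _ in range(parts - 1)]
        dims = len(points[idx[0]])
        # Cut along the axis with the largest spatial extent.
        axis = max(
            range(dims),
            key=lambda d: max(points[i][d] for i in idx)
            - min(points[i][d] for i in idx),
        )
        ordered = sorted(idx, key=lambda i: points[i][axis])
        left_parts = parts // 2
        # Weight the left side should receive.
        target = sum(weights[i] for i in ordered) * left_parts / parts
        acc, cut = 0.0, 0
        while cut < len(ordered) - 1 and acc + weights[ordered[cut]] <= target:
            acc += weights[ordered[cut]]
            cut += 1
        cut = max(cut, 1)  # keep both sides non-empty
        return (split(ordered[:cut], left_parts)
                + split(ordered[cut:], parts - left_parts))

    return split(list(range(len(points))), n_parts)
```

In a load balancing setting, the weights would reflect the per-cell cost after refinement or coarsening, so rerunning the partitioner on updated weights yields a new, more even decomposition.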