The embarrassinglyparallel (EP) algorithm which is typical of many Monte Carlo applications provides an estimate of the upper achievable limits for double precision performance of parallel supercomputers. Recently, I...
详细信息
ISBN:
(纸本)9781479920303
The embarrassinglyparallel (EP) algorithm which is typical of many Monte Carlo applications provides an estimate of the upper achievable limits for double precision performance of parallel supercomputers. Recently, Intel released Many Integrated Core (MIC) architecture as a many-core co-processor. MIC often offers more than 50 cores each of which can run four hardware threads as well as 512-bit vector instructions. In this paper, we describe how the EP algorithm is accelerated effectively on the platforms containing MIC using the offload execution model. The result shows that the efficient implementation of EP algorithm on MIC can take full advantage of MIC's computational resources and achieves a speedup of 3.06 compared with that on Intel Xeon E5-2670 CPU. Based on the EP algorithm on MIC and an effective task distribution model, the implementation of EP algorithm on a CPU-MIC heterogeneous platform achieves the performance of up to 2134.86 Mop/s and 4.04 times speedup compared with that on Intel Xeon E5-2670 CPU.
The embarrassinglyparallel(EP) algorithm which is typical of many Monte Carloapplications provides an estimate of the upper achievable limits for double precision performance of parallel supercomputers. Recently, Int...
详细信息
The embarrassinglyparallel(EP) algorithm which is typical of many Monte Carloapplications provides an estimate of the upper achievable limits for double precision performance of parallel supercomputers. Recently, Intel released Many Integrated Core(MIC) architecture as a many-core co-processor. MIC often offers more than 50 cores each of which can run four hardware threads as well as 512-bit vector instructions. In this paper,we describe how the EP algorithm is accelerated effectively on the platforms containing MIC using the offload execution model. The result shows that the efficientimplementation of EP algorithm on MIC can take full advantage of MIC's computational resources and achieves a speedup of 3.06 compared with that on Intel Xeon E5-2670 CPU. Based on the EP algorithm on MIC and an effective task distribution model, the implementation of EP algorithm on a CPU-MIC heterogeneous platform achieves the performance of up to2134.86 Mop/s and 4.04 times speedup compared with that on Intel Xeon E5-2670 CPU.
暂无评论