The growing demand for semiconductor device simulation poses a major challenge for large-scale electronic structure ***. Among various methods, the linearly scaling three-dimensional fragment (LS3DF) method exhibits excellent scalability in large-scale ***. Based on algorithmic and system-level optimizations, we propose a highly scalable and highly efficient implementation of LS3DF on a domestic heterogeneous supercomputer equipped with ***. In terms of algorithmic optimizations, the original all-band conjugate gradient algorithm is refined to achieve faster convergence, and mixed-precision computing is adopted to increase overall ***. In terms of system-level optimizations, the original two-layer parallel structure is replaced by a coarse-grained parallel *** strategies such as multi-stream, kernel fusion, and redundant computation removal are proposed to further increase utilization of the computational power provided by the heterogeneous ***. As a result, our optimized LS3DF can scale to a system of 10 million silicon atoms, attaining a peak performance of 34.8 PFLOPS (21.2% of the peak). All the improvements can be adapted to next-generation supercomputers for larger simulations.
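The two performance figures quoted above fix a third quantity: the theoretical peak against which the 21.2% is measured. The short sketch below only does that arithmetic. It assumes that "21.2% of the peak" means the theoretical peak of the machine partition used for the run, which the abstract does not state explicitly. The function name `implied_peak` is hypothetical, and because both inputs are rounded the result is approximate.

```python
# Figures quoted in the abstract (both are rounded values).
achieved_pflops = 34.8     # reported performance, in PFLOPS
fraction_of_peak = 0.212   # 21.2% of the peak

def implied_peak(achieved: float, fraction: float) -> float:
    """Return the peak rate implied by an achieved rate and its efficiency.

    Hypothetical helper for illustration; it is not part of LS3DF.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return achieved / fraction

peak_pflops = implied_peak(achieved_pflops, fraction_of_peak)
print(f"implied peak: {peak_pflops:.1f} PFLOPS")  # prints "implied peak: 164.2 PFLOPS"
```

Under that assumption the run used hardware with a theoretical peak of roughly 164 PFLOPS.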