Density matrix purification methods should be employed when dealing with complex chemical systems containing many atoms. The running times of these methods scale linearly with the number of atoms if the sparsity of the density matrix is exploited. Since the efficiency expected from these methods is closely tied to the underlying parallel implementations of the linear algebra operations (e.g., P^2 = P × P), we proposed a central processing unit (CPU) and graphics processing unit (GPU) parallel matrix-matrix multiplication in SVBR (symmetrical variable block row) format for energy calculations through the SP2 algorithm. This algorithm was inserted into MOPAC's MOZYME method, using the original localized molecular orbital (LMO) Fock matrix assembly and the atomic integral calculation implemented in it. Correctness and performance tests show that the implemented SP2 is accurate and fast: the GPU version achieves speedups of up to 40 times over the single-threaded version for a water cluster system with 42,312 orbitals running on one NVIDIA K40 GPU card. The GPU-accelerated SP2 algorithm using the MOZYME LMO framework enables the calculation of semiempirical wavefunctions with stricter SCF criteria for localized charged molecular systems, as well as single-point energies of molecules with more than 100,000 LMOs in less than 1 h.
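For orientation, the SP2 (second-order spectral projection) purification mentioned above can be sketched in a few lines. This is a minimal dense, pure-Python illustration under stated assumptions, not the SVBR or GPU implementation of the abstract: the helper names (`matmul`, `trace`, `gershgorin_bounds`, `sp2_density_matrix`), the Gershgorin spectral bounds, the "closer trace" branch rule, and the tolerance are all choices made here for illustration, and the Hamiltonian is assumed to be symmetric with a gap between occupied and virtual states.

```python
def matmul(a, b):
    """Dense matrix product; this is the P x P kernel that dominates the cost."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def gershgorin_bounds(h):
    """Cheap lower and upper bounds on the eigenvalues of a symmetric matrix."""
    n = len(h)
    radii = [sum(abs(h[i][j]) for j in range(n) if j != i) for i in range(n)]
    return (min(h[i][i] - radii[i] for i in range(n)),
            max(h[i][i] + radii[i] for i in range(n)))


def sp2_density_matrix(h, n_occ, max_iter=100, tol=1e-10):
    """Approximate the projector onto the n_occ lowest eigenstates of h."""
    n = len(h)
    e_min, e_max = gershgorin_bounds(h)
    # Map the spectrum into [0, 1] so that low-energy states end up near 1.
    x = [[((e_max if i == j else 0.0) - h[i][j]) / (e_max - e_min)
          for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        x2 = matmul(x, x)
        tr_x, tr_x2 = trace(x), trace(x2)
        # trace(X - X^2) vanishes once every eigenvalue is 0 or 1.
        if abs(tr_x - tr_x2) < tol:
            break
        # Pick X^2 or 2X - X^2, whichever brings the trace closer to n_occ.
        if abs(tr_x2 - n_occ) < abs(2.0 * tr_x - tr_x2 - n_occ):
            x = x2
        else:
            x = [[2.0 * x[i][j] - x2[i][j] for j in range(n)] for i in range(n)]
    return x


# Toy example: a dimerized 4-site chain with two occupied states.
h_chain = [[0.0, -1.0, 0.0, 0.0],
           [-1.0, 0.0, -0.5, 0.0],
           [0.0, -0.5, 0.0, -1.0],
           [0.0, 0.0, -1.0, 0.0]]
p_chain = sp2_density_matrix(h_chain, n_occ=2)
```

Each iteration needs only one matrix-matrix product, which is why the performance of that kernel, and of the sparse format behind it, determines the overall running time.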
Traditional algorithms for first-principles molecular dynamics (FPMD) simulations only gain a modest capability increase from current petascale computers, due to their O(N^3) complexity and their heavy use of global communications. To address this issue, we are developing a truly scalable O(N) complexity FPMD algorithm, based on density functional theory (DFT), which avoids global communications. The computational model uses a general nonorthogonal orbital formulation for the DFT energy functional, which requires knowledge of selected elements of the inverse of the associated overlap matrix. We present a scalable algorithm for approximately computing selected entries of the inverse of the overlap matrix, based on an approximate inverse technique, by inverting local blocks corresponding to principal submatrices of the global overlap matrix. The new FPMD algorithm exploits sparsity and uses nearest neighbor communication to provide a computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic orbitals are confined, and a cutoff beyond which the entries of the overlap matrix can be omitted when computing selected entries of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to O(100K) atoms on O(100K) processors, with a wall-clock time of O(1) minute per molecular dynamics time step.
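The idea of approximating selected entries of the inverse overlap matrix from local principal submatrices can be illustrated with a small sketch. It rests on assumptions that are not in the abstract: orbitals are ordered along a 1D chain, the cutoff is expressed as an index distance `radius` rather than a spatial distance, and the function names `solve` and `approx_inverse_entries` are hypothetical. For each column i, only the block of the overlap matrix within `radius` of i is inverted, so each column can be computed independently from nearby data.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def approx_inverse_entries(s, radius):
    """Approximate entries (p, i) of the inverse of s for |p - i| <= radius.

    Column i is obtained by solving against the principal submatrix of s
    restricted to indices within `radius` of i, not against the full matrix.
    """
    n = len(s)
    entries = {}
    for i in range(n):
        local = list(range(max(0, i - radius), min(n, i + radius + 1)))
        s_loc = [[s[p][q] for q in local] for p in local]
        rhs = [1.0 if p == i else 0.0 for p in local]
        for p, value in zip(local, solve(s_loc, rhs)):
            entries[(p, i)] = value
    return entries


# Toy overlap matrix: unit diagonal, nearest-neighbor overlap of 0.2.
n_orb = 12
s_toy = [[1.0 if i == j else 0.2 if abs(i - j) == 1 else 0.0
          for j in range(n_orb)] for i in range(n_orb)]
approx = approx_inverse_entries(s_toy, radius=3)
```

Because the entries of the inverse of a well-conditioned, banded overlap matrix decay quickly away from the diagonal, enlarging `radius` trades extra local work for accuracy, which mirrors the role of the cutoff described in the abstract.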
In this review we discuss recent progress, especially by our group, on linear-scaling algorithms for electronic structure calculations with numerical atomic basis sets. The principles of the construction of numerical basis sets and of the Hamiltonian are introduced first. We then discuss how to solve the single-electron equation self-consistently, and how to obtain electronic properties via post-self-consistent-field processes in a linear-scaling way. Linear-scaling linear response calculations are also introduced. Numerical implementation is emphasized, with some applications presented for demonstration purposes.