The locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm is a popular approach for computing a few smallest eigenvalues and the corresponding eigenvectors of a large Hermitian positive definite matrix A. In this work, we propose a mixed-precision variant of LOBPCG that uses a (sparse) Cholesky factorization of A computed in lower precision as the preconditioner. To further enhance performance, a mixed-precision orthogonalization strategy is proposed. To analyze the impact of reducing precision in the preconditioner on performance, we carry out a rounding error and convergence analysis of PINVIT, a simplified variant of LOBPCG. Our theoretical results predict and our numerical experiments confirm that the impact on convergence remains marginal. In practice, our mixed-precision LOBPCG algorithm typically reduces the computation time by a factor of 1.4 to 2.0 on both CPUs and GPUs.
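To make the role of the low-precision preconditioner concrete, here is a minimal single-vector PINVIT sketch in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function name `pinvit_smallest`, the dense float32 Cholesky factor, the absolute residual stopping test, and the tolerance are all choices made for this example, and the block (LOBPCG) and sparse aspects of the paper are left out.

```python
import numpy as np

def pinvit_smallest(A, x0, tol=1e-10, max_iter=200):
    """Hypothetical helper: single-vector PINVIT for the smallest eigenpair
    of a symmetric positive definite A, preconditioned with a float32
    Cholesky factor of A."""
    # Factor A once in single precision: A ~= L @ L.T (lower precision).
    L = np.linalg.cholesky(A.astype(np.float32))
    x = x0 / np.linalg.norm(x0)
    for k in range(max_iter):
        Ax = A @ x
        rho = x @ Ax                 # Rayleigh quotient (x has unit norm)
        r = Ax - rho * x             # residual, computed in double precision
        if np.linalg.norm(r) <= tol:
            break
        # Apply the preconditioner in float32: solve L L^T w = r.
        # (np.linalg.solve is a general solver; a triangular solver
        # would normally be used here.)
        y = np.linalg.solve(L, r.astype(np.float32))
        w = np.linalg.solve(L.T, y)
        # PINVIT update x <- x - T^{-1} r, carried out in double precision.
        x = x - w.astype(np.float64)
        x /= np.linalg.norm(x)
    return rho, x, k

# Example: 1D discrete Laplacian, whose eigenvalues are known in closed form.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
rho, x, iters = pinvit_smallest(A, np.ones(n))
```

Because the residual is always recomputed in double precision, the float32 factor only perturbs the search direction, which is consistent with the abstract's claim that the effect on convergence stays marginal for reasonably conditioned problems.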
We propose a mixed-precision Jacobi algorithm for computing the singular value decomposition (SVD) of a dense matrix. After appropriate preconditioning, the proposed algorithm computes the SVD in a lower precision as an initial guess and then performs one-sided Jacobi rotations in the working precision as iterative refinement. By carefully transforming a lower precision solution to a higher precision one, our algorithm achieves about 2x speedup on the x86-64 architecture compared to the usual one-sided Jacobi SVD algorithm in LAPACK, without sacrificing accuracy.
CCS Concepts: • Mathematics of computing → Computations on matrices
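The two-stage idea (a low-precision SVD as the initial guess, then one-sided Jacobi rotations in working precision) can be sketched in NumPy as follows. This is a rough illustration and not the authors' algorithm: the names `jacobi_sweeps` and `mixed_precision_svd` are hypothetical, the QR re-orthonormalization used to lift the float32 right singular vectors to float64 is an assumption standing in for the paper's transformation step, the preconditioning stage is omitted, and the input is assumed to be a real m-by-n matrix with m >= n and full column rank.

```python
import numpy as np

def jacobi_sweeps(B, V, tol=1e-13, max_sweeps=30):
    """One-sided (Hestenes) Jacobi: rotate column pairs of B in place until
    they are mutually orthogonal, accumulating the rotations in V.
    Returns the number of sweeps that applied at least one rotation."""
    n = B.shape[1]
    for sweep in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = B[:, p] @ B[:, p]
                beta = B[:, q] @ B[:, q]
                gamma = B[:, p] @ B[:, q]
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal enough
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                if zeta == 0.0:
                    t = 1.0
                else:
                    t = np.sign(zeta) / (abs(zeta) + np.sqrt(1.0 + zeta * zeta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                for M in (B, V):  # apply the same rotation to B and V
                    mp = M[:, p].copy()
                    M[:, p] = c * mp - s * M[:, q]
                    M[:, q] = s * mp + c * M[:, q]
        if not rotated:
            return sweep
    return max_sweeps

def mixed_precision_svd(A):
    """Hypothetical sketch: float32 SVD as initial guess, float64 Jacobi refinement."""
    # Stage 1: SVD in lower precision (float32).
    _, _, Vt32 = np.linalg.svd(A.astype(np.float32), full_matrices=False)
    # Assumption: lift V to float64 and re-orthonormalize it with QR.
    V, _ = np.linalg.qr(Vt32.T.astype(np.float64))
    # Stage 2: the columns of A @ V are nearly orthogonal; refine in float64.
    B = A @ V
    sweeps = jacobi_sweeps(B, V)
    sigma = np.linalg.norm(B, axis=0)       # singular values = column norms
    order = np.argsort(sigma)[::-1]         # sort in descending order
    sigma, B, V = sigma[order], B[:, order], V[:, order]
    U = B / sigma                           # normalize columns to obtain U
    return U, sigma, V, sweeps

# Example on a small random matrix (fixed seed for reproducibility).
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 8))
U, sigma, V, sweeps = mixed_precision_svd(A)
```

Since the float32 factor already makes the columns of `A @ V` orthogonal to roughly single-precision accuracy, the float64 Jacobi stage should need only a few sweeps, which is the intuition behind using the low-precision result as a starting point.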