版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
作者机构:Karlsruhe Institute of Technology Department of Informatics Postfach 6980 Karlsruhe76128 Germany
出 版 物:《ACM Transactions on Parallel Computing》 (ACM Trans. Parallel Comp.)
年 卷 期:2015年第2卷第3期
页 面:1–29页
核心收录:
基 金:This work was partly supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Center "Invasive Computing" (SFB/TR 89). Authors’ address: P. Sanders, J. Speck, and R. Steffen, Karlsruhe Institute of Technology, Department of Informatics, Postfach 6980, 76128 Karlsruhe, Germany emails: {sanders, speck}@kit.edu. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or permissions@acm.org. ©c 2015 ACM 2329-4949/2015/09-ART15 $15.00 DOI: http://dx.doi.org/10.1145/2809812
主 题:Parallel algorithms
摘 要:We present an algorithm for inversion of symmetric positive definite matrices that combines the practical requirement of an optimal number of arithmetic operations and the theoretical goal of a polylogarithmic critical path length. The algorithm reduces inversion to matrix multiplication. It uses Strassen s recursion scheme, but on the critical path it breaks the recursion early, switching to an asymptotically inefficient yet fast use of Newton s method. We also show that the algorithm is numerically stable. Overall, we get a candidate for a massively parallel algorithm that scales to exascale systems even on relatively small inputs. © 2015 ACM 2329-4949/2015/09-ART15 $15.00