Given an m × n sparse matrix A having n_a nonzeros and a permutation matrix P, we consider the problem of finding (PA)^T in a space-saving manner. Two algorithms, TRANSPERM1 and TRANSPERM2, are presented. Both algorithms make use of the in-place inversion of permutation vectors. To save even more space, TRANSPERM1 forms a piecewise linear model of the nonzero distribution. The computational complexity of TRANSPERM2 is O(n_a + n + m). For a particularly treacherous nonzero distribution we derive an upper bound for the complexity of TRANSPERM1 of O(n_a + τ·n_a·n/n_l + m + n + n_l), where n_l is the number of intervals in the piecewise model and τ ≤ 1/2. These two algorithms were compared with Gustavson's efficient HALFPERM algorithm. We expected our algorithms to be slower than HALFPERM, since they were written with the intention of saving space. In tests on 21 matrices, HALFPERM was on average 2.5 times faster than TRANSPERM1 and 1.6 times faster than TRANSPERM2. However, for these 21 matrices, TRANSPERM1's storage requirements were on average only 51% of HALFPERM's, and TRANSPERM2's requirements were only 67% of HALFPERM's.
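The in-place inversion of permutation vectors that both algorithms rely on can be sketched with the classical cycle-following idea, which uses the sign bit of each (shifted) entry as a visited mark instead of an auxiliary array. This is a generic sketch of the technique, not the paper's code; the function name is our own.

```python
def invert_permutation_in_place(p):
    """Invert a 0-based permutation vector p in place, using the sign
    of each (1-shifted) entry as a 'visited' mark, so no extra O(n)
    workspace is needed -- a sketch of standard in-place inversion,
    not the paper's implementation."""
    n = len(p)
    for i in range(n):          # shift to 1-based so entries can carry a sign
        p[i] += 1
    for start in range(n):
        if p[start] < 0:        # already part of a processed cycle
            continue
        i = start
        j = p[i] - 1            # successor of i in the original permutation
        while j != start:       # walk the cycle, reversing each link
            k = p[j] - 1
            p[j] = -(i + 1)     # inverse: j now maps back to i (marked)
            i, j = j, k
        p[start] = -(i + 1)
    for i in range(n):          # clear marks and shift back to 0-based
        p[i] = -p[i] - 1
    return p
```

For example, the permutation [2, 0, 1] (0 → 2, 1 → 0, 2 → 1) inverts to [1, 2, 0], using only the vector itself as workspace.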
Solving a sparse system of linear equations Ax = b is one of the most fundamental operations inside any circuit simulator. The equations/rows in the matrix A are often rearranged/permuted before factorization and applying direct or iterative methods to obtain the solution. Permuting the rows of the matrix A so that the entries with large absolute values lie on the diagonal has several advantages, like better numerical stability for direct methods (e.g., Gaussian elimination) and faster convergence for indirect methods (such as the Jacobi method). Duff (2009) [3] has formulated this as a weighted bipartite matching problem (the MC64 algorithm). In this paper we improve the performance of the MC64 algorithm with a new labeling technique which improves the asymptotic complexity of updating dual variables from O(|V| + |E|) to O(|V|), where |V| is the order of the matrix A and |E| is the number of non-zeros. Experimental results from using the new algorithm, when benchmarked with both industry benchmarks and the UFL sparse matrix collection, are very promising. Our algorithm is more than 60 times faster (than Duff's algorithm) for sparse matrices with at least a million non-zeros. (C) 2010 Elsevier B.V. All rights reserved.
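The objective MC64 optimizes via weighted bipartite matching can be illustrated with a deliberately naive brute force: find the row permutation σ that maximizes the product of the diagonal magnitudes |A[σ(j)][j]|. This exponential sketch only demonstrates the objective; the real algorithm finds the same matching with shortest augmenting paths and dual variables.

```python
import itertools

def best_row_permutation(A):
    """Brute-force the row permutation sigma maximizing the product of
    |A[sigma[j]][j]| placed on the diagonal -- the objective of the
    MC64-style matching.  For illustration only (O(n!)); the actual
    algorithm solves this as a weighted bipartite matching problem."""
    n = len(A)
    best, best_prod = None, -1.0
    for sigma in itertools.permutations(range(n)):
        prod = 1.0
        for j in range(n):
            prod *= abs(A[sigma[j]][j])
        if prod > best_prod:
            best_prod, best = prod, list(sigma)
    return best

A = [[0.1, 9.0, 0.0],
     [8.0, 0.2, 0.0],
     [0.0, 0.0, 5.0]]
# Taking rows 1, 0, 2 puts 8.0, 9.0, 5.0 on the diagonal.
```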
High-performance computing systems continue to increase in size in the quest for ever higher performance. The resulting increased electronic component count, coupled with the decrease in feature sizes of the silicon manufacturing processes used to build these components, may result in future exascale systems being more susceptible to soft errors caused by cosmic radiation than current high-performance computing systems. Through the use of techniques such as hardware-based error-correcting codes and checkpoint-restart, many of these faults can be mitigated, but at the cost of increased hardware overhead, run-time, and energy consumption that can be as much as 10-20%. Some predictions expect these overheads to continue to grow over time. For extreme-scale systems, these overheads will represent megawatts of power consumption and millions of dollars of additional hardware costs, which could potentially be avoided with more sophisticated fault-tolerance techniques. In this paper we present new software-based fault-tolerance techniques that can be applied to one of the most important classes of software in high-performance computing: iterative sparse matrix solvers. Our new techniques enable us to exploit knowledge of the structure of sparse matrices in such a way as to improve the performance, energy efficiency, and fault tolerance of the overall solution.
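One standard software-based fault-detection idea for the sparse matrix-vector products at the heart of iterative solvers is a checksum invariant: precompute the column sums s = 1ᵀA once, and after each product y = Ax verify that sum(y) equals s·x. The sketch below illustrates that generic technique on a COO matrix; the paper's actual methods additionally exploit the matrix structure, which this toy does not model.

```python
def spmv_with_checksum(rows, cols, vals, n, x):
    """Checksum-protected sparse y = A x for an n x n matrix in COO form.
    A generic ABFT-style sketch, not the paper's technique: any single
    corrupted product term makes sum(y) disagree with s . x, where
    s = 1^T A is precomputed column sums."""
    s = [0.0] * n
    for c, v in zip(cols, vals):          # checksum row s = 1^T A
        s[c] += v
    y = [0.0] * n
    for r, c, v in zip(rows, cols, vals): # the SpMV itself
        y[r] += v * x[c]
    # Detection: two independent evaluations of 1^T A x must agree.
    lhs = sum(y)
    rhs = sum(si * xi for si, xi in zip(s, x))
    ok = abs(lhs - rhs) < 1e-9 * (1.0 + abs(rhs))
    return y, ok
```

The extra cost per product is one vector sum and one dot product, far cheaper than repeating the SpMV.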
Many sparse matrix algorithms (for example, solving a sparse system of linear equations) begin by predicting the nonzero structure of the output of a matrix computation from the nonzero structure of its input. This paper is a catalog of ways to predict nonzero structure. It contains known results for some problems, including various matrix factorizations, and new results for other problems, including some eigenvector computations.
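The simplest instance of this kind of prediction is symbolic matrix multiplication: the structure of C = AB follows from the structures of A and B alone, ignoring accidental numerical cancellation. A toy sketch of that idea, with structures as row-to-column-set dictionaries (representation and names are ours, not the paper's):

```python
def product_structure(A_struct, B_struct):
    """Predict the nonzero structure of C = A B from structures alone,
    assuming no accidental cancellation.  Structures map a row index to
    the set of column indices of its nonzeros; a toy illustration of
    symbolic (structure-only) prediction."""
    C_struct = {}
    for i, a_cols in A_struct.items():
        c_cols = set()
        for k in a_cols:                       # A[i,k] is nonzero...
            c_cols |= B_struct.get(k, set())   # ...so row k of B contributes
        if c_cols:
            C_struct[i] = c_cols
    return C_struct
```

For instance, if row 0 of A hits columns {0, 1} and rows 0, 1 of B hit columns {0} and {1, 2}, then row 0 of C is predicted to hit {0, 1, 2} before any numerical work is done.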
ISBN (print): 9781467365987
High-performance computing (HPC) systems continue to increase in size in the quest for ever higher performance. The resulting increased electronic component count, coupled with the decrease in feature sizes of the silicon manufacturing processes used to build these components, will result in future Exascale systems being more susceptible to soft errors caused by cosmic radiation than current HPC systems. Through the use of techniques such as hardware-based error-correcting codes (ECC) and checkpoint-restart, many of these faults can be mitigated, but at the cost of increased hardware overhead, run-time, and energy consumption that can be as much as 10-20%. For extreme-scale systems, these overheads will represent megawatts of power consumption and millions of dollars of additional hardware cost, which could potentially be avoided with more sophisticated fault-tolerance techniques. In this paper we present a new software-based fault-tolerance technique that can be applied to one of the most important classes of software in HPC: sparse matrix solvers. Our new technique enables us to exploit knowledge of the structure of sparse matrices in such a way as to improve the performance, energy efficiency, and fault tolerance of the overall solution.
SPEX Left LU is a software package for exactly solving unsymmetric sparse linear systems. As a component of the sparse exact (SPEX) software package, SPEX Left LU can be applied to any input matrix, A, whose entries are integral, rational, or decimal, and provides a solution to the system Ax = b, which is either exact or accurate to user-specified precision. SPEX Left LU preorders the matrix A with a user-specified fill-reducing ordering and computes a left-looking LU factorization with the special property that each operation used to compute the L and U matrices is integral. Notable additional applications of this package include benchmarking the stability and accuracy of state-of-the-art linear solvers and determining whether matrices that are singular to double precision are indeed singular. Computationally, this article evaluates the impact of several novel pivoting schemes in exact arithmetic, benchmarks the exact iterative solvers within LinBox, and benchmarks the accuracy of MATLAB sparse backslash. Most importantly, it is shown that SPEX Left LU outperforms the exact iterative solvers in run time on easy instances and in stability, as the iterative solver fails on a sizeable subset of the tested (both easy and hard) instances. The SPEX Left LU package is written in ANSI C, comes with a MATLAB interface, and is distributed via GitHub, as a component of the SPEX software package, and as a component of SuiteSparse.
The roundoff-error-free (REF) LU factorization, along with the REF forward and backward substitution algorithms, allows a rational system of linear equations to be solved exactly and efficiently. The REF LU factorization framework has two key properties: all operations are integral, and the size of each entry is bounded polynomially, a bound that rational-arithmetic Gaussian elimination achieves only via the use of computationally expensive greatest common divisor operations. This paper develops a sparse version of REF LU, termed the Sparse Left-looking Integer-Preserving (SLIP) LU factorization, which exploits sparsity while maintaining integrality of all operations. In addition, this paper derives a tighter polynomial bound on the size of entries in L and U and shows that the time complexity of SLIP LU is proportional to the cost of the arithmetic work performed. Last, SLIP LU is shown to significantly outperform a modern full-precision rational-arithmetic LU factorization approach on a set of real-world instances. In all, SLIP LU is a framework to efficiently and exactly solve sparse linear systems.
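The "all operations are integral" property can be seen in the classical fraction-free (Bareiss-style) elimination that integer-preserving factorizations build on: each update divides by the previous pivot, and that division is provably exact. The dense toy below sketches only this principle, not the sparse left-looking SLIP algorithm itself, and omits pivoting.

```python
def bareiss_eliminate(M):
    """Fraction-free (Bareiss-style) Gaussian elimination on an integer
    matrix: every intermediate entry stays an exact integer because the
    previous pivot is guaranteed to divide each updated numerator.  A
    dense sketch of the principle behind integer-preserving (REF/SLIP)
    factorizations; pivoting is omitted for brevity."""
    n = len(M)
    A = [row[:] for row in M]            # work on a copy
    prev_pivot = 1
    for k in range(n - 1):
        pivot = A[k][k]
        assert pivot != 0, "sketch assumes nonzero pivots"
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # Exact integer division (// leaves no remainder here).
                A[i][j] = (pivot * A[i][j] - A[i][k] * A[k][j]) // prev_pivot
            A[i][k] = 0
        prev_pivot = pivot
    return A                             # A[n-1][n-1] equals det(M)
```

A pleasant side effect is that the last entry of the eliminated matrix is exactly det(M), with no rational arithmetic or gcd computations anywhere.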
Exactly solving sparse symmetric positive definite (SPD) linear systems is a key problem in mathematics, engineering, and computer science. This paper derives two new sparse roundoff-error-free (REF) Cholesky factorization algorithms which exactly solve sparse SPD linear systems Ax = b, where A ∈ Q^{n×n} and x, b ∈ Q^{n×p}. The key properties of these factorizations are that (1) they exclusively use integer arithmetic and (2) in the bit-complexity model, they solve the linear system Ax = b in time proportional to the cost of the integer-arithmetic operations. Namely, the overhead related to data structures and ancillary operations (those not strictly required to perform the factorization) is subsumed by the cost of the integer-arithmetic operations that are essential/intrinsic to the factorization. Notably, to date our algorithms are the only exact algorithms for solving SPD linear systems with this asymptotically efficient complexity bound. Computationally, we show that the novel factorizations are faster than both sparse rational-arithmetic LDL and sparse exact LU factorization. Altogether, the derived sparse REF Cholesky factorizations present a framework to solve any rational SPD linear system exactly and efficiently.
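The rational-arithmetic baseline these factorizations are compared against can be sketched with exact Gaussian elimination over Python's Fraction type. The hidden cost here is the automatic gcd reduction inside every Fraction operation, which is precisely the expense the integer-preserving approach avoids. A dense toy sketch, not the paper's method:

```python
from fractions import Fraction

def exact_rational_solve(A, b):
    """Exact Gaussian elimination with back substitution over rationals.
    Illustrates the rational-arithmetic baseline: correct and exact, but
    every +,-,*,/ triggers a gcd reduction inside Fraction.  Dense toy
    sketch with partial pivoting; not the paper's algorithm."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)]
         for row, bi in zip(A, b)]       # augmented matrix [A | b]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))  # partial pivot
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):       # back substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x
```

For A = [[2, 1], [1, 3]] and b = [3, 5], this returns the exact solution x = (4/5, 7/5) with no rounding error.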
SPEX Cholesky, SPEX LDL, and SPEX Backslash are software packages for exactly solving sparse linear systems, Ax = b. SPEX Cholesky, used for symmetric positive definite (SPD) systems, computes an integral Cholesky factorization to solve the system Ax = b in time proportional to arithmetic work, to date the only algorithm for SPD linear systems with this property. SPEX LDL extends SPEX Cholesky to symmetric negative definite and symmetric indefinite matrices with exclusively non-zero leading principal minors. SPEX Backslash is a general-purpose exact solver that automatically determines the best ordering and factorization to exactly solve the system Ax = b. Computationally, we test the accuracy of MATLAB sparse backslash, the state-of-the-art collection of sparse matrix solvers, revealing it is near perfect for 87% of the tested instances. In addition, we show that SPEX Cholesky outperforms alternate exact solvers in runtime; specifically, SPEX Cholesky outperforms the exact solver LinBox and exact LU factorization on 70% and 92% of tested instances, respectively. Each of SPEX Cholesky, SPEX LDL, and SPEX Backslash is implemented in C and is accompanied by easy-to-use Python and MATLAB interfaces. They are distributed via GitHub, as a component of the SPEX software package, and as a component of SuiteSparse.
Applying a permuted diagonal similarity transform DPAP^T D^{-1} to a matrix A before calculating its eigenvalues can improve the speed and accuracy with which the eigenvalues are computed. This is often called balancing. This paper describes several balancing algorithms for sparse matrices and compares them against each other and the traditional dense algorithm. We first discuss our sparse implementation of the dense algorithm; our code is faster than the dense algorithm when the density of the matrix is no more than approximately 0.5, and is much faster for large, sparse matrices. We next describe a set of randomized balancing algorithms for matrices that are not given explicitly, i.e., given a vector x, we can compute only Ax and perhaps A^T x. We motivate these Krylov-based algorithms using Perron-Frobenius theory. Results are given comparing the Krylov-based algorithms to each other and to the sparse and dense direct balancing algorithms, looking at norm reduction, running times, and the accuracy of eigenvalues computed after a matrix is balanced. We conclude that sparse balancing algorithms are efficient preconditioners for eigensolvers. (C) 2000 Elsevier Science Inc. All rights reserved.
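The traditional dense algorithm referred to here scales each row/column pair by powers of the machine radix until its off-diagonal row and column norms roughly match, so the scaling introduces no rounding error. A dense toy sketch of that classical idea (in the Parlett-Reinsch style, without the permutation part of the transform):

```python
def balance(A, radix=2.0):
    """Diagonal balancing, dense toy sketch: for each i, scale row i by
    1/f and column i by f (f a power of the radix, hence exact in
    floating point) until the off-diagonal row and column 1-norms are
    roughly equal.  Returns the balanced matrix B = D^{-1} A D and the
    diagonal d of D; eigenvalues are preserved by the similarity."""
    n = len(A)
    A = [row[:] for row in A]
    d = [1.0] * n
    converged = False
    while not converged:
        converged = True
        for i in range(n):
            c = sum(abs(A[j][i]) for j in range(n) if j != i)  # column norm
            r = sum(abs(A[i][j]) for j in range(n) if j != i)  # row norm
            if c == 0.0 or r == 0.0:
                continue                 # cannot balance a zero row/column
            f = 1.0
            while c * radix <= r / radix:
                c *= radix; r /= radix; f *= radix
            while c / radix >= r * radix:
                c /= radix; r *= radix; f /= radix
            if f != 1.0:
                converged = False
                d[i] *= f
                for j in range(n):
                    A[i][j] /= f         # row i scaled by 1/f
                    A[j][i] *= f         # column i scaled by f
    return A, d
```

On a badly scaled matrix such as [[1, 1000], [0.001, 1]], a single sweep brings all entries to within a factor of about two of each other while leaving the spectrum unchanged.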