The prevalence of convolution in applications within signal processing, deep neural networks, and numerical solvers has motivated the development of numerous fast convolution algorithms. In many of these problems, convolution is performed on terabytes or petabytes of data, so even constant factors of improvement can significantly reduce the computation time. We leverage the formalism of bilinear algorithms to describe and analyze all of the most popular approaches. This unified lens permits us to study the relationship between different variants of convolution as well as to derive error bounds and analyze the cost of the various algorithms. We provide new derivations, which predominantly leverage matrix and tensor algebra, to describe the Winograd family of convolution algorithms as well as reductions between 1D and multidimensional convolution. We provide cost and error bounds as well as experimental numerical studies. Our experiments for two of these algorithms, the overlap-add approach and the Winograd convolution algorithm with polynomials of degree greater than one, show that fast convolution algorithms can rival the accuracy of the fast Fourier transform without using complex arithmetic. These algorithms can be used for convolution problems with multidimensional inputs or for filters larger than size four, extending the state of the art in Winograd-based convolution algorithms.
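As a point of reference for the overlap-add approach mentioned above, the sketch below (an illustrative toy, not the paper's implementation) splits a long 1D convolution into short block convolutions computed directly, so no complex arithmetic is involved, and checks the result against numpy's reference convolution. The block size is an arbitrary assumption.

```python
import numpy as np

def overlap_add_conv(x, h, block=64):
    """Linear convolution of x with filter h via the overlap-add decomposition:
    split x into blocks, convolve each block with h (directly here), and add
    the overlapping tails into the output."""
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        seg = x[start:start + block]
        y[start:start + len(seg) + len(h) - 1] += np.convolve(seg, h)
    return y

rng = np.random.default_rng(0)
x, h = rng.standard_normal(1000), rng.standard_normal(17)
# difference from the direct convolution is at machine-precision level
print(np.max(np.abs(overlap_add_conv(x, h) - np.convolve(x, h))))
```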
We introduce a new theoretical framework for deriving lower bounds on data movement in bilinear algorithms. Bilinear algorithms are a general representation of fast algorithms for bilinear functions, which include computation of matrix multiplication, convolution, and symmetric tensor contractions. A bilinear algorithm is described by three matrices. Our communication lower bounds are based on quantifying the minimal matrix ranks of matching subsets of columns of these matrices. This infrastructure yields new lower bounds for symmetric tensor contraction algorithms, which provide new qualitative insights. Tensor symmetry (invariance under permutation of modes) is common to many applications of tensor computations (e.g., tensor representation of hypergraphs, analysis of high-order moments in data, as well as tensors modeling interactions of electrons in computational chemistry). Tensor symmetry enables reductions in representation size as well as contraction cost by factors that scale with the number of equivalent permutations. However, we derive lower bounds showing that these cost and memory reductions can necessitate increases in data movement by factors that scale with the size of the tensors.
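To make the three-matrix description concrete, here is a minimal sketch of how a bilinear algorithm given by matrices (A, B, C) evaluates a bilinear map; the transpose convention and the chosen example (Gauss's three-multiplication complex product) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# A bilinear algorithm (A, B, C): form the linear combinations A^T x and B^T y,
# multiply them elementwise, and recombine the products with C.  Here, Gauss's
# 3-multiplication algorithm for complex multiplication (a+bi)(c+di).
A = np.array([[1, 0, 1],
              [0, 1, 1]])          # left combinations:  a, b, a+b
B = np.array([[1, 0, 1],
              [0, 1, 1]])          # right combinations: c, d, c+d
C = np.array([[ 1, -1, 0],
              [-1, -1, 1]])        # recombination into (real, imaginary)

def bilinear(A, B, C, x, y):
    # three elementwise products => a bilinear algorithm of rank 3
    return C @ ((A.T @ x) * (B.T @ y))

x, y = np.array([2.0, 3.0]), np.array([5.0, -1.0])   # 2+3i and 5-i
print(bilinear(A, B, C, x, y))   # [13. 13.]  ==  (2+3i)(5-i) = 13+13i
```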
It is shown that the approximate bilinear complexity of multiplying a matrix of order 2 × 2 by a matrix of order 2 × 6 does not exceed 19. An approximate bilinear algorithm of complexity 19 is presented.
Algorithms are the art of efficient computation: it is by the power of algorithms that solving problems becomes feasible and that we may harness the power of computing machinery. Efficient algorithms translate directly into savings in resources such as time, storage space, and electricity, and thus money. With the end of the exponential increase in the computational power of hardware, the value of efficient algorithms may be greater than ever.
This thesis presents advancements in multiple fields of algorithms, related through the application of bilinear techniques. Bilinear maps, functions that map elements from a pair of vector spaces to a third vector space and are linear in each argument, are a ubiquitous and fundamental mathematical tool, the canonical example being matrix multiplication. We address both the applications that make use of bilinear maps and the computation of bilinear maps themselves, Boolean matrix multiplication in particular.
In the field of similarity search, we improve on Valiant's randomized algorithm [FOCS 2012; J. ACM 2015] for finding correlated vectors by (i) presenting an improved sampling scheme that enables faster processing by using fast matrix multiplication, and (ii) derandomizing Valiant's algorithm. These results are mostly of a theoretical nature, since they rely on fast matrix multiplication.
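A minimal sketch of the reduction that makes fast matrix multiplication relevant here: all pairwise inner products between two collections of vectors appear as entries of a single matrix product, so one product exposes the most correlated pair. The data and dimensions are made up, and Valiant-style algorithms add far more machinery on top of this.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.choice([-1.0, 1.0], size=(500, 64))   # first collection of sign vectors
Y = rng.choice([-1.0, 1.0], size=(500, 64))   # second collection

G = X @ Y.T                                   # all pairwise inner products at once
i, j = np.unravel_index(np.argmax(G), G.shape)
print(i, j, G[i, j] / X.shape[1])             # most correlated pair and its correlation
```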
We also present (iii) an adaptive prefix-assignment method for symmetry breaking. An instantiation of McKay's canonical extension framework [J. Algorithms 1998], the method produces a set of partial assignments to a prefix sequence of variables in a system of constraints, such that all generated assignments are pairwise nonisomorphic. The method breaks the symmetries completely with respect to the prefix sequence and can benefit from an auxiliary representation of symmetries in the form of a colored graph. We also provide an implementation that works as a preprocessor for Boolean satisfiability (SAT) solvers.
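The following toy sketch only illustrates the flavor of prefix-based symmetry breaking under an assumed symmetry group (the full symmetric group acting on the prefix variables); it is not McKay's canonical extension framework or the thesis's adaptive method.

```python
from itertools import product, permutations

# If the constraint system is invariant under all permutations of the prefix
# variables, it suffices to keep one representative assignment per orbit of
# the group acting on prefix assignments.
prefix_len = 3
group = list(permutations(range(prefix_len)))   # assumed symmetry group: S_3

def canonical(assignment):
    """Lexicographically least image of the assignment under the group."""
    return min(tuple(assignment[pi[i]] for i in range(prefix_len)) for pi in group)

reps = sorted({canonical(a) for a in product([0, 1], repeat=prefix_len)})
print(reps)   # 4 pairwise-nonisomorphic prefix assignments instead of all 8
```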
Let μ_{q²}(n, k) denote the minimum number of multiplications required to compute the coefficients of the product of two degree-(nk − 1) polynomials modulo the kth power of an irreducible polynomial of degree n over the q²-element field F_{q²}. It is shown that for all odd q and all n = 1, 2, ..., lim inf_{k→∞} μ_{q²}(n, k)/(kn) ≤ 2(1 + 1/(q − 2)). For the proof of this upper bound, we show that for an odd prime power q, all algebraic function fields in the Garcia–Stichtenoth tower over F_{q²} have places of all degrees, and we apply a Chudnovsky-like algorithm for multiplication of polynomials modulo a power of an irreducible polynomial.
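For concreteness, the object whose multiplications are being counted, the product of two polynomials of degree nk − 1 reduced modulo p(x)^k for an irreducible p of degree n, can be sketched with a plain schoolbook baseline as below. The small prime field, the irreducible polynomial, and k are arbitrary choices (a prime field stands in for F_{q²} for simplicity), and nothing here reflects the fast Chudnovsky-style construction of the paper.

```python
# Toy baseline: multiply two polynomials over F_q and reduce modulo p(x)^k.
# Coefficient lists are lowest-degree first; q is a small prime (assumption).
q = 5
p = [2, 1, 1]          # p(x) = x^2 + x + 2, irreducible over F_5 (no roots in F_5)
k = 3                  # reduce modulo p(x)^3
n = len(p) - 1         # degree of p

def poly_mul(a, b):
    """Schoolbook product of two coefficient lists over F_q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

def poly_mod(a, m):
    """Remainder of a modulo the monic polynomial m over F_q."""
    a = a[:]
    dm = len(m) - 1
    for i in range(len(a) - 1, dm - 1, -1):
        c = a[i] % q
        if c:
            for j in range(dm + 1):
                a[i - dm + j] = (a[i - dm + j] - c * m[j]) % q
    return a[:dm]

pk = [1]
for _ in range(k):
    pk = poly_mul(pk, p)               # the modulus p(x)^k, degree n*k

f = [1, 2, 3, 4, 0, 1]                 # two inputs of degree n*k - 1 = 5
g = [4, 0, 2, 1, 3, 2]
print(poly_mod(poly_mul(f, g), pk))    # coefficients of f*g mod p(x)^3
```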
First we study asymptotically fast algorithms for rectangular matrix multiplication. We begin with new algorithms for multiplication of an n × n matrix by an n × n² matrix in arithmetic time O(n^ω), ω = 3.333953..., which is less by 0.041 than the previous record 3.375477.... Then we present fast multiplication algorithms for matrix pairs of arbitrary dimensions, estimate the asymptotic running time as a function of the dimensions, and optimize the exponents of the complexity estimates. For a large class of input matrix pairs, we improve the known exponents. Finally we show three applications of our results: (a) we decrease from 2.851 to 2.837 the known exponent of the processor bounds for fast deterministic (NC) parallel evaluation of the determinant, the characteristic polynomial, and the inverse of an n × n matrix, as well as for the solution to a nonsingular linear system of n equations; (b) we asymptotically accelerate the known sequential algorithms for univariate polynomial composition mod x^n, yielding the complexity bound O(n^1.667) versus the old record of O(n^1.688), and for univariate polynomial factorization over a finite field; and (c) we improve slightly the known complexity estimates for computing basic solutions to the linear programming problem with n constraints and n variables. © 1998 Academic Press.
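As a baseline for the n × n by n × n² problem, the sketch below reduces it to n independent square products; with a square-multiplication exponent ω this costs O(n^{1+ω}), and the point of the paper is that dedicated rectangular algorithms do better (exponent 3.333953... above). The blocking scheme here is only an illustrative assumption.

```python
import numpy as np

def rect_mul_blocked(A, B):
    """Multiply an n x n matrix A by an n x n^2 matrix B, one n x n block at a time."""
    n = A.shape[0]
    C = np.empty((n, n * n))
    for j in range(n):                              # one square product per block
        C[:, j * n:(j + 1) * n] = A @ B[:, j * n:(j + 1) * n]
    return C

n = 8
A, B = np.random.rand(n, n), np.random.rand(n, n * n)
print(np.allclose(rect_mul_blocked(A, B), A @ B))   # True
```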
ISBN (print): 9781450345934
Strassen's algorithm (1969) was the first sub-cubic matrix multiplication algorithm. Winograd (1971) improved its complexity by a constant factor. Many asymptotic improvements followed. Unfortunately, most of them have done so at the cost of very large, often gigantic, hidden constants. Consequently, Strassen-Winograd's O(n^{log₂ 7}) algorithm often outperforms other matrix multiplication algorithms for all feasible matrix dimensions. The leading coefficient of Strassen-Winograd's algorithm was believed to be optimal for matrix multiplication algorithms with a 2 × 2 base case, due to a lower bound of Probert (1976). Surprisingly, we obtain a faster matrix multiplication algorithm, with the same base case size and asymptotic complexity as Strassen-Winograd's algorithm, but with the coefficient reduced from 6 to 5. To this end, we extend Bodrato's (2010) method for matrix squaring, and transform matrices to an alternative basis. We prove a generalization of Probert's lower bound that holds under change of basis, showing that for matrix multiplication algorithms with a 2 × 2 base case, the leading coefficient of our algorithm cannot be further reduced, and hence is optimal. We apply our technique to other Strassen-like algorithms, improving their arithmetic and communication costs by significant constant factors.
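For orientation, here is the textbook 7-multiplication Strassen recursion with a 2 × 2 base-case structure; it is the classic variant, not the alternative-basis algorithm with reduced leading coefficient described in the abstract, and the cutoff value and power-of-two sizes are arbitrary assumptions.

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Strassen's recursion for square matrices whose size is a power of two."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ B                      # fall back to a direct product
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22, cutoff)     # the 7 recursive products
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

A, B = np.random.rand(128, 128), np.random.rand(128, 128)
print(np.allclose(strassen(A, B), A @ B))   # True
```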