The intersection of sorted arrays problem has applications in search engines such as Google. Previous work has proposed and compared deterministic algorithms for this problem, in an adaptive analysis based on the encoding size of a certificate of the result (cost analysis). We define the alternation analysis, based on the nondeterministic complexity of an instance. In this analysis we prove that there is a deterministic algorithm asymptotically performing as well as any randomized algorithm in the comparison model. We define the redundancy analysis, based on a measure of the internal redundancy of the instance. In this analysis we prove that any algorithm optimal in the redundancy analysis is optimal in the alternation analysis, but that there is a randomized algorithm which performs strictly better than any deterministic algorithm in the comparison model. Finally, we describe how these results can be extended beyond the comparison model.
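The abstract does not fix a particular algorithm, so as a concrete point of reference, here is a minimal sketch of one standard adaptive intersection strategy for two sorted arrays in the comparison model, based on doubling (galloping) search. The function names and the restriction to two arrays are assumptions for illustration; this is not one of the specific algorithms analyzed in the paper.

import bisect

def gallop_search(arr, target, start):
    """Leftmost index i >= start with arr[i] >= target: double the step until
    the target is bracketed, then finish with binary search."""
    offset = 1
    while start + offset < len(arr) and arr[start + offset] < target:
        offset *= 2
    lo = start + offset // 2                 # last probe known to be < target
    hi = min(start + offset, len(arr))
    return bisect.bisect_left(arr, target, lo, hi)

def intersect_sorted(a, b):
    """Adaptive intersection of two sorted arrays: the work depends on how
    interleaved the inputs are, not only on their total length."""
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i = gallop_search(a, b[j], i)
        else:
            j = gallop_search(b, a[i], j)
    return result

print(intersect_sorted([1, 3, 7, 9, 12], [2, 3, 9, 10, 12]))   # [3, 9, 12]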
In this paper, we propose an O(N log N) hierarchical random compression method (HRCM) for kernel matrix compression, which only requires sampling O(N log N) entries of the matrix. The HRCM combines the hierarchical framework of the H-matrix with a randomized sampling technique for the column and row spaces of far-field interaction kernel matrices. We show that uniform column/row sampling of a far-field kernel matrix, thus without the need and associated cost to pre-compute a sampling distribution, gives a low-rank compression of such a low-rank matrix, independent of the matrix size and dependent only on the separation of the source and target locations. This far-field random compression technique is then applied at each level of the hierarchical decomposition for general kernel matrices, resulting in an O(N log N) random compression method. Error and complexity analyses for the HRCM are included. Numerical results for electrostatic and low-frequency Helmholtz wave kernels validate the efficiency and accuracy of the proposed method in comparison with direct O(N^2) summation.
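The key claim above is that, for a well-separated (far-field) block, uniformly sampled rows and columns already capture the block's numerical range. Below is a minimal sketch of that idea as a CUR-style skeleton built from uniform samples; the 1/r kernel, the cluster geometry, the rank k, and the oversampling parameter are illustrative assumptions, and this is not the HRCM hierarchy itself.

import numpy as np

def far_field_kernel(x, y):
    """1/r kernel between target points x and well-separated source points y."""
    d = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    return 1.0 / d

def cur_from_uniform_sampling(A, k, oversample=10, rng=None):
    """CUR-style skeleton of A from uniformly sampled rows and columns.
    For a far-field (numerically low-rank) block, uniform sampling suffices."""
    rng = np.random.default_rng(rng)
    m, n = A.shape
    rows = rng.choice(m, size=min(m, k + oversample), replace=False)
    cols = rng.choice(n, size=min(n, k + oversample), replace=False)
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(A[np.ix_(rows, cols)])   # small core, cheap to invert
    return C, U, R                              # A is approximately C @ U @ R

# two well-separated point clusters, so the interaction block is low-rank
rng = np.random.default_rng(0)
targets = rng.random((500, 3))
sources = rng.random((500, 3)) + np.array([10.0, 0.0, 0.0])
A = far_field_kernel(targets, sources)
C, U, R = cur_from_uniform_sampling(A, k=12)
print(np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A))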
Kernel functions play a pivotal role in a wide range of scientific computing and machine learning problems, but they often result in dense kernel matrices that impose great computational costs at large scale. To address this issue, we develop a set of fast kernel matrix compression algorithms that reduce the cost of matrix operations in the related applications. The foundation of these algorithms is polyharmonic spline interpolation, which encompasses a set of radial basis functions that allow flexible choices of interpolating nodes, and a set of polynomial basis functions that guarantee the solvability and convergence of the interpolation. With these properties, the original data points in the interacting kernel function can be sampled randomly with great flexibility, so the proposed method is suitable for complicated data structures, such as high dimensionality, random distribution, or manifold data. To further boost accuracy and efficiency, our scheme incorporates a QR sampling strategy and combines it with a recently developed fast stochastic SVD to form a hybrid method. If the overall number of degrees of freedom is N, the compression algorithm has O(N) complexity for low-rank matrices and O(N log N) complexity for general matrices with a hierarchical structure. Numerical results for data on various domains and with different kernel functions validate the accuracy and efficiency of the proposed method.
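The "fast stochastic SVD" step is not spelled out in the abstract; as a point of reference, here is a standard randomized SVD (Gaussian range sketching plus power iterations). The rank k, oversampling, and iteration count are assumed parameters, and the paper's hybrid QR-sampling variant is not reproduced here.

import numpy as np

def randomized_svd(A, k, oversample=10, n_iter=2, rng=None):
    """Standard randomized SVD: sketch the range of A with a Gaussian test
    matrix, sharpen it with a few power iterations, then compute an exact SVD
    of the small projected matrix."""
    rng = np.random.default_rng(rng)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))
    Y = A @ Omega
    for _ in range(n_iter):               # power iterations improve accuracy
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                # orthonormal basis for the range
    B = Q.T @ A                           # small (k + oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :k], s[:k], Vt[:k, :]

# quick check on a synthetic low-rank matrix
rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 40)) @ rng.standard_normal((40, 800))
U, s, Vt = randomized_svd(A, k=40)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))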
Algorithms on multivariate polynomials represented by straight-line programs are developed. First, it is shown that most algebraic algorithms can be applied probabilistically to data that are given by a straight-line computation. Testing such rational numeric data for zero, for instance, is facilitated by random evaluations modulo random prime numbers. Then, auxiliary algorithms that determine the coefficients of a multivariate polynomial in a single variable are constructed. The first main result is an algorithm that produces the greatest common divisor of the input polynomials, all in straight-line representation. The second result shows how to find a straight-line program for the reduced numerator and denominator from one for the corresponding rational function. Both the algorithm for that construction and the greatest common divisor algorithm run in random polynomial time for the usual coefficient fields and output a straight-line program which, with controllably high probability, correctly determines the requested answer. The running times are polynomial functions of the binary input size, the input degrees as unary numbers, and the logarithm of the inverse of the failure probability. The algorithm for straight-line programs for the numerators and denominators of rational functions implies that every degree-bounded rational function can be computed fast in parallel, that is, in polynomial size and polylogarithmic depth.
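The zero-testing idea mentioned above, random evaluation modulo random primes in the spirit of Schwartz-Zippel, can be sketched on a toy straight-line program representation. The instruction encoding, the fixed choice of two Mersenne primes, and the trial count below are illustrative assumptions, not the paper's construction.

import random

# A straight-line program: each step is ('input', k, None) for the k-th input
# variable, or (op, i, j) combining the results of earlier steps i and j.
def eval_slp(slp, inputs, p):
    """Evaluate a straight-line program on the given inputs modulo prime p."""
    vals = []
    for op, a, b in slp:
        if op == 'input':
            vals.append(inputs[a] % p)
        elif op == '+':
            vals.append((vals[a] + vals[b]) % p)
        elif op == '-':
            vals.append((vals[a] - vals[b]) % p)
        elif op == '*':
            vals.append((vals[a] * vals[b]) % p)
    return vals[-1]

def probably_zero(slp, n_vars, trials=20, primes=(2**31 - 1, 2**61 - 1)):
    """Schwartz-Zippel style test: declare the represented polynomial zero only
    if it vanishes at random points modulo randomly chosen primes."""
    for _ in range(trials):
        p = random.choice(primes)
        point = [random.randrange(p) for _ in range(n_vars)]
        if eval_slp(slp, point, p) != 0:
            return False            # a nonzero evaluation certifies nonzero
    return True                     # zero with high probability

# (x + y)(x - y) - (x^2 - y^2) is identically zero
slp = [('input', 0, None),  # v0 = x
       ('input', 1, None),  # v1 = y
       ('+', 0, 1),         # v2 = x + y
       ('-', 0, 1),         # v3 = x - y
       ('*', 2, 3),         # v4 = (x + y)(x - y)
       ('*', 0, 0),         # v5 = x^2
       ('*', 1, 1),         # v6 = y^2
       ('-', 5, 6),         # v7 = x^2 - y^2
       ('-', 4, 7)]         # v8 = v4 - v7
print(probably_zero(slp, n_vars=2))   # True with high probability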
The generation of good pseudo-random numbers underlies many important areas of scientific computing, such as randomized algorithms and the numerical solution of stochastic differential equations. In this paper, a class of random number generators (RNGs) based on Weyl sequences is proposed. The uniformity of these RNGs is proved theoretically. Statistical and numerical computations show the efficiency of the methods.
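For concreteness, a Weyl sequence takes x_k = frac(k * alpha) for an irrational alpha, which Weyl's equidistribution theorem guarantees is uniform on [0, 1). The sketch below is a minimal floating-point version of that recurrence; the particular alpha and the incremental update are assumptions for illustration, not the specific generators constructed in the paper.

import math

def weyl_sequence(alpha=math.sqrt(2) - 1, n=10):
    """Minimal Weyl sequence: x_k = frac(k * alpha) with alpha irrational is
    equidistributed on [0, 1). Computed incrementally; note that floating-point
    rounding slowly drifts away from the exact recurrence."""
    x = 0.0
    for _ in range(n):
        x = (x + alpha) % 1.0
        yield x

print(list(weyl_sequence(n=5)))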
Non-negative Tucker decomposition (NTD) and its graph regularized extensions are among the most popular techniques for representing high-dimensional non-negative data, which, from a geometric perspective, typically lie on a low-dimensional sub-manifold of the ambient space. The performance of graph-based NTD methods therefore relies heavily on the low-dimensional representation of the original data. However, most existing approaches treat the last factor matrix in NTD as the low-dimensional representation of the original data, a treatment that loses the original data's multi-linear structure in the low-dimensional subspace. To remedy this defect, we propose a novel graph regularized Lp smooth NTD (GSNTD) method for high-dimensional data representation by incorporating graph regularization and an Lp smoothing constraint into NTD. The new graph regularization term is constructed from the product of the core tensor and the last factor matrix in NTD, and it is used to uncover hidden semantics while maintaining the intrinsic multi-linear geometric structure of the data. The addition of the Lp smoothing constraint to NTD may produce a more accurate and smoother solution to the optimization problem. Update rules for GSNTD and a convergence analysis are given. In addition, a randomized variant of the GSNTD algorithm based on fiber sampling is proposed. Finally, experimental results on four standard image databases show that the proposed method and its randomized variant outperform several state-of-the-art graph-based regularization methods for image clustering.
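The randomized variant rests on sampling tensor fibers. Below is a minimal sketch of uniform mode-wise fiber sampling for a 3-way array; the function name, the uniform sampling distribution, and the restriction to third-order tensors are assumptions for illustration, not the GSNTD update rules themselves.

import numpy as np

def sample_mode_fibers(X, mode, n_fibers, rng=None):
    """Uniformly sample mode-`mode` fibers of a 3-way tensor X: each fiber is
    obtained by fixing random indices in the other two modes."""
    rng = np.random.default_rng(rng)
    dims = [d for i, d in enumerate(X.shape) if i != mode]
    idx = [rng.integers(d, size=n_fibers) for d in dims]
    if mode == 0:
        fibers = X[:, idx[0], idx[1]]
    elif mode == 1:
        fibers = X[idx[0], :, idx[1]].T
    else:
        fibers = X[idx[0], idx[1], :].T
    return fibers                      # shape (X.shape[mode], n_fibers)

X = np.random.default_rng(0).random((30, 40, 50))
F = sample_mode_fibers(X, mode=1, n_fibers=8)
print(F.shape)                         # (40, 8)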
Reducing the number of phase shifters by grouping antenna elements into subarrays has been studied extensively for decades. The number of phase shifters directly affects the cost, complexity, and power consumption of the system. A novel method for the design of phased planar antenna arrays is presented in this work that reduces the number of phase shifters used by the array by up to 70% while maintaining the desired radiation characteristics. The method creates fusions of subarrays to generate random sequences that form the best feeding network configuration for planar phased arrays. The obtained solution allows scanning the main lobe at an elevation of theta = -40 degrees, with a scanning range of [-75 degrees
Parallel disks promise to be a cost-effective means of achieving high bandwidth in applications involving massive data sets, but algorithms for parallel disks can be difficult to devise. To combat this problem, we define a useful and natural duality between writing to parallel disks and the seemingly more difficult problem of prefetching. We first explore this duality for applications involving read-once accesses using parallel disks. We obtain a simple linear-time algorithm for computing optimal prefetch schedules and analyze the efficiency of the resulting schedules for randomly placed data and for arbitrary interleaved accesses to striped sequences. Duality also provides an optimal schedule for prefetching plus caching, where blocks can be accessed multiple times. Another application of this duality gives us the first parallel disk sorting algorithms that are provably optimal up to lower-order terms. One of these algorithms is a simple and practical variant of multiway mergesort, addressing a question that had been open for some time.
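To make the write/prefetch duality concrete, here is a toy simulation of greedy queued writing to parallel disks: blocks arrive in order, wait in a bounded buffer, and each I/O step writes at most one buffered block per disk. The duality described above says a prefetch schedule can be read off from the reversed access sequence; the function name, the specific greedy policy, and the block and buffer sizes are assumptions for illustration, and this sketch does not reproduce the paper's optimality argument.

from collections import deque, defaultdict

def greedy_write_steps(blocks, num_disks, buffer_size):
    """Count I/O steps for one simple greedy queued-writing policy.
    `blocks` lists the disk id of each block in arrival order; the buffer holds
    at most `buffer_size` blocks, and each disk writes at most one block per step."""
    pending = deque(blocks)
    queued = defaultdict(deque)      # disk id -> blocks waiting to be written
    in_buffer, steps = 0, 0
    while pending or in_buffer:
        # admit arriving blocks while the write buffer has room
        while pending and in_buffer < buffer_size:
            disk = pending.popleft()
            queued[disk].append(disk)
            in_buffer += 1
        # one parallel I/O step: every disk with a queued block writes one
        for disk in range(num_disks):
            if queued[disk]:
                queued[disk].popleft()
                in_buffer -= 1
        steps += 1
    return steps

# 12 blocks striped over 3 disks with a 4-block write buffer (hypothetical sizes)
print(greedy_write_steps([0, 1, 2, 0, 0, 1, 2, 2, 1, 0, 1, 2], num_disks=3, buffer_size=4))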
MapReduce is a programming paradigm for large-scale distributed information processing. This paper proposes a MapReduce algorithm for the minimum vertex cover problem, which is known to be NP-hard. The algorithm efficiently obtains a minimal vertex cover in a small number of rounds. We show its effectiveness through experimental evaluation and comparison with exact and approximation algorithms, which demonstrates that it produces high-quality covers in a small number of MapReduce rounds. We also confirm experimentally that the algorithm scales well, maintaining high-quality solutions within restricted computation times as the graph size increases. Moreover, we extend the algorithm to a randomized one and obtain a good expected approximation ratio.
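As a point of comparison rather than the paper's algorithm, below is a sequential sketch of a round-based randomized 2-approximation for vertex cover: each round selects a maximal matching via random edge priorities and adds both endpoints to the cover. Each round maps naturally onto a map phase (emit an edge's priority to both endpoints) and a reduce phase (keep the minimum per vertex), but the edge list, the priorities, and the round structure here are illustrative assumptions.

import random

def randomized_vertex_cover(edges, seed=0):
    """Round-based randomized 2-approximate vertex cover: in each round, every
    remaining edge draws a random priority, edges that are the minimum at both
    endpoints join a matching, and both their endpoints enter the cover."""
    random.seed(seed)
    cover = set()
    remaining = list(edges)
    while remaining:
        prio = {e: random.random() for e in remaining}
        # for each vertex, find the incident edge with the smallest priority
        best = {}
        for e in remaining:
            for v in e:
                if v not in best or prio[e] < prio[best[v]]:
                    best[v] = e
        selected = {e for e in remaining if best[e[0]] == e and best[e[1]] == e}
        for u, v in selected:
            cover.update((u, v))
        remaining = [e for e in remaining
                     if e[0] not in cover and e[1] not in cover]
    return cover

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
print(randomized_vertex_cover(edges))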
We describe an algorithm for the application of the forward and inverse spherical harmonic transforms. It is based on a new method for rapidly computing the forward and inverse associated Legendre transforms by hierarchically applying the interpolative decomposition butterfly factorization. Experimental evidence suggests that the complexity of our method, including all necessary precomputations, is O(N^2 log^3 N) in terms of both flops and memory, where N is the order of the transform. This is nearly asymptotically optimal. Moreover, unlike existing algorithms which are asymptotically optimal or nearly so, the constants in the running time and memory costs of our algorithm are small enough to make it competitive with state-of-the-art O(N^3) methods at relatively small values of N (e.g., N = 1024). Numerical results are provided to demonstrate the effectiveness and numerical stability of the new framework.
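The building block named above, the interpolative decomposition, expresses a numerically low-rank matrix through a subset of its own columns. Below is a minimal rank-k sketch via column-pivoted QR (assuming SciPy is available); the test matrix, the rank, and the function name are illustrative assumptions, and the butterfly factorization's recursive application to blocks of the transform matrix is not reproduced here.

import numpy as np
from scipy.linalg import qr

def interpolative_decomposition(A, k):
    """Rank-k interpolative decomposition via column-pivoted QR: choose k
    'skeleton' columns of A and an interpolation matrix T so that
    A is approximately A[:, cols] @ T."""
    Q, R, piv = qr(A, mode='economic', pivoting=True)
    R11, R12 = R[:k, :k], R[:k, k:]
    T = np.zeros((k, A.shape[1]))
    T[:, piv[:k]] = np.eye(k)                   # skeleton columns reproduce themselves
    T[:, piv[k:]] = np.linalg.solve(R11, R12)   # interpolation coefficients
    return piv[:k], T

# quick check on a numerically low-rank matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 20)) @ rng.standard_normal((20, 200))
cols, T = interpolative_decomposition(A, 20)
print(np.linalg.norm(A - A[:, cols] @ T) / np.linalg.norm(A))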