Truncated multilinear UTV decomposition (TMLUTVD) is an efficient method for extracting the most dominant features of a given tensor in various practical applications, such as tensor tracking. However, computing a TMLUTVD can be time-consuming, especially for large-scale data. Randomized methods are known for their ability to reduce computational costs, particularly for the low-rank approximation of large tensors. In this paper, we therefore develop randomized algorithms for computing the multilinear UTV decomposition. Specifically, we propose randomized versions of TMLUTVD that use randomized sampling schemes and the power method, extending existing randomized matrix techniques. They are more efficient than deterministic methods when applied to very large datasets, and we provide a detailed probabilistic error analysis of these algorithms. We further introduce two novel variants, each motivated by a distinct computational challenge in processing large-scale data. The first variant adaptively finds a low-rank representation that satisfies a given tolerance when the target rank is not known in advance. The second preserves the original tensor structure and is particularly effective for large-scale sparse tensors that are challenging to load into memory. Numerical results illustrate the efficiency and effectiveness of the proposed methods.
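The matrix-level building block that such tensor algorithms extend — a randomized range finder with power iterations — can be sketched as follows. This is a generic illustration, not the paper's TMLUTVD; the function name and parameters are ours.

```python
import numpy as np

def randomized_range_finder(A, k, p=5, q=1, seed=None):
    """Randomized range finder with q steps of the power method.
    A: (m, n) matrix, k: target rank, p: oversampling parameter."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    Omega = rng.standard_normal((n, k + p))   # Gaussian test matrix
    Q, _ = np.linalg.qr(A @ Omega)            # orthonormal basis for A @ Omega
    for _ in range(q):                        # power iterations sharpen the
        Q, _ = np.linalg.qr(A.T @ Q)          # basis when singular values
        Q, _ = np.linalg.qr(A @ Q)            # decay slowly
    return Q

# usage: rank-k approximation A ~ Q @ (Q.T @ A)
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 150))  # rank 10
Q = randomized_range_finder(A, k=10, seed=0)
err = np.linalg.norm(A - Q @ (Q.T @ A)) / np.linalg.norm(A)
```

With slight oversampling, a Gaussian sketch of an exactly rank-10 matrix captures its range almost surely, so the relative error above is at machine precision.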
This article treats AND-OR tree computation in terms of query complexity. We are interested in the cases where the assignments (inputs) or the algorithms are randomized. For the former case, it is known that there is a unique randomized assignment achieving the distributional complexity of balanced trees. The dual problem has the opposite outcome: the optimal randomized algorithms for balanced trees are not unique. We extend the latter study of randomized algorithms to weakly balanced trees and show that uniqueness still fails.
Traditional low-rank approximation is a powerful tool for compressing the large data matrices that arise in simulations of partial differential equations (PDEs), but it suffers from high computational cost and requires several passes over the PDE data. The compressed data may also lack interpretability, making it difficult to identify feature patterns in the original data. To address these issues, we present an online randomized algorithm that computes the interpolative decomposition (ID) of large-scale data matrices in situ. Compared to previous randomized IDs that used the QR decomposition to determine the column basis, we adopt a streaming ridge leverage score-based column subset selection algorithm that dynamically selects proper basis columns from the data and thus avoids an extra pass over the data to compute the coefficient matrix of the ID. In particular, we adopt a single-pass error estimator based on the non-adaptive Hutch++ algorithm to provide real-time error approximation for determining the best coefficients. As a result, our approach needs only a single pass over the original data and is therefore suitable for large, high-dimensional matrices stored out of core or generated in PDE simulations. We also present a strategy to improve the accuracy of the reconstructed data gradient, when desired, within the ID framework. We provide numerical experiments on turbulent channel flow and ignition simulations, and on the NSTX Gas Puff Image dataset, comparing our algorithm with the offline ID algorithm to demonstrate its utility in real-world applications.
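As a toy illustration of the ID structure (actual data columns as a basis, plus a coefficient matrix), the sketch below selects columns offline by exact rank-k leverage scores. The paper's contribution is doing this selection in a single streaming pass with ridge leverage scores, which this stand-in does not attempt; all names here are ours.

```python
import numpy as np

def toy_interpolative_decomposition(A, k):
    """Offline ID sketch: pick k basis columns by rank-k leverage scores,
    then get the coefficient matrix from one least-squares solve, so that
    A ~ A[:, cols] @ W.  (A stand-in for the paper's streaming ridge
    leverage score selection, which needs only one pass over A.)"""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    scores = np.sum(Vt[:k] ** 2, axis=0)       # rank-k column leverage scores
    cols = np.sort(np.argsort(scores)[-k:])    # k highest-scoring columns
    C = A[:, cols]
    W, *_ = np.linalg.lstsq(C, A, rcond=None)  # coefficients: A ~ C @ W
    return cols, C, W

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 8)) @ rng.standard_normal((8, 60))  # rank 8
cols, C, W = toy_interpolative_decomposition(A, k=8)
err = np.linalg.norm(A - C @ W) / np.linalg.norm(A)
```

Because the basis consists of columns of A itself, the factors stay interpretable — the property the abstract contrasts with generic compressed representations.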
In this paper, we propose novel quaternion matrix UTV (QUTV) and quaternion tensor UTV (QTUTV) decomposition methods, specifically designed for color image and video processing. We begin by defining both QUTV and QTUTV decompositions and provide detailed algorithmic descriptions. To enhance computational efficiency, we introduce randomized versions of these decompositions using random sampling from the quaternion normal distribution, which results in cost-effective and interpretable solutions. Extensive numerical experiments demonstrate that the proposed algorithms significantly improve computational efficiency while maintaining relative errors comparable to existing decomposition methods. These results underscore the strong potential of quaternion-based decompositions for real-world color image and video processing applications. Theoretical findings further support the robustness of the proposed methods, providing a solid foundation for their widespread use in practice.
We present a randomized approach for wait-free locks with strong bounds on time and fairness in a context in which any process can be arbitrarily delayed. Our approach supports a tryLock operation that is given a set of locks, and code to run when all the locks are acquired. A tryLock operation may fail if there is contention on the locks, in which case the code is not run. Given an upper bound kappa, known to the algorithm, on the point contention of any lock, and an upper bound L on the number of locks in a tryLock's set, a tryLock will succeed in acquiring its locks and running the code with probability at least 1/(kappa L). It is thus fair. Furthermore, if the maximum step complexity for the code in any lock is T, the operation will take O(kappa^2 L^2 T) steps, regardless of whether it succeeds or fails. The operations are independent; thus, if the tryLock is repeatedly retried on failure, it will succeed in O(kappa^3 L^3 T) expected steps.
Tensor decomposition algorithms are essential for extracting meaningful latent variables and uncovering hidden structures in real-world data tensors. Unlike conventional deterministic tensor decomposition algorithms, randomized methods offer higher efficiency by reducing memory requirements and computational complexity. This paper proposes an efficient hardware architecture for randomized tensor decomposition, implemented on a field-programmable gate array (FPGA) using high-level synthesis (HLS). The proposed architecture integrates random projection, power iteration, and subspace approximation via QR decomposition to achieve low-rank approximation of multidimensional datasets. It utilizes the capabilities of reconfigurable systems to accelerate tensor computation, and includes three central units implementing a three-stage algorithm: (1) a tensor-times-matrix chain (TTMc) unit, (2) a tensor unfolding unit, and (3) a QR decomposition unit. Experimental results demonstrate that our FPGA design achieves up to a 14.56x speedup over a well-implemented tensor decomposition using the Tensor Toolbox software library on an Intel i7-9700 CPU. For a large input tensor of size 512 x 512 x 512, the proposed design achieves a 5.55x speedup over an Nvidia Tesla T4 GPU. Furthermore, we apply our hardware-based high-order singular value decomposition (HOSVD) accelerator to two real applications: background subtraction in dynamic video datasets and data compression. In both applications, our design shows high efficiency in terms of accuracy and computational time.
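A software sketch of the three-stage pipeline the accelerator maps to hardware (random projection, power iteration, QR per mode, then a TTM chain for the core) might look like the following; this is our illustrative reference code, not the FPGA design itself.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def randomized_hosvd(T, ranks, p=2, q=1, seed=0):
    """Per mode: random projection of the unfolding, q power iterations,
    QR for an orthonormal factor; then a TTM chain forms the core."""
    rng = np.random.default_rng(seed)
    factors = []
    for mode, r in enumerate(ranks):
        A = unfold(T, mode)
        Q, _ = np.linalg.qr(A @ rng.standard_normal((A.shape[1], r + p)))
        for _ in range(q):
            Q, _ = np.linalg.qr(A.T @ Q)
            Q, _ = np.linalg.qr(A @ Q)
        factors.append(Q[:, :r])
    core = T
    for mode, U in enumerate(factors):   # TTM chain: contract with U^T
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, mode)), 0, mode)
    return core, factors

# usage: compress and reconstruct an exactly rank-(3,3,3) tensor
rng = np.random.default_rng(1)
G = rng.standard_normal((3, 3, 3))
Us = [rng.standard_normal((20, 3)) for _ in range(3)]
T = np.einsum('abc,ia,jb,kc->ijk', G, *Us)
core, factors = randomized_hosvd(T, (3, 3, 3))
recon = core
for mode, U in enumerate(factors):
    recon = np.moveaxis(np.tensordot(U, recon, axes=(1, mode)), 0, mode)
err = np.linalg.norm(T - recon) / np.linalg.norm(T)
```

The TTMc and unfolding steps here correspond directly to the first two hardware units; the QR calls correspond to the third.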
In this paper, we study online knapsack problems. The input is a sequence of items e(1), e(2), ..., e(n), each of which has a size and a value. Given the i-th item e(i), we either put e(i) into the knapsack or reject it. In the removable setting, when e(i) is put into the knapsack, some items in the knapsack may be removed at no cost if the sum of the size of e(i) and the total size in the current knapsack exceeds the capacity of the knapsack. Our goal is to maximize the profit, i.e., the sum of the values of the items in the final knapsack. We present a simple randomized 2-competitive algorithm for the unweighted non-removable case and show that it is best possible, where a knapsack problem is called unweighted if the value of each item equals its size. For the removable case, we propose a randomized 2-competitive algorithm, even though no deterministic algorithm with a constant competitive ratio exists. We also provide a lower bound of 1 + 1/e ≈ 1.368 on the competitive ratio. For the unweighted removable case, we propose a 10/7-competitive algorithm and provide a lower bound of 1.25 on the competitive ratio. (C) 2014 Elsevier B.V. All rights reserved.
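A standard device for building randomized online algorithms is to flip one coin up front and commit to one of two complementary deterministic policies, so the adversary cannot adapt to the outcome. The sketch below illustrates that pattern in the unweighted non-removable setting (capacity normalized to 1, value = size); it is our illustration of the framework, not the algorithm from the paper, and we make no competitive-ratio claim for it.

```python
import random

def greedy(items, cap=1.0):
    """Policy 1: accept every item that still fits."""
    load = 0.0
    for s in items:
        if load + s <= cap:
            load += s
    return load

def big_only(items, cap=1.0, thresh=0.5):
    """Policy 2: accept only items larger than half the capacity."""
    load = 0.0
    for s in items:
        if s > thresh and load + s <= cap:
            load += s
    return load

def one_coin_mix(items, rng):
    """Flip a single coin, then run one deterministic policy to the end."""
    return greedy(items) if rng.random() < 0.5 else big_only(items)

stream = [0.3, 0.6, 0.3, 0.4]      # unweighted: value of an item == its size
profit_g = greedy(stream)          # packs 0.3 then 0.6
profit_b = big_only(stream)        # packs only 0.6
```

Against any fixed item sequence, the expected profit of the mix is the average of the two deterministic profits, which is the basic fact such analyses start from.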
We prove lower bounds on the competitive ratio of randomized algorithms for several on-line scheduling problems. The main result is a bound of e/(e - 1) for the on-line problem of minimizing the sum of completion times of jobs that arrive over time at their release times and are to be processed on a single machine. This lower bound shows that a randomized algorithm designed by Chekuri et al. (Proceedings of the Eighth ACM-SIAM Symposium on Discrete Algorithms, 1997, 609-618) is a best possible randomized algorithm for this problem. (C) 2002 Elsevier Science B.V. All rights reserved.
Consider a set X of n items, among which a subset I ⊆ X consists of defective items. In group testing, a test is conducted on a subset of items Q ⊆ X. The result of this test is positive, yielding 1, if Q includes at least one defective item, that is, if Q ∩ I ≠ ∅; it is negative, yielding 0, if no defective items are present in Q. We introduce a novel method for deriving lower bounds for non-adaptive randomized group testing. For any given constant j, any non-adaptive randomized algorithm that, with probability at least 2/3, estimates the number of defective items |I| to within a constant factor requires at least Omega(log n / log log ... log n) tests, where the denominator is a j-fold iterated logarithm. Our result almost matches the upper bound of O(log n) and addresses the open problem posed by Damaschke and Sheikh Muhammad (Combinatorial Optimization and Applications - 4th International Conference, COCOA 2010, pp. 117-130, 2010; Discrete Math Alg Appl 2(3):291-312, 2010). Furthermore, it improves the previously established lower bound of Omega(log n / log log n) by Ron and Tsur (ACM Trans Comput Theory 8(4):15:1-15:19, 2016), and independently by Bshouty (30th International Symposium on Algorithms and Computation, ISAAC 2019, LIPIcs, vol 149, pp. 2:1-2:9, 2019). For estimation to within a non-constant factor alpha(n), we show that if a constant j exists such that alpha > log log ... log n (again a j-fold iterated logarithm), then any non-adaptive randomized algorithm that, with probability at least 2/3, estimates |I| to within a factor alpha requires at least Omega(log n / log alpha) tests. In this case, the lower bound is tight.
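For intuition on the O(log n) upper bound that this lower bound nearly matches, here is a rough non-adaptive sketch: test random subsets that include each item with probability 2^-i for each scale i up to log n, and report the largest scale at which a majority of tests still come back positive. The function, constants, and thresholds are ours, untuned, and no failure-probability guarantee is claimed.

```python
import random

def estimate_defectives(n, defectives, reps=40, seed=0):
    """Heuristic non-adaptive estimate of |I| to within a constant factor.
    Scale i probes 'is |I| at least about 2**i?': a random subset that
    includes each item w.p. 2**-i is positive with constant probability
    iff roughly that many defectives exist."""
    rng = random.Random(seed)
    d = set(defectives)
    est = 1
    i = 0
    while (1 << i) <= n:
        positives = sum(
            bool({x for x in range(n) if rng.random() < 2.0 ** -i} & d)
            for _ in range(reps)
        )
        if positives >= reps // 2:   # majority positive => |I| is ~ 2**i or more
            est = 1 << i
        i += 1
    return est

est = estimate_defectives(n=1024, defectives=range(60))   # true |I| = 60
```

All tests are fixed before any result is seen, which is exactly the non-adaptive setting the lower bound addresses.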
This survey provides an introduction to the use of randomization in the design of fast algorithms for numerical linear algebra. These algorithms typically examine only a subset of the input to solve basic problems approximately, including matrix multiplication, regression and low-rank approximation. The survey describes the key ideas and gives complete proofs of the main results in the field. A central unifying idea is sampling the columns (or rows) of a matrix according to their squared lengths.
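A minimal instance of the squared-length sampling idea — here applied to approximate matrix multiplication, with the usual unbiased rescaling — might look like the following toy code (our names, not from the survey):

```python
import numpy as np

def lengthsq_matmul(A, B, s, seed=0):
    """Estimate A @ B from s sampled column-row outer products, drawing
    column i of A (and row i of B) with probability proportional to
    ||A[:, i]||^2, and dividing by s * p_i to keep the estimate unbiased."""
    rng = np.random.default_rng(seed)
    p = np.sum(A * A, axis=0)
    p = p / p.sum()                              # squared-length distribution
    idx = rng.choice(A.shape[1], size=s, p=p)
    est = sum(np.outer(A[:, i], B[i, :]) / p[i] for i in idx) / s
    return est, p

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 200))
B = rng.standard_normal((200, 30))
est, p = lengthsq_matmul(A, B, s=500)
```

Sampling proportional to squared column lengths, rather than uniformly, is what gives the Frobenius-norm error guarantees the survey proves.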