Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m x n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
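The two-stage scheme described in this abstract (random sampling to find a subspace, then deterministic factorization of the compressed matrix) can be sketched in a few lines of NumPy. The oversampling parameter p below is a common default and an assumption of this sketch, not a value prescribed by the abstract.

```python
import numpy as np

def randomized_svd(A, k, p=10, rng=None):
    """Rank-k truncated SVD via the randomized range finder.

    Stage A uses random sampling to identify a subspace capturing most of
    the action of A; stage B compresses A to that subspace and factors the
    small reduced matrix deterministically.
    """
    rng = np.random.default_rng(rng)
    m, n = A.shape
    # Stage A: multiply A by a random Gaussian test matrix and orthonormalize.
    Omega = rng.standard_normal((n, k + p))
    Q, _ = np.linalg.qr(A @ Omega)        # orthonormal basis for range(A @ Omega)
    # Stage B: form the small (k+p) x n reduced matrix and take its SVD.
    B = Q.T @ A
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :k], s[:k], Vt[:k, :]
```

For a matrix that is numerically low rank, A is recovered almost exactly; for general matrices the error is governed by the decay of the trailing singular values, as the paper's error analysis makes precise.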
Owing to its powerful representation ability, block term decomposition (BTD) has recently attracted much attention in multi-dimensional data processing, e.g., hyperspectral image unmixing and blind source separation. However, the popular alternating least squares algorithm for rank-(L, M, N) BTD (BTD-ALS) suffers expensive time and space costs from Kronecker products and from solving low-rank approximation subproblems, hindering the deployment of BTD in real applications, especially for large-scale data. In this paper, we propose a fast sketching-based Kronecker-product-free algorithm for rank-(L, M, N) BTD (termed KPF-BTD), which is suitable for real-world multi-dimensional data. Specifically, we first decompose the original optimization problem into several rank-(L, M, N) approximation subproblems, and then we design bilateral sketching to obtain approximate solutions of these subproblems instead of exact solutions, which allows us to avoid Kronecker products and rapidly solve the rank-(L, M, N) approximation subproblems. Compared with BTD-ALS, the time and space complexities O(2(p+1)(IL^3R + IL^2R^2 + ILR^3) + IL^3R) and O(I^3) of KPF-BTD are significantly cheaper than O(2IL^3R^6 + IL^3R^3 + IL^3R + 2IL^2R^3 + IL^2R^2) and O(IL^3R^3) of BTD-ALS, where p << I. Moreover, we establish a theoretical error bound for KPF-BTD. Extensive synthetic and real experiments show that KPF-BTD achieves substantial speedup and memory saving while maintaining accuracy (e.g., for a 150x150x150 synthetic tensor, the running time of 0.2 seconds per iteration of KPF-BTD is significantly faster than the 96.2 seconds per iteration of BTD-ALS, while their accuracies are comparable).
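The bilateral sketching idea named in this abstract can be illustrated on a single low-rank approximation subproblem: sketch the matrix from both sides, orthonormalize, and work with a small core. The abstract does not specify the paper's sketching operators, so Gaussian test matrices and the oversampling parameter p are assumptions of this sketch.

```python
import numpy as np

def bilateral_sketch_lowrank(A, r, p=5, rng=None):
    """Rank-r approximation of A via bilateral (two-sided) sketching.

    Left and right range sketches give orthonormal bases Q and P; the
    small (r+p) x (r+p) core Q.T @ A @ P then replaces the full matrix
    in the subproblem, so A ~ Q @ core @ P.T.
    """
    rng = np.random.default_rng(rng)
    m, n = A.shape
    Q, _ = np.linalg.qr(A @ rng.standard_normal((n, r + p)))    # left sketch
    P, _ = np.linalg.qr(A.T @ rng.standard_normal((m, r + p)))  # right sketch
    core = Q.T @ A @ P
    return Q, core, P
```

Replacing exact subproblem solves with such sketched ones is what removes the large dense factors (and hence the Kronecker products) from the inner loop.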
A subgraph induced by k vertices is called a k-induced subgraph. We prove that determining whether a digraph G contains an H-free k-induced subgraph is Omega(N^2)-evasive. Then we construct an epsilon-tester to test this property. (An epsilon-tester for a property Pi is guaranteed to distinguish, with probability at least 2/3, between the case of G satisfying Pi and the case of G being epsilon-far from satisfying Pi.) The query complexity of the epsilon-tester is independent of the size of the input digraph. An (epsilon, delta)-tester for a property Pi is an epsilon-tester for Pi that is furthermore guaranteed to accept, with probability at least 2/3, any input that is delta-close to satisfying Pi. This paper presents an (epsilon, delta)-tester for whether a digraph contains H-free k-induced subgraphs. (C) 2008 Elsevier B.V. All rights reserved.
This paper addresses the issues of conservativeness and computational complexity in robust control. A new probabilistic robust control method is proposed to design a high-performance controller. The key of the new method is that the uncertainty set is divided into two parts: the r-subset and its complementary set. The contributions of the new method are as follows: (i) a deterministic robust controller is designed for the r-subset, so it is less conservative than controllers designed by deterministic robust control methods for the full set; and (ii) the probabilistic robustness of the designed controller is evaluated only for the complementary set of the r-subset, not for the full set, so the computational complexity of the new method is reduced. Given an expected probabilistic robustness, a suitable probabilistic robust controller can be designed by adjusting the norm boundary of the r-subset. The effectiveness of the proposed method is verified by a simulation example.
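The probabilistic evaluation step in (ii) is, at its core, a Monte Carlo estimate over the complementary set of the r-subset. The toy plant, stability test, and uncertainty distribution below are illustrative assumptions, not the paper's setup.

```python
import random

def robustness_probability(is_stable, sample_uncertainty, n_samples=10_000, seed=0):
    """Monte Carlo estimate of probabilistic robustness: the fraction of
    sampled uncertainties for which the closed loop remains stable."""
    rng = random.Random(seed)
    hits = sum(is_stable(sample_uncertainty(rng)) for _ in range(n_samples))
    return hits / n_samples

# Toy example: closed-loop pole at -1 + delta, stable iff the pole is
# negative.  The deterministic controller already covers |delta| <= r
# (the r-subset), so delta is sampled only from the complementary set
# r <= |delta| <= 2.
r = 0.5

def sample_delta(rng):
    mag = rng.uniform(r, 2.0)
    return mag if rng.random() < 0.5 else -mag

p = robustness_probability(lambda d: -1.0 + d < 0.0, sample_delta)
```

Enlarging r shrinks the complementary set that must be sampled, which is exactly the accuracy/complexity trade-off the method exposes.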
We consider unconstrained randomized optimization of convex objective functions. We analyze the Random Pursuit algorithm, which iteratively computes an approximate solution to the optimization problem by repeated optimization over a randomly chosen one-dimensional subspace. This randomized method only uses zeroth-order information about the objective function and does not need any problem-specific parametrization. We prove convergence and give convergence rates for smooth objectives assuming that the one-dimensional optimization can be solved exactly or approximately by an oracle. A convenient property of Random Pursuit is its invariance under strictly monotone transformations of the objective function. It thus enjoys identical convergence behavior on a wider function class. To support the theoretical results we present extensive numerical performance results of Random Pursuit, two gradient-free algorithms recently proposed by Nesterov, and a classical adaptive step size random search scheme. We also present an accelerated heuristic version of the Random Pursuit algorithm which significantly improves standard Random Pursuit on all numerical benchmark problems. A general comparison of the experimental results reveals that (i) standard Random Pursuit is effective on strongly convex functions with moderate condition number and (ii) the accelerated scheme is comparable to Nesterov's fast gradient method and outperforms adaptive step size strategies.
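The Random Pursuit iteration described above is simple to state in code: draw a random direction, then call a one-dimensional oracle along it. This sketch uses a ternary-search line search as the approximate oracle (valid for convex objectives); the paper's analysis allows exact or approximate oracles, and the step bound t_max is an assumption here.

```python
import math
import random

def random_pursuit(f, x0, iters=200, t_max=10.0, tol=1e-9, seed=0):
    """Random Pursuit: repeatedly minimize f over a randomly chosen
    one-dimensional subspace, using only zeroth-order information."""
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    for _ in range(n and iters):
        # Uniformly random direction on the sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(c * c for c in u))
        u = [c / norm for c in u]
        # Approximate 1-D oracle: ternary search for argmin_t f(x + t*u),
        # which is valid because a convex f is unimodal along any line.
        phi = lambda t: f([xi + t * ui for xi, ui in zip(x, u)])
        lo, hi = -t_max, t_max
        while hi - lo > tol:
            m1 = lo + (hi - lo) / 3
            m2 = hi - (hi - lo) / 3
            if phi(m1) < phi(m2):
                hi = m2
            else:
                lo = m1
        t = (lo + hi) / 2
        x = [xi + t * ui for xi, ui in zip(x, u)]
    return x
```

Because the update depends on f only through comparisons along the line, the invariance under strictly monotone transformations of f is visible directly: replacing f by g(f) with g strictly increasing leaves every ternary-search decision unchanged.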
Under some weak conditions, the first-passage time of the Brownian motion to a continuous curved boundary is an almost surely finite stopping time. Its probability density function (pdf) is explicitly known only in a few particular cases. Several mathematical studies proposed to approximate the pdf in a quite general framework or even to simulate this hitting time using a discrete time approximation of the Brownian motion. The authors study a new algorithm which permits one to simulate the first-passage time using an iterating procedure. The convergence rate presented in this paper suggests that the method is very efficient.
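The discrete-time approximation mentioned above (the baseline the authors improve on, not their iterative algorithm) is easy to sketch: simulate the Brownian path on a grid and report the first grid time at which it reaches the boundary. The horizon T and step dt are illustrative assumptions.

```python
import math
import random

def first_passage_euler(boundary, T=10.0, dt=1e-3, seed=0):
    """Discrete-time approximation of the first-passage time of a
    standard Brownian motion to the curved boundary t -> boundary(t).
    Returns the first grid time with W(t) >= boundary(t), or None if
    no crossing is observed before the horizon T."""
    rng = random.Random(seed)
    w, t = 0.0, 0.0
    while t < T:
        if w >= boundary(t):
            return t
        w += math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment
        t += dt
    return None
```

This scheme systematically overestimates the hitting time (the path may cross between grid points), which is one reason more refined simulation methods such as the one studied here are of interest.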
We consider auctions of indivisible items to unit-demand bidders with budgets. This setting was suggested as an expressive model for single sponsored search auctions. Prior work presented mechanisms that compute bidder-optimal outcomes and are truthful for a restricted set of inputs, i.e., inputs in so-called general position. This condition is easily violated. We provide the first mechanism that is truthful in expectation for all inputs and achieves for each bidder no worse utility than the bidder-optimal outcome. Additionally we give a complete characterization for which inputs mechanisms that compute bidder-optimal outcomes are truthful. (C) 2015 Elsevier B.V. All rights reserved.
Modern software systems often consist of many different components, each with a number of options. Although unit tests may reveal faulty options for individual components, functionally correct components may interact in unforeseen ways to cause a fault. Covering arrays are used to test for interactions among components systematically. A two-stage framework, providing a number of concrete algorithms, is developed for the efficient construction of covering arrays. In the first stage, a time and memory efficient randomized algorithm covers most of the interactions. In the second stage, a more sophisticated search covers the remainder in relatively few tests. In this way, the storage limitations of the sophisticated search algorithms are avoided; hence, the range of the number of components for which the algorithm can be applied is extended, without increasing the number of tests. Many of the framework instantiations can be tuned to optimize a memory-quality trade-off, so that fewer tests can be achieved using more memory.
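The two-stage idea can be shown on the smallest interesting case, strength-2 (pairwise) coverage. This toy instantiation uses plain random tests for stage one and a naive patch-up for stage two; the paper's concrete algorithms are considerably more sophisticated, and the stage-one budget n_random is an assumption here.

```python
import itertools
import random

def two_stage_pairwise(k, v, n_random=30, seed=0):
    """Build a pairwise covering array for k factors with v levels each:
    a randomized first stage covers most pairs cheaply, then a simple
    second stage covers the remainder."""
    rng = random.Random(seed)
    # Every (factor i, factor j, level a, level b) interaction to cover.
    needed = {(i, j, a, b)
              for i, j in itertools.combinations(range(k), 2)
              for a in range(v) for b in range(v)}

    def covered_by(test):
        return {(i, j, test[i], test[j])
                for i, j in itertools.combinations(range(k), 2)}

    tests = []
    # Stage 1: cheap random tests knock out the bulk of the interactions.
    for _ in range(n_random):
        t = tuple(rng.randrange(v) for _ in range(k))
        tests.append(t)
        needed -= covered_by(t)
    # Stage 2: patch each remaining pair explicitly, filling the other
    # factors at random so a patch test may cover several leftovers.
    while needed:
        i, j, a, b = next(iter(needed))
        t = [rng.randrange(v) for _ in range(k)]
        t[i], t[j] = a, b
        t = tuple(t)
        tests.append(t)
        needed -= covered_by(t)
    return tests
```

Random tests cover pairs quickly at first but take ever longer to hit the last few, which is exactly why handing the remainder to a dedicated second stage saves tests.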
We consider the Scenario Convex Program (SCP) for two classes of optimization problems that are not tractable in general: Robust Convex Programs (RCPs) and Chance-Constrained Programs (CCPs). We establish a probabilistic bridge from the optimal value of SCP to the optimal values of RCP and CCP in which the uncertainty takes values in a general, possibly infinite dimensional, metric space. We then extend our results to a certain class of non-convex problems that includes, for example, binary decision variables. In the process, we also settle a measurability issue for a general class of scenario programs, which to date has been addressed by an assumption. Finally, we demonstrate the applicability of our results on a benchmark problem and a problem in fault detection and isolation.
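The scenario approach can be illustrated on a deliberately tiny instance: a one-dimensional problem where the robust constraint "for all delta" is replaced by N sampled scenarios. The objective, constraint function g, and uncertainty distribution below are illustrative assumptions, not the paper's general infinite-dimensional setting.

```python
import random

def scenario_program_1d(g, sample_delta, n_scenarios=100, seed=0):
    """Toy one-dimensional scenario convex program:
        minimize x  subject to  x >= g(delta_i)  for i = 1..N,
    whose optimum is simply the largest sampled constraint value.  It
    approximates the robust program min x s.t. x >= g(delta) for all
    delta, with a probabilistic guarantee on the violation of unseen
    scenarios that improves as N grows."""
    rng = random.Random(seed)
    return max(g(sample_delta(rng)) for _ in range(n_scenarios))

# Example: g(delta) = delta^2 with delta uniform on [-1, 1]; the robust
# optimal value is 1, and the scenario optimum approaches it from below.
x_star = scenario_program_1d(lambda d: d * d, lambda rng: rng.uniform(-1, 1))
```

The gap between the scenario optimum and the robust optimum, and the probability mass of constraints it violates, are precisely the quantities the probabilistic bridge in the paper controls.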
Radio frequency identification (RFID) technology has rich applications in cyber-physical systems, such as warehouse management and supply chain control. Often in practice, tags are attached to objects belonging to different groups, which may be different product types/manufacturers in a warehouse or different book categories in a library. As RFID technology evolves from single-group to multiple-group systems, there arise several interesting problems. One of them is to identify the popular groups, whose numbers of tags are above a pre-defined threshold. Another is to estimate arbitrary moments of the group size distribution, such as sum, variance, and entropy for the sizes of all groups. In this paper, we consider a new problem which is to estimate all these statistical metrics simultaneously in a time-efficient manner without collecting any tag IDs. We solve this problem by a protocol named generic moment estimator (GME), which allows the tradeoff between estimation accuracy and time cost. According to the results of our theoretical analysis and simulation studies, this GME protocol is several times or even orders of magnitude more efficient than a baseline protocol that takes a random sample of tag groups to estimate each group size.