We present a randomized algorithm for the approximate nearest neighbor problem in d-dimensional Euclidean space. Given N points {x(j)} in R^d, the algorithm attempts to find k nearest neighbors for each x(j), where k is a user-specified integer parameter. The algorithm is iterative, and its CPU time requirements are proportional to T·N·(d·log d + k·(d + log k)·log N) + N·k^2·(d + log k), with T the number of iterations performed. The memory requirements of the procedure are of the order N·(d + k). A byproduct of the scheme is a data structure permitting a rapid search for the k nearest neighbors among {x(j)} for an arbitrary point x in R^d. The cost of each such query is proportional to T·(d·log d + log(N/k)·k·(d + log k)), and the memory requirements for the requisite data structure are of the order N·(d + k) + T·(d + N). The algorithm utilizes random rotations and a basic divide-and-conquer scheme, followed by a local graph search. We analyze the scheme's behavior for normally distributed points {x(j)}, and illustrate its performance via several numerical examples. (C) 2012 Elsevier Inc. All rights reserved.
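The combination of random rotations with a one-dimensional candidate search described in this abstract can be sketched as follows. This is a simplified illustration under our own assumptions (the function name, the fixed candidate window, and the omission of the local graph search are our choices, not the paper's):

```python
import numpy as np

def approx_knn(X, k, T=10, window=None, rng=None):
    """Approximate k nearest neighbors for every row of X (N x d).

    Each of T iterations applies a random rotation, sorts the points
    along the first rotated coordinate, and considers only the points
    adjacent in that order as candidate neighbors.
    """
    rng = np.random.default_rng(rng)
    N, d = X.shape
    if window is None:
        window = max(2 * k, 8)
    best = {i: {} for i in range(N)}  # i -> {candidate j: distance}
    for _ in range(T):
        # random orthogonal matrix via QR of a Gaussian matrix
        Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
        order = np.argsort(X @ Q[:, 0])
        for pos, i in enumerate(order):
            for j in order[max(0, pos - window):pos + window + 1]:
                if j == i:
                    continue
                best[i][j] = np.linalg.norm(X[i] - X[j])
    # keep the k closest candidates found for each point
    return {i: sorted(c, key=c.get)[:k] for i, c in best.items()}
```

Points that are close in Euclidean distance tend to land close in the sorted one-dimensional order for at least one of the T rotations, which is what makes the windowed candidate search effective.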
This paper proposes a randomized algorithm for checking feasibility of uncertain LMIs. The algorithm is based on the solution of a sequence of semidefinite optimization problems involving a reduced number of constraints. A bound on the maximum number of iterations required by the algorithm is given. Finally, the performance and behaviour of the algorithm are illustrated by means of a numerical example.
Autonomic systems exhibit self-managing behavior using various algorithms. Case-based reasoning is one of the techniques that enable the autonomic manager to learn from past experience. The case-base is partitioned into clusters in order to improve retrieval efficiency. Deciding on an appropriate number of clusters for a case-base is not a trivial problem. This paper proposes a randomized algorithm for determining the number of clusters to be formed from the case-base. Subsequently, a binary search-based case retrieval strategy is applied to ensure enhanced retrieval time performance. The paper presents two versions of the randomized algorithm. The first version guarantees success, but its computational cost is a random variable; the other guarantees a deterministic computational cost, but success is not guaranteed. The performance of the proposed algorithms is reported on a simulated case study of the Autonomic Forest Fire Application.
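The trade-off between the two versions described in this abstract — guaranteed success with random running time versus fixed cost with possible failure — is the classic Las Vegas / Monte Carlo distinction, which can be sketched generically (the predicate and candidate set below are hypothetical placeholders, not the paper's actual clustering criterion):

```python
import random

def las_vegas(pred, candidates, rng=random):
    """Guaranteed to succeed; the running time is a random variable."""
    while True:
        k = rng.choice(candidates)
        if pred(k):
            return k

def monte_carlo(pred, candidates, trials, rng=random):
    """Deterministic cost (at most `trials` probes); may fail, returning None."""
    for _ in range(trials):
        k = rng.choice(candidates)
        if pred(k):
            return k
    return None
```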
Several randomized algorithms make use of convolution to estimate the score vector of matches between a text string of length N and a pattern string of length M, i.e., the vector obtained when the pattern is slid along the text and the number of matches is counted for each position. These algorithms run in deterministic time O(kN log M), and find an unbiased estimator of the scores whose variance is (M - c)^2/k, where c is the actual score; here k is an adjustable parameter that provides a tradeoff between computation time and lower variance. This paper provides an algorithm that also runs in deterministic time O(kN log M) but achieves a lower variance of min(M/k, M - c)(M - c)/k. For all score values c that are less than M - (M/k), our variance is essentially a factor of k smaller than in previous work, and for M - (M/k) < c <= M it matches the previous work's variance. As in the previous work, our estimator is unbiased, and we make no assumption about the probabilistic characteristics of the input or about the size of the alphabet, and our solution extends to string matching with classes, class complements, "never match" and "always match" symbols, to the weighted case, and to higher dimensions. (C) 2013 Elsevier B.V. All rights reserved.
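A minimal sketch of the general randomized-coding idea — not the paper's exact estimator — assigns each alphabet symbol an independent random sign, so a matching position contributes exactly +1 while a mismatch contributes 0 in expectation; averaging k independent codings reduces the variance. (The direct O(NM) correlation below would be replaced by FFT-based convolution to get the O(kN log M) running time.)

```python
import numpy as np

def match_scores_estimate(text, pattern, k=8, rng=None):
    """Unbiased estimate of the match-count vector via k random +/-1 codings."""
    rng = np.random.default_rng(rng)
    N, M = len(text), len(pattern)
    alphabet = sorted(set(text) | set(pattern))
    est = np.zeros(N - M + 1)
    for _ in range(k):
        signs = {a: rng.choice([-1.0, 1.0]) for a in alphabet}
        t = np.array([signs[a] for a in text])
        p = np.array([signs[a] for a in pattern])
        # correlation of t with p at every alignment
        est += np.array([t[i:i + M] @ p for i in range(N - M + 1)])
    return est / k
```

At a position where text and pattern symbols agree, the product of their signs is always +1; where they differ, the product is an independent +/-1 with mean 0, so the estimator is unbiased regardless of the input.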
ISBN (print): 3540356312
Randomized algorithms are widely used for efficiently finding approximate solutions to complex problems, for instance primality testing, and for obtaining good average behavior. Proving properties of such algorithms requires subtle reasoning both on algorithmic and probabilistic aspects of programs. Thus, providing tools for the mechanization of such reasoning is an important issue. This paper presents a new method for proving properties of randomized algorithms in a proof assistant based on higher-order logic. It is based on the monadic interpretation of randomized programs as probabilistic distributions (Giry, Ramsey and Pfeffer). It does not require the definition of an operational semantics for the language, nor the development of a complex formalization of measure theory. Instead it uses functional and algebraic properties of the unit interval. Using this model, we show the validity of general rules for estimating the probability that a randomized algorithm satisfies specified properties. This approach addresses only discrete distributions and gives rules for analyzing general recursive functions. We apply this theory to the formal proof of a program implementing a Bernoulli distribution from a coin flip and to the (partial) termination of several programs. All the theories and results presented in this paper have been fully formalized and proved in the Coq proof assistant. (C) 2009 Elsevier B.V. All rights reserved.
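The Bernoulli-from-coin-flips program mentioned in this abstract can be sketched as follows (a standard construction, not necessarily the paper's exact formalized program): generate a uniform U in [0,1) bit by bit from fair flips and report whether U < p, stopping at the first bit where U and p differ. Termination is only probabilistic — this is exactly the kind of recursive randomized program the paper's partial-termination rules address.

```python
def bernoulli(p, flip):
    """Sample Bernoulli(p) using only fair coin flips (flip() -> 0 or 1).

    Compares the binary expansion of p against a uniform U generated
    bit by bit; returns True iff U < p. Terminates with probability 1.
    """
    while True:
        p *= 2
        bit = 1 if p >= 1 else 0  # next bit of p's binary expansion
        if bit:
            p -= 1
        u = flip()                # next bit of U
        if u < bit:
            return True           # U < p decided at this bit
        if u > bit:
            return False          # U > p decided at this bit
        if p == 0:                # remaining bits of p are all 0: U >= p
            return False
```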
We believe that discontinuous linear information is never more powerful than continuous linear information for approximating continuous operators. We prove such a result in the worst case setting. In the randomized setting we consider compact linear operators defined between Hilbert spaces. In this case, the use of discontinuous linear information in the randomized setting cannot be much more powerful than continuous linear information in the worst case setting. These results can be applied when function evaluations are used even if function values are defined only almost everywhere. (C) 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
ISBN (print): 9781611973389
Hopcroft's problem in d dimensions asks: given n points and n hyperplanes in R^d, does any point lie on any hyperplane? Equivalently, if we are given two sets of n vectors each in R^(d+1), is there a pair of vectors (one from each set) that are orthogonal? This problem has a long history and a multitude of applications. It is widely believed that for large d, the problem is subject to the curse of dimensionality: all known algorithms need at least f(d) · n^(2-1/O(d)) time for fast-growing functions f, and at the present time there is little hope that an n^(2-ε) · poly(d) time algorithm will be found. We consider Hopcroft's problem over finite fields and integers modulo composites, leading to both surprising algorithms and hardness reductions. The algorithms arise from studying the communication problem of determining whether two lists of vectors over a discrete structure (one list held by Alice, one by Bob) contain an orthogonal pair of vectors (one from each list). We show the randomized communication complexity of the problem is closely related to the sizes of matching vector families, which have been studied in the design of locally decodable codes. Letting HOPCROFT_R denote Hopcroft's problem over a ring R, we give randomized algorithms and almost matching lower bounds (modulo a breakthrough in SAT algorithms) for HOPCROFT_R when R is the ring of integers modulo m or a finite field. Building on the ideas developed here, we give a very simple and efficient output-sensitive algorithm for matrix multiplication that works over any field.
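For concreteness, the underlying question — does some pair of vectors (one from each list) have inner product congruent to 0 modulo m — admits the obvious quadratic-time baseline that the results above aim to beat:

```python
def has_orthogonal_pair(A, B, m):
    """Naive O(|A||B|d) check: is there a in A, b in B with <a,b> = 0 (mod m)?"""
    return any(sum(x * y for x, y in zip(a, b)) % m == 0
               for a in A for b in B)
```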
ISBN (print): 9781611973389
Given two polygonal curves in the plane, there are many ways to define a notion of similarity between them. One measure that is extremely popular is the Fréchet distance. Since it was proposed by Alt and Godau in 1992, many variants and extensions have been studied. Nonetheless, even more than 20 years later, the original O(n^2 log n) algorithm by Alt and Godau for computing the Fréchet distance remains the state of the art (here n denotes the number of vertices on each curve). This has led Helmut Alt to conjecture that the associated decision problem is 3SUM-hard. In recent work, Agarwal et al. show how to break the quadratic barrier for the discrete version of the Fréchet distance, where one considers sequences of points instead of polygonal curves. Building on their work, we give a randomized algorithm to compute the Fréchet distance between two polygonal curves in time O(n^2 (log n)^(1/2) (log log n)^(3/2)) on a pointer machine and in time O(n^2 (log log n)^2) on a word RAM. Furthermore, we show that there exists an algebraic decision tree for the decision problem of depth O(n^(2-ε)), for some ε > 0. This provides evidence that the decision problem may not be 3SUM-hard after all and reveals an intriguing new aspect of this well-studied problem.
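The discrete variant mentioned in this abstract (sequences of points rather than curves) is computed by a classic O(nm) dynamic program; a sketch:

```python
import math
from functools import lru_cache

def discrete_frechet(P, Q):
    """Discrete Frechet distance between point sequences P and Q
    via the standard O(len(P) * len(Q)) dynamic program."""
    d = lambda i, j: math.dist(P[i], Q[j])

    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 and j == 0:
            return d(0, 0)
        if i == 0:
            return max(c(0, j - 1), d(0, j))
        if j == 0:
            return max(c(i - 1, 0), d(i, 0))
        # both walkers may advance, or just one of them
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d(i, j))

    return c(len(P) - 1, len(Q) - 1)
```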
ISBN (print): 9781611973389
We consider the problem of computing a k-sparse approximation to the discrete Fourier transform of an n-dimensional signal. Our main result is a randomized algorithm that computes such an approximation using O(k log n (log log n)^(O(1))) signal samples in time O(k log^2 n (log log n)^(O(1))), assuming that the entries of the signal are polynomially bounded. The sampling complexity improves over the recent bound of O(k log n log(n/k)) given in [15], and matches the lower bound of Ω(k log(n/k)/log log n) from the same paper up to poly(log log n) factors when k = O(n^(1-δ)) for a constant δ > 0.
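As a point of reference, the approximation target can be stated via the trivial baseline: read all n samples, compute the full DFT, and keep the k largest coefficients. The abstract's algorithm obtains a comparable k-sparse approximation from far fewer samples:

```python
import numpy as np

def best_k_sparse_dft(x, k):
    """Baseline k-sparse DFT approximation: full FFT, keep k largest bins."""
    X = np.fft.fft(x)
    idx = np.argsort(np.abs(X))[-k:]  # indices of the k largest magnitudes
    out = np.zeros_like(X)
    out[idx] = X[idx]
    return out
```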
ISBN (print): 9781611973389
We present randomized algorithms for sampling the standard Gaussian distribution restricted to a convex set and for estimating the Gaussian measure of a convex set, in the general membership oracle model. The complexity of the integration algorithm is O*(n^3), while the complexity of the sampling algorithm is O*(n^3) for the first sample and O*(n^2) for every subsequent sample. These bounds improve on the corresponding state of the art by a factor of n. Our improvement comes from several aspects: better isoperimetry, smoother annealing, avoiding transformation to isotropic position, and the use of the "speedy walk" in the analysis.
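For contrast, the naive membership-oracle approach is rejection sampling, whose expected cost is inversely proportional to the Gaussian measure of the set — exactly the blow-up the algorithms above avoid. A toy sketch:

```python
import numpy as np

def gaussian_in_convex(n, oracle, rng=None, max_tries=100_000):
    """Sample N(0, I_n) restricted to a set given by a membership oracle,
    by drawing unrestricted Gaussians and rejecting points outside the set.

    Expected number of draws is 1 / (Gaussian measure of the set), so this
    is only practical when the set captures a non-negligible measure.
    """
    rng = np.random.default_rng(rng)
    for _ in range(max_tries):
        x = rng.standard_normal(n)
        if oracle(x):
            return x
    raise RuntimeError("no accepted sample within max_tries")
```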