We present a probabilistic algorithm that, given a connected graph G ( represented by adjacency lists) of average degree d, with edge weights in the set {1,..., w}, and given a parameter 0 < epsilon < 1/ 2, esti...
详细信息
We present a probabilistic algorithm that, given a connected graph G ( represented by adjacency lists) of average degree d, with edge weights in the set {1,..., w}, and given a parameter 0 < epsilon < 1/ 2, estimates in time O(dw epsilon(-2) log dw/epsilon) the weight of the minimum spanning tree (MST) of G with a relative error of at most e. Note that the running time does not depend on the number of vertices in G. We also prove a nearly matching lower bound of Omega(dw epsilon(-2)) on the probe and time complexity of any approximation algorithm for MST weight. The essential component of our algorithm is a procedure for estimating in time O(d epsilon(-2) log d/epsilon) the number of connected components of an unweighted graph to within an additive error of epsilon n. ( This becomes O(epsilon(-2) log 1/epsilon) for d = O(1).) The time bound is shown to be tight up to within the log d/epsilon factor. Our connected-components algorithm picks O(1/epsilon(2)) vertices in the graph and then grows "local spanning trees" whose sizes are specified by a stochastic process. From the local information collected in this way, the algorithm is able to infer, with high confidence, an estimate of the number of connected components. We then show how estimates on the number of components in various subgraphs of G can be used to estimate the weight of its MST.
We present a methodological approach for validation of advanced driver assistance systems, based on randomized algorithms. The new methodology is more efficient than conventional validation by simulations and field te...
详细信息
We present a methodological approach for validation of advanced driver assistance systems, based on randomized algorithms. The new methodology is more efficient than conventional validation by simulations and field tests, especially with increasing system complexity. The methodology consists of first specifying the perturbation set and performance criteria. Then a minimum required number of samples and a relevant sampling space is selected. Next an iterative randomized simulation is executed, followed by validation with hardware tests. The concept is illustrated with a simple adaptive cruise control problem.
A common statistical problem is that of finding the median element in a set of data. This paper presents an efficient randomized high-level parallel algorithm for finding the median given a set of elements distributed...
详细信息
A common statistical problem is that of finding the median element in a set of data. This paper presents an efficient randomized high-level parallel algorithm for finding the median given a set of elements distributed across a parallel machine. In fact, our algorithm solves the general selection problem that requires the determination of the element of rank k, for an arbitrarily given integer k. Our general framework is an SPMD distributed-memory programming model that is enhanced by a set of communication primitives. We use efficient techniques for distributing and coalescing data as well as efficient combinations of task and data parallelism. The algorithms have been coded in the message-passing standard MPI, and our experimental results from the IBM SP-2 illustrate the scalability and efficiency of our algorithm and improve upon all the related experimental results known to the author. The main contributions of this paper are (1) New techniques for speeding the performance of certain randomized algorithms, such as selection, which are efficient with likely probability. (2) A new, practical randomized selection algorithm (UltraFast) with significantly improved convergence. (C) 2004 Elsevier Inc. All rights reserved.
We present an O(n log(4)n)-time randomized algorithm for gossiping in radio networks with unknown topology. This is the first algorithm for gossiping in this model whose running time is only a polylogarithmic factor a...
详细信息
We present an O(n log(4)n)-time randomized algorithm for gossiping in radio networks with unknown topology. This is the first algorithm for gossiping in this model whose running time is only a polylogarithmic factor away from the optimum. The fastest previously known (deterministic) algorithm for this problem works in time O(n(3/2)log(2)n). (C) 2004 Wiley Periodicals, Inc.
We consider the problem of partitioning a set of m points in the n-dimensional Euclidean space into k clusters ( usually m and n are variable, while k is fixed), so as to minimize the sum of squared distances between ...
详细信息
We consider the problem of partitioning a set of m points in the n-dimensional Euclidean space into k clusters ( usually m and n are variable, while k is fixed), so as to minimize the sum of squared distances between each point and its cluster center. This formulation is usually the objective of the k-means clustering algorithm (Kanungo et al. ( 2000)). We prove that this problem in NP-hard even for k = 2, and we consider a continuous relaxation of this discrete problem: find the k-dimensional subspace V that minimizes the sum of squared distances to V of the m points. This relaxation can be solved by computing the Singular Value Decomposition (SVD) of the m x n matrix A that represents the m points;this solution can be used to get a 2-approximation algorithm for the original problem. We then argue that in fact the relaxation provides a generalized clustering which is useful in its own right. Finally, we show that the SVD of a random submatrix-chosen according to a suitable probability distribution of a given matrix provides an approximation to the SVD of the whole matrix, thus yielding a very fast randomized algorithm. We expect this algorithm to be the main contribution of this paper, since it can be applied to problems of very large size which typically arise in modern applications.
Computation of the sign of the determinant of a matrix and the determinantitself is a challenge for both numerical and exact methods. We survey the complexity of existingmethods to solve these problems when the input ...
详细信息
Computation of the sign of the determinant of a matrix and the determinantitself is a challenge for both numerical and exact methods. We survey the complexity of existingmethods to solve these problems when the input is an n * n matrix A with integer entries. We studythe bit-complexities of the algorithms asymptotically in n and the norm of A. Existing approachesrely on numerical approximate computations, on exact computations, or on both types of arithmetic incombination.
In this work we study the problem of clustering with respect to the diameter and the radius costs: We say that a set X of points in R-d is (k, b)-clusterable with respect to the diameter cost if X can be partitioned i...
详细信息
In this work we study the problem of clustering with respect to the diameter and the radius costs: We say that a set X of points in R-d is (k, b)-clusterable with respect to the diameter cost if X can be partitioned into k subsets (clusters) so that the distance between every pair of points in each cluster is at most b. In the case of the radius cost we require that all points that belong to the same cluster be at a distance of at most b for some common central point. Here we approach the problem of clustering from within the framework of property testing. In property testing, the goal is to determine whether a given object has a particular property or whether it should be modified significantly so that it obtains the property. In the context of clustering, testing takes on the following form: The algorithm is given parameters k, b, beta, and epsilon, and it can sample from the set of points X. The goal of the algorithm is to distinguish between the case when X is (k, b)-clusterable and the case when X is c-far from being (k, (1 + beta)b)-clusterable. By epsilon-far from being (k, (1 + beta)b)-clusterable we mean that more than epsilon . \X\ points should be removed from X so that it becomes (k, (1, + beta)b)-clusterable. In this work we describe and analyze algorithms that use a sample of size polynomial in k and 1/epsilon and independent of \X\. (The dependence on beta and on the dimension, d, of the points varies with the different algorithms.) Such algorithms may be especially useful when the set of points X is very large and it may not even be feasible to observe all of it. Our algorithms can also be used to find approximately good clusterings. Namely, these are clusterings of all but an c-fraction of the points in X that have optimal (or close to optimal) cost. The benefit of our algorithms is that they construct an implicit representation of such clusterings in time independent of \X\. That is, without actually having to partition all points in X, the implici
In this paper we study the classical problem of finding disjoint paths in graphs. This problem has been studied by a number of authors both for specific graphs and general classes of graphs. Whereas for specific graph...
详细信息
In this paper we study the classical problem of finding disjoint paths in graphs. This problem has been studied by a number of authors both for specific graphs and general classes of graphs. Whereas for specific graphs many (almost) matching upper and lower bounds are known for the competitiveness of on-line algorithms, not much is known about how well on-line algorithms can perform in the general setting. The best results obtained so far use the expansion of a network to measure the algorithm's performance. We use a different parameter called the routing number that, as we will show, allows more precise results than the expansion. It enables us to prove tight upper and lower bounds for deterministic on-line algorithms. The upper bound is obtained by surprisingly simple greedy-like algorithms. Interestingly, our upper bound on the competitive ratio is even better than the best previous approximation ratio for off-line algorithms. Furthermore, we introduce a refined variant of the routing number and show that this variant allows us, for some classes of graphs, to construct on-line algorithms with a competitive ratio significantly below the best possible upper bound that could be obtained using the routing number or the expansion of a network only. We also show that our on-line algorithms can be transformed into efficient algorithms for the more general unsplittable flow problem.
Scheduling a sequence of jobs released over time when the processing time of a job is only known at its completion is a classical problem in CPU scheduling in time sharing operating systems. A widely used measure for ...
详细信息
Scheduling a sequence of jobs released over time when the processing time of a job is only known at its completion is a classical problem in CPU scheduling in time sharing operating systems. A widely used measure for the responsiveness of the system is the average flow time of the jobs, that is, the average time spent by jobs in the system between release and completion. The Windows NT and the Unix operating system scheduling policies are based on the Multilevel Feedback algorithm. In this article, we prove that a randomized version of the Multilevel Feedback algorithm is competitive for single and parallel machine systems, in our opinion providing one theoretical validation of the goodness of an idea that has proven effective in practice along the last two decades. The randomized Multilevel Feedback algorithm (RMLF) was first proposed by Kalyanasundaram and Pruhs for a single machine achieving an O(log n log log n) competitive ratio to minimize the average flow time against the on-line adaptive adversary, where n is the number of jobs that are released. We present a version of RMLF working for any number m of parallel machines. We show for RMLF a first O(log n log L) competitiveness result against the oblivious adversary on parallel machines. We also show that the same RMLF algorithm surprisingly achieves a tight O(log n) competitive ratio against the oblivious adversary on a single machine, therefore matching the lower bound for this case.
In this paper we consider the problem of testing bipartiteness of general graphs. The problem has previously been studied in two models, one most suitable for dense graphs and one most suitable for bounded-degree grap...
详细信息
In this paper we consider the problem of testing bipartiteness of general graphs. The problem has previously been studied in two models, one most suitable for dense graphs and one most suitable for bounded-degree graphs. Roughly speaking, dense graphs can be tested for bipartiteness with constant complexity, while the complexity of testing bounded-degree graphs is (Theta) over tilde (rootn), where n is the number of vertices in the graph (and (Theta) over tilde (f(n)) means Theta (f(n) . polylog(f(n)))). Thus there is a large gap between the complexity of testing in the two cases. In this work we bridge the gap described above. In particular, we study the problem of testing bipartiteness in a model that is suitable for all densities. We present an algorithm whose complexity is (O) over tilde (min(rootn, n(2)/m)), where m is the number of edges in the graph, and we match it with an almost tight lower bound.
暂无评论