A fundamental algorithm for selecting ranks from a finite subset of an ordered set is Radix Selection. This algorithm requires the data to be given as strings of symbols over an ordered alphabet, e.g., binary expansions of real numbers. Its complexity is measured by the number of symbols that have to be read. In this paper the model of independent data identically generated from a Markov chain is considered. The complexity is studied as a stochastic process indexed by the set of infinite strings over the given alphabet. The orders of mean and variance of the complexity and, after normalization, a limit theorem with a centered Gaussian process as limit are derived. This implies an analysis for two standard models for the ranks: uniformly chosen ranks, also called grand averages, and the worst-case rank complexities, which are of interest in computer science. For uniform data and the asymmetric Bernoulli model (i.e., memoryless sources), we also find weak convergence for the normalized process of complexities when indexed by the ranks, while for more general Markov sources these processes are not tight under the standard normalizations. (C) 2018 Elsevier B.V. All rights reserved.
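As a concrete illustration of the symbol-reading cost model, here is a minimal sketch of radix selection on fixed-length binary strings (illustrative code, not the paper's analysis; distinct, equal-length strings are assumed): the strings are bucketed on one symbol position at a time, the algorithm recurses into the bucket containing the target rank, and the complexity is the total number of symbols read.

```python
def radix_select(items, rank, depth=0):
    """Select the item of a given 0-based rank from distinct, equal-length
    symbol strings. Returns (selected_item, number_of_symbols_read)."""
    if len(items) == 1:
        return items[0], 0
    # Bucket the items by the symbol at the current position.
    buckets = {}
    for s in items:
        buckets.setdefault(s[depth], []).append(s)
    cost = len(items)  # one symbol is read per item at this depth
    # Walk the buckets in alphabet order until the target rank is reached.
    for sym in sorted(buckets):
        bucket = buckets[sym]
        if rank < len(bucket):
            result, sub_cost = radix_select(bucket, rank, depth + 1)
            return result, cost + sub_cost
        rank -= len(bucket)
```

For example, selecting the median (rank 2) of five 3-bit strings reads 11 symbols in total, since each level of recursion touches one symbol per surviving item.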
The algorithm LDM (largest differencing method) divides a list of n random items into two blocks. The parameter of interest is the expected difference between the two block sums. It is shown that if the items are i.i.d. and uniform then the rate of convergence of this parameter to zero is $n^{-\Theta(\log n)}$. An algorithm for balanced partitioning is constructed, with the same rate of convergence to zero.
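The differencing step itself can be sketched in a few lines (the classic Karmarkar-Karp heuristic, assumed here since the abstract does not spell it out): repeatedly commit the two largest remaining values to opposite blocks by replacing them with their difference; the last surviving value is the achieved difference between the two block sums.

```python
import heapq

def ldm_difference(items):
    """Largest Differencing Method: repeatedly replace the two largest
    remaining values by their difference. The last value left equals the
    resulting difference between the two block sums."""
    heap = [-x for x in items]  # max-heap via negation
    heapq.heapify(heap)
    while len(heap) > 1:
        a = -heapq.heappop(heap)
        b = -heapq.heappop(heap)
        heapq.heappush(heap, -(a - b))
    return -heap[0]
```

On [8, 7, 6, 5, 4] this yields a difference of 2 (the optimum here is 0, illustrating that LDM is a heuristic, not exact).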
Recently Avis and Jordan have demonstrated the efficiency of a simple technique called budgeting for the parallelization of a number of tree search algorithms. The idea is to limit the amount of work that a processor performs before it terminates its search and returns any unexplored nodes to a master process. This limit is set by a critical budget parameter which determines the overhead of the process. In this paper we study the behaviour of the budget parameter on conditional Galton-Watson trees, obtaining asymptotically tight bounds on this overhead. We present empirical results to show that this bound is surprisingly accurate in practice.
We modify the k-d tree on $[0,1]^d$ by always cutting the longest edge instead of rotating through the coordinates. This modification makes the expected time behavior of lower-dimensional partial match queries behave as perfectly balanced complete k-d trees on n nodes. This is in contrast to a result of Flajolet and Puech [J. Assoc. Comput. Mach., 33 (1986), pp. 371-407], who proved that for (standard) random k-d trees with cuts that rotate among the coordinate axes, the expected time behavior is much worse than for balanced complete k-d trees. We also provide results for range searching and nearest neighbor search for our trees.
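A minimal sketch of the longest-edge insertion rule, under assumed conventions (points inserted sequentially; each node stores the axis it cuts, chosen as the longest edge of the cell it lands in, rather than cycling through coordinates):

```python
def build_squarish_kdtree(points, cell):
    """Build a k-d tree by sequential insertion; `cell` is a list of
    (lo, hi) bounds per dimension for the root cell."""
    root = None
    for p in points:
        root = _insert(root, p, [list(c) for c in cell])
    return root

def _insert(node, p, cell):
    if node is None:
        # New node cuts along the longest edge of its current cell.
        axis = max(range(len(cell)), key=lambda i: cell[i][1] - cell[i][0])
        return {"point": p, "axis": axis, "left": None, "right": None}
    a = node["axis"]
    if p[a] < node["point"][a]:
        cell[a][1] = node["point"][a]  # shrink cell to the left child's region
        node["left"] = _insert(node["left"], p, cell)
    else:
        cell[a][0] = node["point"][a]  # shrink cell to the right child's region
        node["right"] = _insert(node["right"], p, cell)
    return node
```

Because each split halves the longest side, cells stay roughly square, which is the geometric property behind the improved partial match behavior.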
Tail distribution bounds play a major role in the estimation of failure probabilities in performance and reliability analysis of systems. They are usually estimated using Markov's and Chebyshev's inequalities, which represent tail distribution bounds for a random variable in terms of its mean or variance. This paper presents the formal verification of Markov's and Chebyshev's inequalities for discrete random variables using a higher-order-logic theorem prover. The paper also provides the formal verification of mean and variance relations for some of the widely used discrete random variables, such as Uniform(m), Bernoulli(p), Geometric(p) and Binomial(m, p) random variables. This infrastructure allows us to precisely reason about the tail distribution properties and thus turns out to be quite useful for the analysis of systems used in safety-critical domains, such as space, medicine or transportation. For illustration purposes, we present the performance analysis of the coupon collector's problem, a well-known commercially used algorithm. Copyright (C) 2008 John Wiley & Sons, Ltd.
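The two inequalities themselves are easy to check numerically, e.g., for a Binomial(m, p) variable (a plain Python sketch, unrelated to the higher-order-logic formalization described above):

```python
from math import comb

def binom_pmf(m, p, k):
    """P(X = k) for X ~ Binomial(m, p)."""
    return comb(m, k) * p**k * (1 - p)**(m - k)

def markov_holds(m, p, a):
    """Markov: P(X >= a) <= E[X] / a for a > 0."""
    tail = sum(binom_pmf(m, p, k) for k in range(a, m + 1))
    return tail <= (m * p) / a

def chebyshev_holds(m, p, t):
    """Chebyshev: P(|X - mu| >= t) <= Var(X) / t^2 for t > 0."""
    mu, var = m * p, m * p * (1 - p)
    mass = sum(binom_pmf(m, p, k) for k in range(m + 1) if abs(k - mu) >= t)
    return mass <= var / t**2
```

For Binomial(20, 0.3), with mean 6 and variance 4.2, both bounds hold with room to spare, e.g., P(X >= 10) ≤ 0.6 by Markov.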
In this paper, an entirely novel discrete probabilistic model is presented to generate 0-1 Knapsack Problem instances. We analyze the expected behavior of the greedy algorithm, the eligible-first algorithm and the linear relaxation algorithm for these instances; all are used to bound the solution of the 0-1 Knapsack Problem (0-1 KP) and/or its approximation. The probabilistic setting is given and the main random variables are identified. The expected performance for each of the aforementioned algorithms is analytically established in closed forms in an unprecedented way.
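For reference, the greedy and linear-relaxation bounds mentioned above can be sketched in a few lines (standard textbook versions; the paper's probabilistic instance model is not reproduced here):

```python
def greedy_knapsack(profits, weights, capacity):
    """Greedy heuristic: take items in decreasing profit/weight ratio while
    they fit. Returns total profit, a lower bound on the 0-1 KP optimum."""
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    total_p = total_w = 0
    for i in order:
        if total_w + weights[i] <= capacity:
            total_w += weights[i]
            total_p += profits[i]
    return total_p

def lp_relaxation_bound(profits, weights, capacity):
    """Dantzig's linear-relaxation upper bound: fill greedily, then take a
    fraction of the first item that does not fit."""
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    total_p, cap = 0.0, capacity
    for i in order:
        if weights[i] <= cap:
            cap -= weights[i]
            total_p += profits[i]
        else:
            total_p += profits[i] * cap / weights[i]  # fractional fill
            break
    return total_p
```

The true 0-1 optimum is always sandwiched between the two returned values.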
A very simple example of an algorithmic problem solvable by dynamic programming is to maximize, over $A \subseteq \{1, 2, \ldots, n\}$, the objective function $|A| - \sum_i \xi_i \mathbf{1}\{i \in A,\ i+1 \in A\}$ for given $\xi_i > 0$. This problem, with random $(\xi_i)$, provides a test example for studying the relationship between optimal and near-optimal solutions of combinatorial optimization problems. We show that, amongst solutions differing from the optimal solution in a small proportion $\delta$ of places, we can find near-optimal solutions whose objective function value differs from the optimum by a factor of order $\delta^2$ but not of smaller order. We conjecture this relationship holds widely in the context of dynamic programming over random data, and Monte Carlo simulations for the Kauffman-Levin NK model are consistent with the conjecture. This work is a technical contribution to a broad program initiated in [D. J. Aldous and A. G. Percus, Proc. Natl. Acad. Sci. USA, 100 (2003), pp. 11211-11215] of relating such scaling exponents to the algorithmic difficulty of optimization problems.
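The dynamic program is a two-state recursion over the indices (tracking whether the current index belongs to A); a minimal sketch, under the convention that `xi` is a 0-indexed list and choosing two consecutive indices i and i+1 costs `xi[i]`:

```python
def max_objective(xi):
    """Maximize |A| minus the sum of xi-penalties over adjacent chosen
    pairs, by DP on whether the current index is in A."""
    out, inn = 0.0, 1.0  # best prefix value: last index not in A / in A
    for i in range(1, len(xi)):
        # Adding index i always gains 1; if i-1 was also chosen, pay xi[i-1].
        out, inn = max(out, inn), 1.0 + max(out, inn - xi[i - 1])
    return max(out, inn)
```

With penalties [2.0, 0.1, 2.0] the optimum is 2 (take indices 1 and 3, avoiding both expensive adjacencies), found in a single linear pass.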
The purpose of this paper is to analyze the maxima properties (value and position) of some data structures. Our theorems concern the distribution of these random variables. Previously known results usually dealt with the mean and sometimes the variance of the random variables. Many of our results rely on diffusion techniques. This is a very powerful tool that has already been used with some success in algorithm complexity analysis.
A polynomial time graph coloring algorithm is presented with the following property: there is a constant $c > 0$ such that if $k = k(n)$ is a function such that $k \le \sqrt{cn/\log n}$, then the algorithm colors optimally almost all graphs with n vertices and chromatic number at most $k$.
We consider cuckoo hashing as proposed by Pagh and Rodler in 2001. We show that the expected construction time of the hash table is O(n) as long as the two open addressing tables are each of size at least $(1 + \epsilon)n$, where $\epsilon > 0$ and n is the number of data points. Slightly improved bounds are obtained for various probabilities and constraints. The analysis rests on simple properties of branching processes. (C) 2003 Elsevier Science B.V. All rights reserved.
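The insertion procedure behind this analysis can be sketched as follows (illustrative toy hash functions; a production version would rehash both tables with fresh hash functions whenever an insertion fails):

```python
def cuckoo_insert(tables, hashes, key, max_loops=32):
    """Place `key` in table 0 or 1; each collision evicts the occupant into
    its slot in the other table. Returns False when a rehash is needed."""
    t = 0
    for _ in range(max_loops):
        slot = hashes[t](key)
        # Swap the incoming key with the current occupant (possibly None).
        key, tables[t][slot] = tables[t][slot], key
        if key is None:
            return True
        t = 1 - t  # the evicted key must go to the other table
    return False

def cuckoo_lookup(tables, hashes, key):
    """A key can only live in one of its two candidate slots."""
    return tables[0][hashes[0](key)] == key or tables[1][hashes[1](key)] == key
```

Lookups probe at most two slots, which is the constant worst-case query time that motivates cuckoo hashing; the branching-process analysis concerns the eviction chains during construction.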