We consider the problem of efficiently designing sets (codes) of equal-length DNA strings (words) that satisfy certain combinatorial constraints. This problem has numerous motivations including DNA self-assembly and DNA computing. Previous work has extended results from coding theory to obtain bounds on code size for new biologically motivated constraints and has applied heuristic local search and genetic algorithm techniques for code design. This article proposes a natural optimization formulation of the DNA code design problem in which the goal is to design n strings that satisfy a given set of constraints while minimizing the length of the strings. For multiple sets of constraints, we provide simple randomized algorithms that run in time polynomial in n and any given constraint parameters, and output strings of length within a constant factor of the optimal with high probability. To the best of our knowledge, this work is the first to consider this type of optimization problem in the context of DNA code design.
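As a rough illustration of the optimization view described above (fix n, then look for short words), the following sketch draws n random DNA words and accepts them if a single constraint holds. It assumes the constraint is a minimum pairwise Hamming distance; the article's actual constraint sets and algorithms are not given in this abstract, and the names `random_code`, `min_dist`, and `max_tries` are hypothetical.

```python
import itertools
import random

def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def random_code(n, length, min_dist, rng, max_tries=1000):
    """Draw n uniformly random DNA words; retry until all pairs are far apart.

    Assumption: the only constraint checked is pairwise Hamming distance.
    Returns None if no valid code is found within max_tries attempts.
    """
    for _ in range(max_tries):
        words = ["".join(rng.choice("ACGT") for _ in range(length))
                 for _ in range(n)]
        if all(hamming(a, b) >= min_dist
               for a, b in itertools.combinations(words, 2)):
            return words
    return None

code = random_code(n=8, length=24, min_dist=8, rng=random.Random(0))
```

Shrinking `length` until the retry loop starts failing gives a crude feel for how short the words can be for a given n and distance.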
The Kaczmarz and Gauss-Seidel methods both solve a linear system X beta = y by iteratively refining the solution estimate. Recent interest in these methods has been sparked by a proof of Strohmer and Vershynin which shows that the randomized Kaczmarz method converges linearly in expectation to the solution. Lewis and Leventhal then proved a similar result for the randomized Gauss-Seidel algorithm. However, the behavior of both methods depends heavily on whether the system is underdetermined or overdetermined, and whether it is consistent or not. Here we provide a unified theory of both methods and their variants for these different settings, and draw connections between the two approaches. In doing so, we also provide a proof that an extended version of randomized Gauss-Seidel converges linearly to the least norm solution in the underdetermined case (where the usual randomized Gauss-Seidel fails to converge). We detail analytically and empirically the convergence properties of both methods and their extended variants in all possible system settings. With this result, a complete and rigorous theory of both methods is furnished.
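A minimal sketch of the basic randomized Kaczmarz iteration for a consistent system X beta = y may help fix ideas. It assumes rows are sampled with probability proportional to their squared norm, as in the Strohmer and Vershynin analysis; the extended variants and the Gauss-Seidel counterpart discussed in the abstract are not shown, and the small test system is made up for illustration.

```python
import random

def randomized_kaczmarz(X, y, iters, rng):
    """Project the iterate onto one randomly chosen row equation per step."""
    m, n = len(X), len(X[0])
    sq_norms = [sum(v * v for v in row) for row in X]
    beta = [0.0] * n
    for _ in range(iters):
        # Sample row i with probability proportional to its squared norm.
        i = rng.choices(range(m), weights=sq_norms)[0]
        residual = y[i] - sum(a * b for a, b in zip(X[i], beta))
        step = residual / sq_norms[i]
        beta = [b + step * a for a, b in zip(X[i], beta)]
    return beta

# Hypothetical consistent, overdetermined 3x2 system with known solution.
X = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
truth = [1.0, -2.0]
y = [sum(a * b for a, b in zip(row, truth)) for row in X]
beta = randomized_kaczmarz(X, y, 2000, random.Random(1))
```

Each step touches a single row of X, which is why the method is attractive when the full matrix is too large to factor.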
The generalized singular value decomposition (GSVD) is one of the essential tools in numerical linear algebra. This paper proposes a regularization method that combines Tikhonov regularization in general form with the truncated GSVD. Randomized algorithms are then adopted to implement the truncation process. For large-scale ill-posed problems, this randomized GSVD regularization can achieve good accuracy with less computational time and memory than the classical regularization methods. Finally, we present error analyses for the randomized algorithms. Some illustrative numerical examples are provided.
We present a randomized parallel algorithm that computes the greatest common divisor of two n-bit integers with probability 1-o(1) in O(n log log n / log n) time using O(n^(6+epsilon)) processors, for any epsilon > 0, on the EREW PRAM parallel model of computation. The algorithm either gives a correct answer or reports failure. We believe this to be the first randomized sublinear time algorithm on the EREW PRAM for this problem. (C) 2009 Elsevier B.V. All rights reserved.
We revisit the classic problem of spreading a piece of information in a group of fully connected processors. By suitably adding a small dose of randomness to the protocol of Gasieniec and Pelc (Parallel Comput 22:903-912, 1996), we derive for the first time protocols that (i) use a linear number of messages, (ii) are correct even when an arbitrary number of adversarially chosen processors do not participate in the process, and (iii) with high probability have asymptotically optimal runtime when at least an arbitrarily small constant fraction of the processors are working. In addition, our protocols do not require that the system be synchronized or that all processors be woken up simultaneously at time zero; they are fully based on push operations, and they do not need an a priori estimate of the number of failed nodes. Our protocols thus overcome the typical disadvantages of the two known approaches: algorithms based on random gossip (typically needing a large number of messages due to their unorganized nature) and algorithms based on fair workload splitting (which are either not time-efficient or require intricate preprocessing steps plus synchronization).
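For contrast with the protocols described above, here is a small simulation of the plain random-gossip baseline the abstract mentions, not of the authors' protocol. It assumes synchronous rounds, no failures, and that every informed processor pushes to one uniformly random other processor per round; `push_gossip` and its parameters are hypothetical names.

```python
import random

def push_gossip(n, rng):
    """Simulate synchronous random push gossip on a complete graph.

    Returns (rounds, messages) needed until all n processors are informed.
    """
    informed = [False] * n
    informed[0] = True
    count, rounds, messages = 1, 0, 0
    while count < n:
        rounds += 1
        # Only processors informed before this round send during it.
        senders = [v for v in range(n) if informed[v]]
        for v in senders:
            target = rng.randrange(n - 1)
            if target >= v:
                target += 1  # skip self so the target is another processor
            messages += 1
            if not informed[target]:
                informed[target] = True
                count += 1
    return rounds, messages

rounds, messages = push_gossip(256, random.Random(0))
```

Because informed processors keep sending to targets that already know the rumor, the message count of this baseline grows faster than linearly in n, which is the disadvantage the abstract attributes to unorganized gossip.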
This paper studies the inherent trade-off between termination probability and total step complexity of randomized consensus algorithms. It shows that for every integer k, the probability that an f-resilient randomized consensus algorithm of n processes does not terminate with agreement within k(n-f) steps is at least 1/c^k, for some constant c. A corresponding result is proved for Monte-Carlo algorithms that may terminate in disagreement. The lower bound holds for asynchronous systems, where processes communicate either by message passing or through shared memory, under a very weak adversary that determines the schedule in advance, without observing the algorithm's actions. This complements algorithms of Kapron et al. [Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), ACM, New York, SIAM, Philadelphia, 2008, pp. 1038-1047] for message-passing systems, and of Aumann [Proceedings of the 16th Annual ACM Symposium on Principles of Distributed Computing (PODC), ACM, New York, 1997, pp. 209-218] and Aumann and Bender [Distrib. Comput., 17 (2005), pp. 191-207] for shared-memory systems.
When randomized ensembles such as bagging or random forests are used for binary classification, the prediction error of the ensemble tends to decrease and stabilize as the number of classifiers increases. However, the precise relationship between prediction error and ensemble size is unknown in practice. In the standard case when classifiers are aggregated by majority vote, the present work offers a way to quantify this convergence in terms of "algorithmic variance," i.e. the variance of prediction error due only to the randomized training algorithm. Specifically, we study a theoretical upper bound on this variance, and show that it is sharp - in the sense that it is attained by a specific family of randomized classifiers. Next, we address the problem of estimating the unknown value of the bound, which leads to a unique twist on the classical problem of non-parametric density estimation. In particular, we develop an estimator for the bound and show that its MSE matches optimal non-parametric rates under certain conditions. (Concurrent with this work, some closely related results have also been considered in Cannings and Samworth (2017) and Lopes (2019).) (C) 2019 Elsevier B.V. All rights reserved.
This paper investigates the randomized version of the Kaczmarz method for solving linear systems in the case where the adjoint of the system matrix is not exact, a situation we refer to as mismatched adjoint. We show that the method may still converge in both the over- and underdetermined consistent cases under appropriate conditions, and we calculate the expected asymptotic rate of linear convergence. Moreover, we analyze the inconsistent case and obtain results for the method with mismatched adjoint as for the standard method. Finally, we derive a method to compute optimized probabilities for the choice of the rows and illustrate our findings with numerical examples.
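One natural way to write a Kaczmarz step with a mismatched adjoint is sketched below. It is an assumption made for illustration, not a formula taken from this abstract: the residual is measured with row a_i of the true matrix A, while the update direction uses row v_i of a slightly different matrix V. Rows are sampled uniformly here, and the matrices, the perturbation size, and the name `kaczmarz_mismatched` are all hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def kaczmarz_mismatched(A, V, b, iters, rng):
    """Kaczmarz-type iteration that updates along rows of V instead of A.

    Assumed update: x <- x + (b_i - <a_i, x>) / <a_i, v_i> * v_i,
    which still satisfies the i-th equation <a_i, x> = b_i after the step.
    """
    x = [0.0] * len(A[0])
    for _ in range(iters):
        i = rng.randrange(len(A))  # uniform row sampling (an assumption)
        step = (b[i] - dot(A[i], x)) / dot(A[i], V[i])
        x = [xj + step * vj for xj, vj in zip(x, V[i])]
    return x

# Hypothetical consistent system; V is a small perturbation of A.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
V = [[2.02, 0.98], [0.99, 3.02], [1.01, -0.98]]
truth = [1.0, -2.0]
b = [dot(row, truth) for row in A]
x = kaczmarz_mismatched(A, V, b, 3000, random.Random(2))
```

With V equal to A this reduces to the standard method; as the mismatch grows, convergence is no longer guaranteed, which is the regime the abstract's conditions are meant to characterize.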
We study on-line scheduling in overloaded systems. Requests for jobs arrive one by one as time proceeds; the serving agents have limited capacity and not all requests can be served. Still, we want to serve the "best" set of requests according to some criterion. In this situation, the ability to preempt (i.e., abort) jobs in service in order to make room for better jobs that would otherwise be rejected has proven to be of great help in some scenarios. We show that, surprisingly, in many other scenarios this is not the case. In a simple, generic model, we prove a polylogarithmic lower bound on the competitiveness of randomized and preemptive on-line scheduling algorithms. Our bound applies to several recently studied problems. In fact, in certain scenarios our bound is quite close to the competitiveness achieved by known deterministic, nonpreemptive algorithms.
All known fast randomized Byzantine Agreement (BA) protocols have (rare) infinite runs. We present a method of combining a randomized BA protocol of a certain class with any deterministic BA protocol to obtain a randomized protocol that preserves the expected average complexity of the randomized protocol while guaranteeing termination in all runs. In particular, we obtain a randomized BA protocol with constant expected time, which always terminates within t + O(log t) rounds, where t = O(n) is the number of faulty processors.