The "direct-product code" of a function f gives its values on all k-tuples (f(x(1)), . . . , f(x(k))). This basic construct underlies "hardness amplification" in cryptography, circuit complexity, a...
详细信息
The "direct-product code" of a function f gives its values on all k-tuples (f(x(1)), . . . , f(x(k))). This basic construct underlies "hardness amplification" in cryptography, circuit complexity, and probabilistically checkable proofs (PCPs). Goldreich and Safra [SIAM J. Comput., 29 (2000), pp. 1132-1154] pioneered its local testing and its PCP application. A recent result by Dinur and Goldenberg [Proceedings of the Forty-Ninth annual IEEE symposium on Foundations of Computer Science, 2008, pp. 613-622] enabled for the first time testing proximity to this important code in the "list-decoding" regime. In particular, they give a 2-query test which works for polynomially small success probability 1/k(alpha) and show that no such test works below success probability 1/k. Our main result is a 3-query test which works for exponentially small success probability exp(-k(alpha)). Our techniques (based on recent simplified decoding algorithms for the same code [R. Impagliazzo et al., Proceedings of the Fortieth annualacmsymposium on Theory of Computing, 2008, pp. 579-588]) also allow us to considerably simplify the analysis of the 2-query test of [Proceedings of the Forty-Ninth annual IEEE symposium on Foundations of Computer Science, 2008, pp. 613-622]. We then show how to derandomize their test, achieving a code of polynomial rate, independent of k, and success probability 1/k(alpha). Finally, we show the applicability of the new tests to PCPs. Starting with a 2-query PCP with a projection property over an alphabet Sigma and with soundness error 1 - delta, Rao [Proceedings of the Fortieth annualacmsymposium on Theory of Computing, 2008, pp. 1-10] (building on Raz's (k-fold) parallel repetition theorem [R. Raz, SIAM J. Comput., 27 (1998), pp. 763-803] and Holenstein's proof [T. Holenstein, Proceedings of the Thirty-Ninth annualacmsymposium on Theory of Computing, 2007, pp. 411-419] obtains a new 2-query PCP over the alphabet Sigma(k) with soundness error exp(-delta(2)k
The availability and utility of large numbers of Graphical Processing Units (GPUs) have enabled parallel computations using extensive multi-threading. Sequential access to global memory and contention at the size-limi...
详细信息
ISBN:
(纸本)9780769546766
The availability and utility of large numbers of Graphical Processing Units (GPUs) have enabled parallel computations using extensive multi-threading. Sequential access to global memory and contention at the size-limited shared memory have been main impediments to fully exploiting potential performance in architectures having a massive number of GPUs. We propose novel memory storage and retrieval techniques that enable parallel graph computations to overcome the above issues. More specifically, given a graph G = (V, E) and an integer k <= vertical bar V vertical bar, we provide both storage techniques and algorithms to count the number of: a) connected subgraphs of size k;b) k cliques;and c) k independent sets, all of which can be exponential in number. Our storage technique is based on creating a breadth-first search tree and storing it along with non-tree edges in a novel way. The counting problems mentioned above have many uses, including the analysis of social networks.
Creating components based on concurrent and parallelalgorithms and data structures often requires more attention to "engineering" issues not seen with most other libraries. Components created in the "o...
详细信息
ISBN:
(纸本)9781450312134
Creating components based on concurrent and parallelalgorithms and data structures often requires more attention to "engineering" issues not seen with most other libraries. Components created in the "obvious" way sometimes turn out to be wrong, to perform poorly, or are unusable in most applications, because the abstractions in which they are expressed are leaky, imprecise, or incorrectly specified. While many of these issues are encountered in other aspects of concurrent programming, the impact is accentuated when they continue to leak through to APIs provided by library components. This presentation surveys some examples and lessons learned from the design, implementation, and applications of the *** library, including those surrounding memory models, resource management and garbage collection, thread management, optimization, and code generation.
The proceedings contain 89 papers. The topics discussed include: faster approximate multicommodity flow using quadratically coupled flows;when the cut condition is enough: a complete characterization for multiflow pro...
ISBN:
(纸本)9781450312455
The proceedings contain 89 papers. The topics discussed include: faster approximate multicommodity flow using quadratically coupled flows;when the cut condition is enough: a complete characterization for multiflow problems in series-parallel networks;strongly polynomial algorithm for a class of minimum-cost flow problems with separable convex objectives;quantum money from hidden subspaces;certifiable quantum dice: or, true random number generation secure against quantum adversaries;span programs for functions with constant-sized 1-certificates;linear vs. semidefinite extended formulations: exponential separation and strong lower bounds;polyhedral clinching auctions and the adwords polytope;matroid prophet inequalities;online matching with concave returns;computing a nonnegative matrix factorization - provably;and on identity testing of tensors, low-rank recovery and compressed sensing.
Quality Threshold Clustering (QTC) is an algorithm for partitioning data, in fields such as biology, where clustering of large data-sets can aid scientific discovery. Unlike other clustering algorithms, QTC does not r...
详细信息
ISBN:
(纸本)9780769546759
Quality Threshold Clustering (QTC) is an algorithm for partitioning data, in fields such as biology, where clustering of large data-sets can aid scientific discovery. Unlike other clustering algorithms, QTC does not require knowing the number of clusters a priori, however, its perceived need for high computing power often makes it an unattractive choice. This paper presents a thorough study of QTC. We analyze the worst case complexity of the algorithm and discuss methods to reduce it by trading memory for computation. We also demonstrate how the expected running time of QTC is affected by the structure of the input data. We describe how QTC can be parallelized, and discuss implementation details of our thread-parallel, GPU, and distributed memory implementations of the algorithm. We demonstrate the efficiency of our implementations through experimental data. We show how data sets with tens of thousands of elements can be clustered in a matter of minutes in a modern GPU, and seconds in a small scale cluster of multi-core CPUs, or multiple GPUs. Finally, we discuss how user selected parameters, as well as algorithmic and implementation choices, affect performance.
Next Generation Sequencing (NGS) is gaining interests due to the increased requirements and the decreased sequencing cost. The important and prerequisite step of most NGS applications is the mapping of short sequences...
详细信息
ISBN:
(纸本)9780769546766
Next Generation Sequencing (NGS) is gaining interests due to the increased requirements and the decreased sequencing cost. The important and prerequisite step of most NGS applications is the mapping of short sequences, called reads, to the template reference sequences. Both the explosion of NGS data with over billions of reads generated each day and the data-intensive computations pose great challenges to the capability of existing computing systems. In this paper, we take a hash-index based algorithm (PerM) as an example to investigate the optimization approaches for accelerating NGS reads mapping on multi-core architectures. First, we propose a new parallel algorithm that reorders bucket access in hash index among multiple threads so that data locality in shared cache is improved. Second, in order to reduce the number of empty hash bucket, we propose a serialized hash index compression algorithm, which coincides with the sequential access nature of our new parallel algorithm. With reduced hash index size, it also becomes possible for us to use longer hash keys, which alleviates the hash conflicts and improves the query performance. Our experiment on an 8-socket 8-cores Intel Xeon X7550 SMP with 128 GB memory shows that the new parallel algorithm reduces LLC miss ratio to be 8% similar to 15% of the original algorithm and the overall performance is improved by 4 similar to 11 times (6 times avg.).
We present a novel algorithm for deciding whether a given planar curve is an image of a given spatial curve, obtained by a central or a parallel projection with unknown parameters. A straightforward approach to this p...
详细信息
暂无评论