Task-parallel programming languages offer a variety of high-level mechanisms for synchronization that trade off between flexibility and deadlock safety. Some approaches are deadlock-free by construction but support limited synchronization patterns, while other approaches are trivial to deadlock. In high-level task-parallel programming, it is imperative that language features offer both flexibility to avoid over-synchronization and also sufficient protection against logical deadlocks. Lack of flexibility leads to code that does not take full advantage of the available parallelism in the computation. Lack of deadlock protection leads to error-prone code in which a single bug can involve arbitrarily many tasks, making it difficult to reason about. We make advances in both flexibility and deadlock protection for existing synchronization mechanisms by carefully designing dynamically verifiable usage policies and language constructs. We first define a deadlock-freedom policy for futures. The rules of the policy follow naturally from the semantics of asynchronous task closures and correspond to a preorder traversal of the task tree. The policy admits an additional class of deadlock-free programs compared to past work. Each blocking wait for a future can be verified by a stateless, lock-free algorithm, resulting in low time and memory overheads at runtime. In order to define and identify deadlocks for promises, we introduce a mechanism for promises to be owned by tasks. Simple annotations make it possible to ensure that each promise is eventually fulfilled by the responsible task or handed off to another task. Ownership semantics allows us to formally define two kinds of promise bugs: omitted sets and deadlock cycles. We present novel detection algorithms for both bugs. We further introduce an approximate deadlock-freedom policy for promises that, instead of precisely detecting cycles, raises an alarm when synchronization dependences occurring between trees of tasks are a
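The promise-ownership idea described in this abstract can be illustrated with a minimal, single-threaded sketch. It is not the authors' runtime or detection algorithm: the names `Task`, `Promise`, `hand_off`, `finish`, and `OmittedSetError` are hypothetical, and the sketch only shows the bookkeeping by which a task that ends while still owning an unfulfilled promise could be flagged as an omitted set.

```python
class OmittedSetError(RuntimeError):
    """Raised when a task ends while still owning unfulfilled promises."""


class Task:
    """Hypothetical task record; tracks the unfulfilled promises it owns."""

    def __init__(self, name):
        self.name = name
        self.owned = set()

    def hand_off(self, promise, other):
        # Transfer responsibility for an unfulfilled promise to another task.
        if promise.owner is not self:
            raise PermissionError(f"{self.name} does not own this promise")
        self.owned.discard(promise)
        other.owned.add(promise)
        promise.owner = other

    def finish(self):
        # Ending with owned, unfulfilled promises is an omitted set.
        if self.owned:
            raise OmittedSetError(
                f"{self.name} ended owning {len(self.owned)} unfulfilled promise(s)"
            )


class Promise:
    """Hypothetical promise that is always owned by exactly one task until set."""

    def __init__(self, owner):
        self.owner = owner
        self.fulfilled = False
        self.value = None
        owner.owned.add(self)

    def set(self, task, value):
        # Only the current owner may fulfill the promise.
        if task is not self.owner:
            raise PermissionError(f"{task.name} does not own this promise")
        self.value = value
        self.fulfilled = True
        task.owned.discard(self)
        self.owner = None


if __name__ == "__main__":
    parent, child = Task("parent"), Task("child")
    p = Promise(parent)
    parent.hand_off(p, child)   # parent delegates responsibility
    parent.finish()             # fine: parent no longer owns anything
    p.set(child, 42)            # child fulfills the promise
    child.finish()              # fine: nothing left unfulfilled
```

Because ownership is tracked per task, the check in `finish` needs no global view of the program. The abstract's deadlock-cycle detection is a separate problem that this sketch does not attempt.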
ParlayLib is a C++ library for developing efficient parallel algorithms and software on shared-memory multicore machines. It provides additional tools and primitives that go beyond what is available in the C++ standar...
DVM-system is designed for the development of parallel programs of scientific and technical calculations in C-DVMH and Fortran-DVMH languages. These languages use a single parallel programming model (DVMH model) and a...
Today's processors become fatter, not faster. However, the exploitation of these massively parallel compute resources remains a challenge for many traditional HPC applications regarding scalability, portability an...
Sequence alignment is a problem in bioinformatics that involves arranging sequences of proteins, RNA or DNA so that similar regions between two or more sequences may be determined. The Smith-Waterman algorithm is a ke...
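For readers unfamiliar with the Smith-Waterman algorithm named in this entry, the following is a minimal sketch of its textbook scoring recurrence for local alignment. The scoring parameters (`match=2`, `mismatch=-1`, `gap=-1`) and the function name are assumptions chosen for illustration. The sketch uses a linear gap penalty and is not the implementation discussed in the listed paper.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score of strings a and b.

    Uses the standard recurrence with a linear gap penalty and keeps only
    two rows of the dynamic-programming table.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Cell value: restart (0), extend diagonally, or open a gap.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another. That independence is what parallel implementations of the algorithm typically exploit.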
This paper proposes a CUDA kernel to accelerate the computation of a dynamic hedging model. This is a very useful tool in segregated fund modelling. Current approaches dwell on scenario reduction technique...
Computer technology, which continues to develop today, often has difficulties in meeting the needs of signal and image processing software. As a result of the developing technology, software needs larger memory and fa...
Cloud computing has made flexible resource provisioning possible from an almost unlimited pool. This has created the opportunity to broaden the range of data that can be analyzed, making it possible to support the so-called ...
Summary form only given. Personal computing is going mobile, and applications are changing to take advantage of new opportunities offered by permanent availability and connectivity. Mobile devices are a significant departure from traditional computing. On one hand, they are very personal, always on, and always connected, and they promise to become the hub of our digital lives. On the other hand, they are much more constrained in resources than desktops. Even though progress in their computing capabilities has been staggering, they continue to rely on battery power and come in appealing form factors that are a nightmare for thermal dissipation. In this talk I will present the challenges facing programmers of mobile devices, driven by architectural and packaging constraints as well as by changes in application domains. I will give examples of how we used concurrency to improve performance and power efficiency in a number of projects at Qualcomm Research, including the Zoomm parallel browser.
Cryptanalysis of lattice-based cryptography is an important field in cryptography since lattice problems are among the most robust assumptions and have been used to construct a variety of cryptographic primitives. The security estimation model for concrete parameters is one of the most important topics in lattice-based cryptography. In this research, we focus on the Gauss Sieve algorithm proposed by Micciancio and Voulgaris, a heuristic lattice sieving algorithm for the central lattice problem, the shortest vector problem (SVP). We propose a technique for lifting computations in prime-cyclotomic ideals into computations in cyclic ideals. Lifting makes rotations easier to compute and reduces the complexity of inner products from O(n^3) to O(n^2). We implemented the Gauss Sieve on multi-GPU systems using two layers of parallelism in our framework, and achieved up to a 55-fold speedup over previous results in dimension 96. We were able to solve SVP on ideal lattices in dimensions up to 130, which is the highest-dimension SVP instance solved by a sieve algorithm so far. As a result, we are able to provide a better estimate of the complexity of solving the central lattice problem.
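As a rough illustration of why rotations are cheap in a cyclic ideal, the sketch below treats a lattice vector as the coefficient list of a polynomial in Z[x]/(x^n - 1), where multiplying by x^k is a cyclic shift, and applies the textbook pairwise Gauss-reduction test against every rotation of a second vector. This is a sketch under stated assumptions: the function names are hypothetical, and it implements neither the authors' lifting technique from prime-cyclotomic ideals nor their multi-GPU sieve.

```python
def rotate(v, k):
    """Coefficients of x^k * v in Z[x]/(x^n - 1): a cyclic shift by k."""
    k %= len(v)
    return list(v[-k:]) + list(v[:-k]) if k else list(v)


def dot(u, v):
    """Integer inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def reduce_with_rotations(v, w):
    """Shorten v by subtracting integer multiples of the rotations of w.

    A rotation r is used only when 2*|<v, r>| > <r, r>, which guarantees
    that the squared norm of v strictly decreases, so the loop terminates.
    """
    v = list(v)
    changed = True
    while changed:
        changed = False
        for k in range(len(w)):
            r = rotate(w, k)
            vr, rr = dot(v, r), dot(r, r)
            if 2 * abs(vr) > rr:                 # Gauss-reduction condition
                q = (2 * vr + rr) // (2 * rr)    # nearest integer to vr / rr
                v = [a - q * b for a, b in zip(v, r)]
                changed = True
    return v


if __name__ == "__main__":
    print(rotate([1, 2, 3], 1))
    print(reduce_with_rotations([5, 3, 1], [2, 0, 0]))
```

The quotient is computed in integer arithmetic rather than with floating-point rounding so that the sketch stays exact for large coefficients.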