Task-parallel programs often enjoy deadlock freedom under certain restrictions, such as the use of structured join operations, as in Cilk and X10, or the use of asynchronous task futures together with deadlock-avoidin...
Various programming methods are considered. Particular attention is paid to parallel programming, quantum computers and biocomputers. This attention is due to the fact that in recent years, high-performance computing ...
We present dynamic control replication, a run-time program analysis that enables scalable execution of implicitly parallel programs on large machines through a distributed and efficient dynamic dependence analysis. Dy...
The study of the parallel structure of algorithms is becoming increasingly important for any specialist dealing with high-performance computing. Theoretical information on this topic is included in various training cour...
Luhn’s algorithm is the first line of defense on various e-commerce sites and is used to validate credit card numbers. As credit card usage increases, the validation process also needs to be faster. This f...
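For orientation, the checksum itself can be sketched in a few lines. This is a plain sequential baseline, not the faster method the abstract goes on to describe; the function name `luhn_is_valid` is a hypothetical choice made for this illustration.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Non-digit characters (spaces, dashes) are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                # Same as adding the two decimal digits of the product.
                digit -= 9
        total += digit

    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_is_valid("79927398713"))
    print(luhn_is_valid("79927398710"))
```

Each digit's contribution depends only on that digit and its position, so the loop can in principle be recast as an independent per-digit step followed by a sum, which is the kind of structure that lends itself to parallel execution.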
Efforts to support high performance computing (HPC) applications' requirements in the context of cloud computing have motivated us to design HPC Shelf, a cloud computing services platform to build and deploy large...
We are witnessing several factors in computing that offer as many opportunities as challenges. We are witnessing the end of Moore's law, almost two decades after the death of Dennard's scaling. Exascale compu...
Parallel programming is one of the most effective approaches to handling problems with high time complexity: it reduces computation time by getting the most out of the capacity of the processors and shared-memory o...
Automatic synthesis of efficient scientific parallel programs for supercomputers is in general a complex problem of system parallel programming. Therefore various specialized synthesis algorithms and heuristics are of...
In earlier work, we defined a domain-specific language (DSL) with the aim of providing an easy-to-use approach for programming multi-core and multi-GPU clusters. The DSL incorporates the idea of algorithmic skeletons, which are well-known patterns for parallel programming, such as map and reduce. Based on the chosen skeleton, a user-defined function can be applied to a data structure in parallel, with the main advantage that the user does not have to worry about implementation details. So far, we have only implemented a generator for multi-core clusters; in this paper, we present and evaluate two prototype generators for multi-GPU clusters, based on OpenACC and CUDA. We have evaluated the approach with four benchmark applications. The results show that the generation approach leads to execution times that are on par with an alternative library implementation.
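The skeleton idea can be illustrated with a minimal sketch. It is written in Python with a standard-library thread pool and has nothing to do with the authors' DSL or their OpenACC and CUDA generators; the names `map_skeleton` and `reduce_skeleton` and the `workers` parameter are hypothetical. The point is only that the caller supplies a function and the skeleton hides how the work is split up.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add


def map_skeleton(user_fn, data, workers=4):
    """Apply user_fn to every element; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(user_fn, data))


def reduce_skeleton(combine, data, identity, workers=4):
    """Fold data with `combine`, one chunk per worker, then merge partials.

    Assumes `combine` is associative and `identity` is its neutral
    element; otherwise the chunked result may differ from a plain fold.
    """
    items = list(data)
    if not items:
        return identity
    chunk_size = max(1, -(-len(items) // workers))  # ceiling division
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: reduce(combine, c, identity), chunks))
    return reduce(combine, partials, identity)


if __name__ == "__main__":
    squares = map_skeleton(lambda x: x * x, range(10))
    print(reduce_skeleton(add, squares, 0))
```

In CPython, threads will not speed up CPU-bound functions like these, so the sketch shows the programming model rather than the performance; a generated OpenACC or CUDA backend would replace the thread pool with device kernels.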