Search results - Inner Mongolia University Library

Author: Voss, Caleb A. Georgia Institute of Technology

Degree level: Doctoral

Task-parallel programming languages offer a variety of high-level mechanisms for synchronization that trade off between flexibility and deadlock safety. Some approaches are deadlock-free by construction but support limited synchronization patterns, while other approaches are trivial to deadlock. In high-level task-parallel programming, it is imperative that language features offer both flexibility to avoid over-synchronization and also sufficient protection against logical deadlocks. Lack of flexibility leads to code that does not take full advantage of the available parallelism in the computation. Lack of deadlock protection leads to error-prone code in which a single bug can involve arbitrarily many tasks, making it difficult to reason about. We make advances in both flexibility and deadlock protection for existing synchronization mechanisms by carefully designing dynamically verifiable usage policies and language constructs. We first define a deadlock-freedom policy for futures. The rules of the policy follow naturally from the semantics of asynchronous task closures and correspond to a preorder traversal of the task tree. The policy admits an additional class of deadlock-free programs compared to past work. Each blocking wait for a future can be verified by a stateless, lock-free algorithm, resulting in low time and memory overheads at runtime. In order to define and identify deadlocks for promises, we introduce a mechanism for promises to be owned by tasks. Simple annotations make it possible to ensure that each promise is eventually fulfilled by the responsible task or handed off to another task. Ownership semantics allows us to formally define two kinds of promise bugs: omitted sets and deadlock cycles. We present novel detection algorithms for both bugs. We further introduce an approximate deadlock-freedom policy for promises that, instead of precisely detecting cycles, raises an alarm when synchronization dependences occurring between trees of tasks are a

Keywords: parallel programming; Synchronization; Deadlock detection; Language design; Runtime verification

ParlayLib - A Toolkit for Parallel Algorithms on Shared-Memory Multicore Machines

32nd ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2020

Authors: Blelloch, Guy E.; Anderson, Daniel; Dhulipala, Laxman. Carnegie Mellon University, Pittsburgh, PA, United States

ISBN: (print) 9781450369350

ParlayLib is a C++ library for developing efficient parallel algorithms and software on shared-memory multicore machines. It provides additional tools and primitives that go beyond what is available in the C++ standard library, and simplifies the task of programming provably efficient and scalable parallel algorithms. It consists of a sequence data type (analogous to std::vector), many parallel routines and algorithms, a work-stealing scheduler to support nested parallelism, and a scalable memory allocator. It has been developed over a period of seven years and used in a variety of software including the PBBS benchmark suite, the Ligra, Julienne, and Aspen graph processing frameworks, the Graph Based Benchmark Suite, the PAM library for parallel balanced binary search trees, and an implementation of the TPC-H benchmark suite. © 2020 Owner/Author.

Keywords: parallel programming

Development of parallel software code for calculating the problem of radiation magnetic gas dynamics and the study of plasma dynamics in the channel of plasma accelerator

21st Conference on Scientific Services and Internet, SSI 2019

Authors: Bakhtin, Vladimir; Zakharov, Dmitry; Kozlov, Andrey; Konovalov, Venyamin. Keldysh Institute of Applied Mathematics, Miusskaya sq. 4, Moscow 125047, Russia; Lomonosov Moscow State University, GSP-1, Leninskie Gory, Moscow 11999, Russia; Bauman Moscow State Technical University, ul. Baumanskaya 2-ya 5/1, Moscow 105005, Russia

DVM-system is designed for the development of parallel programs of scientific and technical calculations in C-DVMH and Fortran-DVMH languages. These languages use a single parallel programming model (DVMH model) and are extensions of the standard C and Fortran languages with parallelism specifications, written in the form of directives to the compiler. The DVMH model makes it possible to create efficient parallel programs for heterogeneous computing clusters, in the nodes of which accelerators (graphic processors or Intel Xeon Phi coprocessors) can be used as computing devices along with universal multi-core processors. The article describes the experience of successful use of DVM-system for the development of parallel software code for calculating the problem of radiation magnetic hydrodynamics and the study of plasma dynamics in the channel of plasma accelerator. Copyright © 2020 for this paper by its authors.

Keywords: parallel programming

Eventify: Event-Based Task Parallelism for Strong Scaling

7th Annual Platform for Advanced Scientific Computing Conference, PASC 2020

Authors: Haensel, David; Morgenstern, Laura; Beckmann, Andreas; Kabadshow, Ivo; Dachsel, Holger. Jülich Supercomputing Centre, Jülich, Germany; Chemnitz University of Technology, Chemnitz, Germany

ISBN: (print) 9781450379939

Today's processors become fatter, not faster. However, the exploitation of these massively parallel compute resources remains a challenge for many traditional HPC applications regarding scalability, portability and programmability. To tackle this challenge, several parallel programming approaches such as loop parallelism and task parallelism are researched in the form of languages, libraries and frameworks. Task parallelism as provided by OpenMP, HPX, StarPU, Charm++ and Kokkos is the most promising approach to overcome the challenges of ever-increasing parallelism. The aforementioned parallel programming technologies enable scalability for a broad range of algorithms with coarse-grained tasks, e.g. in linear algebra and classical N-body simulation. However, they do not fully address the performance bottlenecks of algorithms with fine-grained tasks and the resultant large task graphs. Additionally, we found the description of large task graphs to be cumbersome with the common approach of providing in-, out- and inout-dependencies. We introduce event-based task parallelism to solve the performance and programmability issues for algorithms that exhibit fine-grained task parallelism and contain repetitive task patterns. With user-defined event lists, the approach provides a more convenient and compact way to describe large task graphs. Furthermore, we show how these event lists are processed by a task engine that reuses user-defined, algorithmic data structures. As a use case, we describe the implementation of a fast multipole method for molecular dynamics with event-based task parallelism. The performance analysis reveals that the event-based implementation is 52 % faster than a classical loop-parallel implementation with OpenMP. © 2020 ACM.

Keywords: parallel programming

Time Improvement of Smith-Waterman Algorithm Using OpenMP and SIMD

2nd International Conference on Futuristic Trends in Networks and Computing Technologies, FTNCT 2019

Authors: Malik, Mehak; Malhotra, Srijan; Prasanth, Narayanan. School of CSE, Vellore Institute of Technology, Vellore, Tamil Nadu, India

ISBN: (print) 9789811544507

Sequence alignment is a problem in bioinformatics that involves arranging sequences of proteins, RNA or DNA so that similar regions between two or more sequences may be determined. The Smith-Waterman algorithm is a key algorithm for aligning sequences. This paper uses the OpenMP application programming interface along with Single-Instruction Multiple-Data (SIMD) instructions. Advanced Vector Extensions 2 (AVX2) is used to implement the SIMD paradigm. The approach utilizes both fine-level and coarse-level parallelism to improve resource utilization without requiring support from multiple nodes in a distributed memory system. The algorithm shows a multifold decrease in execution time in comparison to an implementation that is sequentially executed. © 2020, Springer Nature Singapore Pte Ltd.

Keywords: parallel programming
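
For readers unfamiliar with the algorithm named in this record, the following is a minimal sequential sketch of Smith-Waterman local-alignment scoring with a linear gap penalty. It is not the paper's OpenMP/AVX2 implementation; the function name and the scoring parameters (match=2, mismatch=-1, gap=-1) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score of sequences a and b.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of any local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap inserted in b
            left = H[i][j - 1] + gap    # gap inserted in a
            # Clamping at 0 is what makes the alignment local rather than global.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

Each cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another. That independence is the property that parallel and vectorized implementations of this recurrence generally rely on.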

A CUDA approach for scenario reduction in hedging models

5th International Conference on Advanced Engineering Theory and Applications, AETA 2018

Authors: Davendra, Donald; Chueh, Chin-mei; Hamel, Emmanuel. Department of Computer Science, Central Washington University, 400 E University Way, Ellensburg, WA 98926, United States; Department of Mathematics, Central Washington University, 400 E University Way, Ellensburg, WA 98926, United States; Autorité des marchés financiers, Québec, QC, Canada

ISBN: (print) 9783030149062

A CUDA kernel is proposed in this paper for acceleration of the computation of a dynamic hedging model. This is a very useful tool in segregated fund modelling. Current approaches rely on scenario reduction techniques in order to extract meaningful information from a large data set. Parallel programming allows these models to be effectively evaluated within a critical time frame. The GPU execution times show significant improvement over CPU approaches. © Springer Nature Switzerland AG 2020.

Keywords: parallel programming

Approach for Accelerating Image Enhancement Processes: Optimized OpenCL Architecture

4th International Symposium on Multidisciplinary Studies and Innovative Technologies, ISMSIT 2020

Authors: Aslan, Menduh Furkan; Goksu, Tuna. Antalya Bilim Üniversitesi, Elektrik-Elektronik Mühendisliği Bölümü, Antalya, Turkey; Isparta Uygulamalı Bilimler Üniversitesi, Elektrik-Elektronik Mühendisliği Bölümü, Antalya, Turkey

ISBN: (print) 9781728190907

Computer technology, which continues to develop today, often has difficulty meeting the needs of signal and image processing software. As technology develops, software needs larger memory and faster processors. Parallel programming methods have been developed to address the speed limitations of processors. In this study, OpenCL-based image enhancement applications that can run in parallel on the graphics processing unit have been implemented. The OpenCL architecture has been optimized to maximize the amount of acceleration. Appropriate image enhancement applications have been tested to confirm that the designed algorithm and architecture are successful in both simple and complex operations. To quantify the speed gain, the same applications were developed with a serial programming technique and the results were compared with those of the applications developed in parallel. The comparison results support the conclusion that parallel programming is better in terms of performance. With parallel programming on the hardware used, calculation times were observed to be reduced by factors of 1.58 to 561. © 2020 IEEE.

Keywords: Heterogeneous programming; OpenCL; parallel programming; Performance Analysis

Map-reduce process algebra: a formalism to describe directed acyclic graph task-based jobs in parallel environments

25th International Conference on Analytical and Stochastic Modelling Techniques and Applications, ASMTA 2019

Authors: Barbierato, Enrico; Gribaudo, Marco; Iacono, Mauro. Università Cattolica del Sacro Cuore, via dei Musei 41, Brescia 25121, Italy; Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, via Ponzio 345, Milan 20133, Italy; Dipartimento di Matematica e Fisica, Università degli Studi della Campania "L. Vanvitelli", viale Lincoln 5, Caserta 81100, Italy

ISBN: (digital) 9783030628857

ISBN: (print) 9783030628840

Cloud Computing has made possible flexible resource provisioning from an almost unlimited pool. This has created the opportunity to broaden the horizon of data that can be analyzed, making it possible to support so-called Big Data Analytics applications. New programming paradigms, such as NoSQL queries and Map-Reduce applications, have emerged within frameworks such as Microsoft Azure, Hadoop and Apache Spark. In many cases, applications execute jobs that are split into stages, each one composed of tasks that can be run in parallel on many computational nodes. Directed acyclic graphs describe the precedence between stages, defining the execution rules and controlling the degree of parallelism. This work presents a Process Algebra dialect aimed at describing both jobs and execution environments. The proposed framework is then used to model and study standard parallel programming benchmarks, to demonstrate its applicability. © 2020, Springer Nature Switzerland AG.

Keywords: parallel programming
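
The abstract's remark that a directed acyclic graph fixes the precedence between stages and bounds the degree of parallelism can be illustrated with a small sketch. This is not the paper's process algebra; it is a plain level-by-level topological grouping, and the function name and stage names (map1, map2, shuffle, reduce) are hypothetical.

```python
def parallel_levels(deps):
    """Group DAG stages into levels; stages within one level may run in parallel.

    deps maps each stage name to the stages that must finish before it starts.
    Stage names used below are hypothetical examples.
    """
    stages = set(deps) | {d for ds in deps.values() for d in ds}
    # Copy into mutable sets of still-unsatisfied predecessors.
    remaining = {s: set(deps.get(s, ())) for s in stages}
    levels = []
    while remaining:
        # A stage is ready once all of its predecessors have completed.
        ready = sorted(s for s, pending in remaining.items() if not pending)
        if not ready:
            raise ValueError("dependency cycle: the graph is not a DAG")
        levels.append(ready)
        for s in ready:
            del remaining[s]
        for pending in remaining.values():
            pending.difference_update(ready)
    return levels


if __name__ == "__main__":
    job = {"map1": [], "map2": [], "shuffle": ["map1", "map2"], "reduce": ["shuffle"]}
    for depth, level in enumerate(parallel_levels(job)):
        print(depth, level)
```

The width of each level is an upper bound on how many stages can execute at the same time at that point in the job, which is one simple reading of "degree of parallelism" here.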

Keynote talk: Parallel programming for mobile computing

International Conference on Parallel Architecture and Compilation Techniques (PACT)

Author: Călin Caşcaval. Qualcomm Research Silicon Valley, USA

Summary form only given. Personal computing is going mobile and applications are changing to take advantage of new opportunities offered by permanent availability and connectivity. Mobile devices are a significant departure from traditional computing. On one hand, they are very personal, always on, always connected. They promise to become the hub for our digital lives. On the other hand, they are much more constrained in terms of resources than desktops. Even though progress in their computing capabilities has been staggering, they continue to rely on battery power and come in appealing packages that are a nightmare for thermal dissipation. In this talk I will present the challenges facing programmers for mobile devices driven by architectural and packaging constraints, as well as the changes in application domains. I will give examples of how we used concurrency to improve performance and power efficiency in a number of projects at Qualcomm Research, including the Zoomm parallel browser.

Keywords: parallel programming; Program processors; Mobile computing; Silicon; Mobile handsets; Abstracts; Mobile communication

Parallelization on Gauss Sieve Algorithm over Ideal Lattice

JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2021, Vol. 37, No. 5, pp. 1187-1209

Authors: Kuo, Po-Chun; Cheng, Chen-Mou; Li, Wen-Ding; Yang, Bo-Yin. Natl Taiwan Univ, Dept Elect Engn, Taipei 106, Taiwan; Kanazawa Univ, Grad Sch Nat Sci & Technol, Kanazawa, Ishikawa 9201192, Japan; Acad Sinica, Inst Informat Sci, Taipei 115, Taiwan

Cryptanalysis of lattice-based cryptography is an important field in cryptography since lattice problems are among the most robust assumptions and have been used to construct a variety of cryptographic primitives. The security estimation model for concrete parameters is one of the most important topics in lattice-based cryptography. In this research, we focus on the Gauss Sieve algorithm proposed by Micciancio and Voulgaris, a heuristic lattice sieving algorithm for the central lattice problem, the shortest vector problem (SVP). We propose a technique of lifting computations in prime-cyclotomic ideals into computations in cyclic ideals. Lifting makes rotations easier to compute and reduces the complexity of inner products from O(n^3) to O(n^2). We implemented the Gauss Sieve on multi-GPU systems using two layers of parallelism in our framework, and achieved up to 55 times the speed of previous results in dimension 96. We were able to solve SVP on ideal lattices in dimensions up to 130, which is the highest-dimension SVP instance solved by a sieve algorithm so far. As a result, we are able to provide a better estimate of the complexity of solving the central lattice problem.

Keywords: cryptography; parallel programming; lattice-based cryptography; sieving algorithm; Gauss sieve; GPU; shortest vector problem; ideal lattices
