Search Results - Inner Mongolia University Library

Application of Hybrid MPI+TBB Parallel Programming Model for Traveling Salesman Problem

IEEE/ACM Int'l Conference on Green Computing and Communications (GreenCom) & Int'l Conference on Cyber, Physical and Social Computing (CPSCom)

Authors: Jinke Zhu; Qing Li. Affiliation: School of Computer Engineering and Science, Shanghai University, Shanghai, China

A parallel algorithm for solving the traveling salesman problem (TSP) is presented in this paper. The main idea of the algorithm is to combine 2-opt local search optimization with a genetic algorithm. The algorithm is implemented with the hybrid MPI+TBB parallel programming model. Numerical results indicate that it is possible to arrive at high-quality solutions in reasonable time. As the problem size increases, the speedup of the parallel algorithm improves. Moreover, as the number of cores grows, the speedup of the parallel algorithm grows nearly linearly.

Keywords: Genetic algorithms; Message systems; Parallel programming; Traveling salesman problems; Parallel algorithms; Computational modeling; Educational institutions
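
The 2-opt local search mentioned in the abstract above can be illustrated with a minimal serial sketch. This is not the authors' MPI+TBB implementation and it omits the genetic algorithm entirely; the function names `tour_length` and `two_opt` and the sample points are hypothetical and chosen only for illustration.

```python
import math


def tour_length(tour, pts):
    """Total length of the closed tour that visits pts in the given order."""
    return sum(
        math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt(tour, pts):
    """Reverse tour segments for as long as a reversal shortens the tour."""
    best = list(tour)
    best_len = tour_length(best, pts)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Candidate tour: same cities, segment best[i..j] reversed.
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(cand, pts)
                if cand_len < best_len - 1e-12:
                    best, best_len = cand, cand_len
                    improved = True
    return best, best_len


# Illustrative data: corners of a unit square in a self-crossing order.
pts = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
best, best_len = two_opt([0, 1, 2, 3], pts)
```

In a hybrid scheme of the kind the abstract describes, a routine like this would typically be applied to individuals produced by the genetic operators; how the paper distributes that work across MPI processes and TBB threads is not stated in the abstract.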

Addressing Logical Deadlocks through Task-Parallel Language Design

Author: Voss, Caleb A. Affiliation: Georgia Institute of Technology

Degree level: Doctoral

Task-parallel programming languages offer a variety of high-level mechanisms for synchronization that trade off between flexibility and deadlock safety. Some approaches are deadlock-free by construction but support limited synchronization patterns, while other approaches are trivial to deadlock. In high-level task-parallel programming, it is imperative that language features offer both flexibility to avoid over-synchronization and also sufficient protection against logical deadlocks. Lack of flexibility leads to code that does not take full advantage of the available parallelism in the computation. Lack of deadlock protection leads to error-prone code in which a single bug can involve arbitrarily many tasks, making it difficult to reason about. We make advances in both flexibility and deadlock protection for existing synchronization mechanisms by carefully designing dynamically verifiable usage policies and language constructs. We first define a deadlock-freedom policy for futures. The rules of the policy follow naturally from the semantics of asynchronous task closures and correspond to a preorder traversal of the task tree. The policy admits an additional class of deadlock-free programs compared to past work. Each blocking wait for a future can be verified by a stateless, lock-free algorithm, resulting in low time and memory overheads at runtime. In order to define and identify deadlocks for promises, we introduce a mechanism for promises to be owned by tasks. Simple annotations make it possible to ensure that each promise is eventually fulfilled by the responsible task or handed off to another task. Ownership semantics allows us to formally define two kinds of promise bugs: omitted sets and deadlock cycles. We present novel detection algorithms for both bugs. We further introduce an approximate deadlock-freedom policy for promises that, instead of precisely detecting cycles, raises an alarm when synchronization dependences occurring between trees of tasks are a

Keywords: Parallel programming; Synchronization; Deadlock detection; Language design; Runtime verification

Development of parallel software code for calculating the problem of radiation magnetic gas dynamics and the study of plasma dynamics in the channel of plasma accelerator

21st Conference on Scientific Services and Internet, SSI 2019

Authors: Bakhtin, Vladimir; Zakharov, Dmitry; Kozlov, Andrey; Konovalov, Venyamin. Affiliations: Keldysh Institute of Applied Mathematics, Miusskaya sq. 4, Moscow 125047, Russia; Lomonosov Moscow State University, GSP-1, Leninskie Gory, Moscow 11999, Russia; Bauman Moscow State Technical University, ul. Baumanskaya 2-ya 5/1, Moscow 105005, Russia

DVM-system is designed for the development of parallel programs of scientific and technical calculations in C-DVMH and Fortran-DVMH languages. These languages use a single parallel programming model (DVMH model) and are extensions of the standard C and Fortran languages with parallelism specifications, written in the form of directives to the compiler. The DVMH model makes it possible to create efficient parallel programs for heterogeneous computing clusters, in the nodes of which accelerators (graphic processors or Intel Xeon Phi coprocessors) can be used as computing devices along with universal multi-core processors. The article describes the experience of successful use of DVM-system for the development of parallel software code for calculating the problem of radiation magnetic hydrodynamics and the study of plasma dynamics in the channel of plasma accelerator. Copyright © 2020 for this paper by its authors.

Keywords: Parallel programming

ParlayLib - A Toolkit for Parallel Algorithms on Shared-Memory Multicore Machines

32nd ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2020

Authors: Blelloch, Guy E.; Anderson, Daniel; Dhulipala, Laxman. Affiliation: Carnegie Mellon University, Pittsburgh, PA, United States

ISBN: (print) 9781450369350

ParlayLib is a C++ library for developing efficient parallel algorithms and software on shared-memory multicore machines. It provides additional tools and primitives that go beyond what is available in the C++ standard library, and simplifies the task of programming provably efficient and scalable parallel algorithms. It consists of a sequence data type (analogous to std::vector), many parallel routines and algorithms, a work-stealing scheduler to support nested parallelism, and a scalable memory allocator. It has been developed over a period of seven years and used in a variety of software including the PBBS benchmark suite, the Ligra, Julienne, and Aspen graph processing frameworks, the Graph Based Benchmark Suite, the PAM library for parallel balanced binary search trees, and an implementation of the TPC-H benchmark suite. © 2020 Owner/Author.

Keywords: Parallel programming

An Execution Time Comparison of Parallel Computing Algorithms for Solving Heat Equation

3rd International Conference on Smart Applications and Data Analysis for Smart Cyber Physical Systems

Authors: Belhaous, Safa; Hidila, Zineb; Baroud, Sohaib; Chokri, Soumia; Mestari, Mohammed. Affiliation: Hassan II Univ, ENSET, SSDIA Lab, Mohammadia, Morocco

ISBN: (print) 9783030451820; 9783030451837

Parallel computing contributes significantly to most disciplines by helping to solve scientific problems such as partial differential equations (PDEs), load balancing, and deep learning. The primary characteristic of parallelism is its ability to improve performance on many different sets of computers. Consequently, many researchers continue to work on efficient parallel solutions for various problems such as the heat equation. The heat equation describes a natural phenomenon and is used in many fields such as mathematics and physics. Usually, its associated model is defined by a set of partial differential equations (PDEs). This paper is primarily aimed at showing two parallel programs for solving the heat equation, which has been discretized using the finite difference method (FDM). These programs have been implemented on different parallel platforms, namely SkelGIS and Compute Unified Device Architecture (CUDA).

Keywords: Parallel computing; Parallel programming; Heat equation; CUDA; SkelGIS library; GPU; Finite difference method
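
As a rough illustration of the finite difference discretization named in the abstract above, the following serial sketch applies the explicit forward-time, centered-space (FTCS) scheme to the one-dimensional heat equation u_t = alpha * u_xx. The abstract does not say which scheme, dimension, or boundary conditions the paper uses, so all of these are assumptions, as are the names `heat_step` and `solve_heat` and the sample grid.

```python
def heat_step(u, r):
    """One explicit FTCS step; the two endpoints are held fixed (Dirichlet)."""
    new = u[:]
    for i in range(1, len(u) - 1):
        # Each interior point depends only on values from the previous step.
        new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
    return new


def solve_heat(u0, alpha, dx, dt, steps):
    """Advance the initial profile u0 by the given number of time steps."""
    r = alpha * dt / (dx * dx)
    if r > 0.5:
        raise ValueError("explicit FTCS scheme is unstable for r > 0.5")
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, r)
    return u


# Illustrative data: a unit heat spike in the middle of an 11-point rod.
u0 = [0.0] * 11
u0[5] = 1.0
u = solve_heat(u0, alpha=1.0, dx=0.1, dt=0.004, steps=50)
```

Because every interior update reads only the previous time level, the inner loop has no dependencies between grid points, which is what makes this kind of stencil a natural fit for data-parallel platforms such as CUDA.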

Eventify: Event-Based Task Parallelism for Strong Scaling

7th Annual Platform for Advanced Scientific Computing Conference, PASC 2020

Authors: Haensel, David; Morgenstern, Laura; Beckmann, Andreas; Kabadshow, Ivo; Dachsel, Holger. Affiliations: Jülich Supercomputing Centre, Jülich, Germany; Chemnitz University of Technology, Chemnitz, Germany

ISBN: (print) 9781450379939

Today's processors become fatter, not faster. However, the exploitation of these massively parallel compute resources remains a challenge for many traditional HPC applications regarding scalability, portability and programmability. To tackle this challenge, several parallel programming approaches such as loop parallelism and task parallelism are researched in the form of languages, libraries and frameworks. Task parallelism as provided by OpenMP, HPX, StarPU, Charm++ and Kokkos is the most promising approach to overcome the challenges of ever-increasing parallelism. The aforementioned parallel programming technologies enable scalability for a broad range of algorithms with coarse-grained tasks, e.g. in linear algebra and classical N-body simulation. However, they do not fully address the performance bottlenecks of algorithms with fine-grained tasks and the resultant large task graphs. Additionally, we found the description of large task graphs to be cumbersome with the common approach of providing in-, out- and inout-dependencies. We introduce event-based task parallelism to solve the performance and programmability issues for algorithms that exhibit fine-grained task parallelism and contain repetitive task patterns. With user-defined event lists, the approach provides a more convenient and compact way to describe large task graphs. Furthermore, we show how these event lists are processed by a task engine that reuses user-defined, algorithmic data structures. As a use case, we describe the implementation of a fast multipole method for molecular dynamics with event-based task parallelism. The performance analysis reveals that the event-based implementation is 52% faster than a classical loop-parallel implementation with OpenMP. © 2020 ACM.

Keywords: Parallel programming

Time Improvement of Smith-Waterman Algorithm Using OpenMP and SIMD

2nd International Conference on Futuristic Trends in Networks and Computing Technologies, FTNCT 2019

Authors: Malik, Mehak; Malhotra, Srijan; Prasanth, Narayanan. Affiliation: School of CSE, Vellore Institute of Technology, Vellore, Tamil Nadu, India

ISBN: (print) 9789811544507

Sequence alignment is a problem in bioinformatics that involves arranging sequences of proteins, RNA or DNA so that similar regions between two or more sequences may be determined. The Smith-Waterman algorithm is a key algorithm for aligning sequences. This paper uses the OpenMP application programming interface along with Single-Instruction Multiple-Data (SIMD) instructions. Advanced Vector Extensions 2 (AVX2) is used to implement the SIMD paradigm. The approach utilizes both fine-level and coarse-level parallelism to improve resource utilization without requiring support from multiple nodes in a distributed memory system. The algorithm shows a multifold decrease in execution time in comparison to an implementation that is sequentially executed. © 2020, Springer Nature Singapore Pte Ltd.

Keywords: Parallel programming
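
For readers unfamiliar with the Smith-Waterman recurrence referred to in the abstract above, here is a minimal sequential sketch that computes only the best local alignment score. It is not the paper's OpenMP/AVX2 code; the linear gap penalty, the scoring values, and the function name `smith_waterman_score` are assumptions made for illustration.

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Best local alignment score of a and b using a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Illustrative sequences; scoring parameters are the assumed defaults.
score = smith_waterman_score("GTTAC", "GTTGAC")
```

Each cell depends on its left, upper, and upper-left neighbors, so cells on the same anti-diagonal are independent of one another. That independence is one common basis for vectorized implementations, although the abstract does not specify which layout the paper uses.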

A CUDA approach for scenario reduction in hedging models

5th International Conference on Advanced Engineering Theory and Applications, AETA 2018

Authors: Davendra, Donald; Chueh, Chin-mei; Hamel, Emmanuel. Affiliations: Department of Computer Science, Central Washington University, 400 E University Way, Ellensburg, WA 98926, United States; Department of Mathematics, Central Washington University, 400 E University Way, Ellensburg, WA 98926, United States; Autorité des marchés financiers, Québec, QC, Canada

ISBN: (print) 9783030149062

A CUDA kernel is proposed in this paper to accelerate the computation of a dynamic hedging model. This is a very useful tool in segregated fund modelling. Current approaches rely on scenario reduction techniques in order to extract meaningful information from a large data set. Parallel programming allows these models to be effectively evaluated within a critical time frame. The GPU execution times show significant improvement over CPU approaches. © Springer Nature Switzerland AG 2020.

Keywords: Parallel programming

Solving Twisty Puzzles Using Parallel Q-learning

Engineering Letters, 2021, vol. 29, no. 4, p. 1535

Authors: Hukmani, Kavish; Kolekar, Sucheta; Vobugari, Sreekumar. Affiliations: Manipal Acad Higher Educ, Manipal Inst Technol, Dept Informat & Commun Technol, Manipal 576104, Karnataka, India; Infosys Ltd, Hyderabad 500088, India

There has been a recent trend of teaching agents to solve puzzles and play games using Deep Reinforcement Learning (DRL), which was brought about by the success of AlphaGo. While this method has given some truly groundbreaking results, it is very computationally intensive. This paper evaluates the feasibility of solving Combinatorial Optimization Problems such as Twisty Puzzles using Parallel Q-Learning (PQL). We propose a method using Constant Share-Reinforcement Learning (CS-RL) as a more resource-optimized approach and measure the impact of sub-goals built using human knowledge. We attempt to solve three puzzles, the 2x2x2 Pocket Rubik's Cube, the Skewb and the Pyraminx, with and without sub-goals based on popular solving methods used by humans, and compare their results. Our agents are able to solve these puzzles with a 100% success rate after just a few hours of training, much better than previous DRL-based agents that require large computational time. Further, the proposed approach is compared with a Deep Learning based solution for the 2x2x2 Rubik's Cube, and a higher success rate is observed.

Keywords: Parallel programming; Q-learning; Reinforcement Learning; Twisty Puzzles; Rubik's Cube; Agent-based programming

Approach for Accelerating Image Enhancement Processes: Optimized OpenCL Architecture

4th International Symposium on Multidisciplinary Studies and Innovative Technologies, ISMSIT 2020

Authors: Aslan, Menduh Furkan; Goksu, Tuna. Affiliations: Antalya Bilim Üniversitesi, Elektrik-Elektronik Mühendisliği Bölümü, Antalya, Turkey; Isparta Uygulamalı Bilimler Üniversitesi, Elektrik-Elektronik Mühendisliği Bölümü, Antalya, Turkey

ISBN: (print) 9781728190907

Computer technology, which continues to develop today, often has difficulty meeting the needs of signal and image processing software. As technology develops, software needs larger memory and faster processors. Parallel programming methods have been developed to solve the speed problems of processors. In this study, OpenCL-based image enhancement applications that can run in parallel on the graphics processing unit have been implemented. The OpenCL architecture has been optimized to maximize the amount of acceleration. Appropriate image enhancement applications have been tested to observe that the designed algorithm and architecture are successful in simple or complex operations. To quantify the speed gain, the same applications were developed with a serial programming technique and the results were compared with those of the applications developed in parallel. The comparison results support the conclusion that parallel programming is better in terms of performance. With parallel programming on the hardware used, calculation times were reduced by a factor of 1.58 to 561. © 2020 IEEE.

Keywords: Heterogeneous programming; OpenCL; Parallel programming; Performance Analysis
