Search Results - Inner Mongolia University Library

arXiv, 2022

Authors: Wang, Anjia; Yi, Xinyao; Yan, Yonghong (University of North Carolina at Charlotte, Charlotte, NC, United States)

The complexity of heterogeneous computing architectures, as well as the demand for productive and portable parallel application development, have driven the evolution of parallel programming models to become more comprehensive and complex than before. Enhancing the conventional compilation technologies and software infrastructure to be parallelism-aware has become one of the main goals of recent compiler development. In this paper, we propose the design of unified parallel intermediate representation (UPIR) for multiple parallel programming models and for enabling unified compiler transformation for the models. UPIR specifies three commonly used parallelism patterns (SPMD, data and task parallelism), data attributes and explicit data movement and memory management, and synchronization operations used in parallel programming. We demonstrate UPIR via a prototype implementation in the ROSE compiler for unifying IR for both OpenMP and OpenACC and in both C/C++ and Fortran, for unifying the transformation that lowers both OpenMP and OpenACC code to LLVM runtime, and for exporting UPIR to LLVM MLIR dialect. Copyright © 2022, The Authors. All rights reserved.

Keywords: parallel programming

Exact Solution of Resource Leveling Problem by Exhaustive Enumeration with parallel programming

TEKNIK DERGI, 2021, Vol. 32, No. 3, pp. 10767-10805

Authors: Erzurum, Tugba; Bettemir, Onder Halis (Inonu Univ, Insaat Muhendisligi Bolumu, Malatya, Turkey)

The Resource Leveling Problem (RLP) is solved by heuristic, meta-heuristic, and mathematical methods. However, these methods cannot guarantee the exact solution for large problems. In this study, the number of feasible schedules that can be obtained by delaying the non-critical activities without violating the precedence relationships or extending the project completion time is computed. All of the feasible schedules, which define the search domain, are enumerated, and the guaranteed optimum solution for the RLP is obtained by a method different from the existing ones. An exponential equation relating the search domain to the number of activities on a serial path is derived, and the insolvability of large RLPs in a reasonable time on one central processing unit is verified. Parallel programming is used to partition the problem into equal sizes so that each partition contains the same number of enumerations. In this study, four RLPs, the largest of which has 36 activities, are solved by exhaustive enumeration within a reasonable solution time, demonstrating that the proposed method is applicable. Exact solutions of larger problems can also be obtained by the proposed method if the problem is partitioned into smaller sizes.

Keywords: Resource leveling problem; optimization; critical path method; parallel programming
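
The equal partitioning of an enumerated search domain described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy project data, the function names, the sum-of-squares objective, and the assumption that each activity can be delayed independently within its float are all hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from math import prod

# Hypothetical toy project: (early_start, duration, total_float, resource_demand).
# Assumption: delays are independent, so any delay within the float is feasible.
ACTIVITIES = [
    (0, 4, 0, 2),  # critical activity, cannot be delayed
    (0, 2, 2, 3),
    (0, 1, 3, 1),
]
HORIZON = 4  # project completion time, kept fixed

RADICES = [total_float + 1 for (_, _, total_float, _) in ACTIVITIES]
DOMAIN_SIZE = prod(RADICES)  # number of schedules in the search domain


def decode(index):
    """Map a schedule index to one delay per activity (mixed-radix digits)."""
    delays = []
    for radix in RADICES:
        index, digit = divmod(index, radix)
        delays.append(digit)
    return delays


def cost(delays):
    """Sum of squared daily resource usage (an assumed leveling objective)."""
    usage = [0] * HORIZON
    for (start, duration, _, demand), delay in zip(ACTIVITIES, delays):
        for day in range(start + delay, start + delay + duration):
            usage[day] += demand
    return sum(u * u for u in usage)


def best_in_range(bounds):
    """Exhaustively evaluate one partition of the index range."""
    lo, hi = bounds
    return min((cost(decode(i)), decode(i)) for i in range(lo, hi))


def solve(workers=4):
    """Split the domain into near-equal index ranges and keep the overall minimum."""
    step = -(-DOMAIN_SIZE // workers)  # ceiling division
    parts = [(lo, min(lo + step, DOMAIN_SIZE)) for lo in range(0, DOMAIN_SIZE, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return min(pool.map(best_in_range, parts))
```

The search domain here has 1 x 3 x 4 = 12 schedules, and the product form shows why it grows exponentially with the number of delayable activities. A thread pool only demonstrates the partitioning; in CPython a process pool or another runtime would be needed for an actual speedup on CPU-bound enumeration.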

Towards Generic Parallel Programming in Computer Science Education with Kokkos

Workshop on Education for High Performance Computing (EduHPC)

Authors: Ciesko, Jan; Poliakoff, David; Hollman, Daisy S.; Trott, Christian C.; Lebrun-Grandie, Damien (Sandia Natl Labs, Comp Sci Res Inst, POB 5800, Albuquerque, NM 87185, USA; Sandia Natl Labs, Comp Sci Res Inst, Livermore, CA, USA; Oak Ridge Natl Lab, Computat Sci & Engn, Oak Ridge, TN, USA)

ISBN: (print) 9780738143057

Parallel patterns, views, and spaces are promising abstractions to capture the programmer's intent as well as the contextual information that can be used by an underlying runtime to efficiently map software to parallel hardware. These abstractions can be valuable in cases where an algorithm must accommodate requirements of code and performance portability across hardware architectures and vendor programming models. Kokkos is a parallel programming model for host and accelerator architectures that relies on these abstractions and targets these requirements. It consists of a pure C++ interface, a specification, and a programming library. The programming library exposes patterns and types and maps them to an underlying abstract machine model. The abstract machine model offers a generic view of parallel hardware. While Kokkos is gaining popularity in large-scale HPC applications at some DOE laboratories, we believe that the implemented concepts are of interest to a broader audience including academia as they may contribute to a generic, vendor- and architecture-independent education of parallel programming. In this work, we give an insight into the design considerations of this programming model and list important abstractions. Further, we document best practices obtained from giving virtual classes on Kokkos and give pointers to resources that the reader may consider valuable for a lecture on generic parallel programming for students with preexisting knowledge on this matter.

Keywords: parallel programming; Kokkos; C++

Embedded cluster platform for a remote parallel programming lab

IEEE Global Engineering Education Conference (IEEE EDUCON)

Authors: Velasquez, Ricardo A.; Isaza, Sebastian; Montoya, Emanuel; Garcia, Luis German; Gomez, Jonathan (Univ Antioquia, Fac Engn, Medellin, Colombia)

ISBN: (print) 9781728109305

Single-board computers have recently grown to offer developers a wide range of options where the common denominators are low power and low cost. In this paper, we present an embedded cluster platform for a remote parallel programming lab to be used in an online course. A remote lab server handles all requests coming from the front-end running on an online learning platform and controls the execution of the parallel programming assignments submitted by students. The embedded cluster where the jobs run is made of single-board computers connected through a gigabit network to each other and to the lab server. In our first working prototype, we have tested six different state-of-the-art single-board computers, evaluating their processing latency, price, and tool compatibility. We found that the Vim3Pro performed best overall, being the fastest in most tests, having a mid-range price, and being only two times slower than a much more expensive high-end Xeon processor when using the same number of cores.

Keywords: Embedded cluster; single-board computer; parallel programming; remote laboratory; online learning

Using Parallel Programming Models for Automotive Workloads on Heterogeneous Systems - a Case Study

28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Authors: Sommer, Lukas; Stock, Florian; Solis-Vasquez, Leonardo; Koch, Andreas (Tech Univ Darmstadt, Embedded Syst & Applicat Grp, Darmstadt, Germany)

ISBN: (print) 9781728165820

Due to the ever-increasing computational demand of automotive applications, and in particular autonomous driving functionalities, the automotive industry and supply vendors are starting to adopt parallel and heterogeneous embedded platforms for their products. However, C and C++, the currently dominating programming languages in this industry, do not provide sufficient mechanisms to target such platforms. Established parallel programming models such as OpenMP and OpenCL, on the other hand, are tailored towards HPC systems. In this case study, we investigate the applicability of established parallel programming models to automotive workloads on heterogeneous platforms. We pursue a practical approach by re-enacting a typical development process for typical embedded platforms and representative benchmarks.

Keywords: embedded; automotive; parallel programming; heterogeneous; OpenMP; OpenCL; CUDA

EASYPAP: a Framework for Learning Parallel Programming

34th IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Lasserre, Alice; Namyst, Raymond; Wacrenier, Pierre-Andre (Univ Bordeaux, Comp Sci Dept, Inria Bordeaux Sud Ouest, Talence, France)

ISBN: (print) 9781728174457

This paper presents EASYPAP, an easy-to-use programming environment designed to help students to learn parallel programming. EASYPAP features a wide range of 2D computation kernels that the students are invited to parallelize using Pthreads, OpenMP, OpenCL or MPI. Execution of kernels can be interactively visualized, and powerful monitoring tools allow students to observe both the scheduling of computations and the assignment of 2D tiles to threads/processes. By focusing on algorithms and data distribution, students can experiment with diverse code variants and tune multiple parameters, resulting in richer problem exploration and faster progress towards efficient solutions. We present selected lab assignments which illustrate how EASYPAP improves the way students explore parallel programming.

Keywords: parallel programming; visualization; monitoring; education; OpenMP; MPI

Advances in parallel programming for electronic design automation

Author: Lin, Chun-Xun (University of Illinois – Urbana-Champaign)

Degree level: Doctoral

The continued miniaturization of the technology node increases not only the chip capacity but also the circuit design complexity. How does one efficiently design a chip with millions or billions of transistors? This has become a challenging problem in the integrated circuit (IC) design industry, especially for the developers of electronic design automation (EDA) tools. To boost the performance of EDA tools, one promising direction is parallel computing. In this dissertation, we explore different parallel computing approaches, from CPU to GPU to distributed computing, for EDA applications. Nowadays multi-core processors are prevalent from mobile devices to laptops to desktops, and it is natural for software developers to utilize the available cores to maximize the performance of their applications. Therefore, in this dissertation we first focus on multi-threaded programming. We begin by reviewing a C++ parallel programming library called Cpp-Taskflow. Cpp-Taskflow is designed to facilitate programming parallel applications, and has been successfully applied to an EDA timing analysis tool. We will demonstrate Cpp-Taskflow’s programming model and interface, software architecture and execution flow. Then, we improve Cpp-Taskflow in several aspects. First, we enhance Cpp-Taskflow’s usability through restructuring the software architecture. Second, we introduce task graph composition to support composability and modularity, which makes it easier for users to construct large and complex parallel patterns. Third, we add a new task type in Cpp-Taskflow to let users control the graph execution flow. This feature empowers the graph model with the ability to describe complex control flow. Aside from the above enhancements, we have designed a new scheduler to adaptively manage the threads based on available parallelism. The new scheduler uses a simple and effective strategy which can not only prevent resources from being underutilized, but also mitigate resource over-subscription.

Keywords: Electronic design automation; parallel programming

Parallel Programming with Coq: Map and Reduce Skeletons on Trees

34th ACM/SIGAPP Annual International Symposium on Applied Computing (SAC)

Authors: Philippe, Jolan; Loulergue, Frederic (No Arizona Univ, Sch Informat Comp & Cyber Syst, Flagstaff, AZ 86011, USA)

ISBN: (print) 9781450359337

SyDPaCC is a set of libraries for the Coq interactive theorem prover. It allows the development of correct functional parallel programs on distributed lists based on the transformation of naive sequential programs that are considered as specifications. To offer the parallelization of functions on other data structures, the first step is to implement a parallel version of the considered data structure and to provide parallel implementations of primitive functions manipulating it. This paper presents such a first step: a binary tree extension which includes new map and reduce pure functional algorithmic skeletons for binary trees. Such algorithmic skeletons are templates of parallel algorithms, realized in a functional context as higher-order functions implemented in parallel. The use of these new primitives is illustrated on example applications.

Keywords: Functional programming; parallel programming; Coq
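
To make the idea of map and reduce skeletons on binary trees concrete, here is a minimal sequential sketch in Python. It only shows the higher-order function shape that such skeletons take and corresponds to a naive sequential specification. It is not SyDPaCC's Coq API, and the names `Node`, `tree_map`, and `tree_reduce` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    """A binary tree node; None stands for the empty tree."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_map(f: Callable[[int], int], tree: Optional[Node]) -> Optional[Node]:
    """Apply f to every value, preserving the shape of the tree."""
    if tree is None:
        return None
    return Node(f(tree.value), tree_map(f, tree.left), tree_map(f, tree.right))


def tree_reduce(combine: Callable[[int, int, int], int],
                empty: int,
                tree: Optional[Node]) -> int:
    """Fold the tree bottom-up: combine(left_result, value, right_result)."""
    if tree is None:
        return empty
    return combine(
        tree_reduce(combine, empty, tree.left),
        tree.value,
        tree_reduce(combine, empty, tree.right),
    )


# Usage: double every value, then sum the tree.
example = Node(1, Node(2), Node(3))
total = tree_reduce(lambda l, v, r: l + v + r, 0, tree_map(lambda x: 2 * x, example))
```

Because the two recursive calls in each function are independent, the left and right subtrees could be processed in parallel, which is what makes this shape a natural parallel template.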

Automatic generation and assessment of student assignments for parallel programming learning

10th International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2019

Authors: Luo, Zhenxiao; Wang, Zelong; Wu, Di; Hei, Xiaojun; Du, Yunfei (School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China; Guangdong Key Laboratory of Big Data Analysis and Processing, Guangzhou 510006, China; School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China; National Supercomputer Center in Guangzhou, Guangzhou, China)

The parallel programming course is becoming more and more important for the education of students majoring in computer science. However, it is not easy to learn parallel programming well due to its high theory and practice requirements. In this paper, we design and implement an automatic assignment generation and assessment system to help students learn parallel programming. The assignments can be generated according to user behaviors and are thus able to guide students to learn parallel programming step by step. Besides, it can automatically generate an overall assessment of student assignments by using fuzzy string matching, which provides an approximate reference score of objective questions. Subjective questions can be assessed directly by comparing the answer to the reference answer. This system also provides a friendly user interface for students to complete online assignments and for teachers to manage their question database. In our teaching practice, students can learn parallel programming more effectively with the help of such an assignment generation and assessment system. © Springer Nature Singapore Pte Ltd. 2020.

Keywords: parallel programming
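
The abstract above does not say which fuzzy string matching algorithm the system uses. As a rough illustration of how an approximate reference score could be derived from string similarity, the following Python sketch uses the standard library's `difflib.SequenceMatcher` as a stand-in. The function name `reference_score` and the normalization steps are hypothetical.

```python
from difflib import SequenceMatcher


def reference_score(answer: str, reference: str, max_points: float) -> float:
    """Scale a similarity ratio in [0, 1] to an approximate score.

    Assumption: answers are compared case-insensitively after collapsing
    whitespace; the paper's actual matching rules are not known.
    """
    normalized_answer = " ".join(answer.lower().split())
    normalized_reference = " ".join(reference.lower().split())
    ratio = SequenceMatcher(None, normalized_answer, normalized_reference).ratio()
    return max_points * ratio


# Usage: an exact match (ignoring case) earns full points,
# a near miss earns partial credit.
full = reference_score("MPI_Send", "mpi_send", 10)
partial = reference_score("MPI_Sent", "MPI_Send", 10)
```

A score produced this way is only a reference value; a teacher would still need to review borderline answers.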

Specifics of semantics of a statically typed language of functional and dataflow parallel programming

21st Conference on Scientific Services and Internet, SSI 2019

Authors: Legalov, Alexander; Legalov, Igor; Matkovskii, Ivan (Siberian Federal University, 79 Svobodny pr., Krasnoyarsk 660041, Russia)

It is proposed to add a static type system to the dataflow functional model of parallel computing and to the dataflow functional parallel programming language developed on its basis. The use of static typing increases the possibility of transforming dataflow functional parallel programs into programs running on modern parallel computing systems. Language constructions are proposed, and their syntax and semantics are described. The need to use the single assignment principle in the formation of data storages of a particular type is noted. The features of instrumental support of the proposed approach are considered. Copyright © 2020 for this paper by its authors.

Keywords: parallel programming
