Search Results - Inner Mongolia University Library

IEEE/ACM Workshop on Education for High-Performance Computing (EduHPC)

Authors: Ayguade, Eduard; Alvarez, Lluc; Banchelli, Fabio; Burtscher, Martin; Gonzalez-Escribano, Arturo; Gutierrez, Julian; Joiner, David A.; Kaeli, David; Previlon, Fritz; Rodriguez-Gutiez, Eduardo; Bunde, David P. Affiliations: Univ Politecn Cataluna, Barcelona Supercomp Ctr, Barcelona, Spain; Texas State Univ, San Marcos, TX, USA; Univ Valladolid, Valladolid, Spain; Northeastern Univ, Boston, MA 02115, USA; Kean Univ, Union, NJ, USA; Knox Coll, Galesburg, IL, USA

ISBN: (print) 9781728101903

Peachy Parallel Assignments are a resource for instructors teaching parallel and distributed programming. These are high-quality assignments, previously tested in class, that are readily adoptable. This collection of assignments includes implementing a subset of OpenMP using pthreads, creating an animated fractal, image processing using histogram equalization, simulating a storm of high-energy particles, and solving the wave equation in a variety of settings. All of these come with sample assignment sheets and the necessary starter code.

Keywords: parallel computing education; High-Performance Computing education; parallel programming; OpenMP; Pthreads; CUDA; Compiler and runtime systems; Fractals; Image processing; Particle simulation; Wave equation; Peachy Assignments
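
For readers unfamiliar with histogram equalization, one of the assignment topics named in the abstract above, the following is a minimal serial sketch in Python. It is not the assignment's starter code; the function name `equalize_histogram` and the flat-list image representation are assumptions made for illustration. The counting loop is the step a parallel version would typically split across workers, with per-worker histograms merged afterwards.

```python
def equalize_histogram(pixels, levels=256):
    """Return a histogram-equalized copy of a flat list of integer gray levels."""
    if not pixels:
        return []
    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution function: running sum of the histogram.
    cdf = []
    total = 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        # Every pixel has the same gray level; there is nothing to spread out.
        return list(pixels)
    # Rescale the CDF so the darkest used level maps to 0 and the brightest to levels - 1.
    scale = (levels - 1) / (n - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [lut[p] for p in pixels]
```

Calling `equalize_histogram` on a low-contrast image stretches its gray levels over the full range from 0 to `levels - 1`.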

romeoLAB: A High Performance Training Platform for HPC, GPU and DeepLearning

4th Latin American Conference on High Performance Computing (CARLA)

Authors: Renard, Arnaud; Etancelin, Jean-Matthieu; Krajecki, Michael. Affiliations: Univ Reims, CReSTIC Ctr Rech STIC EA3804, ROMEO HPC Ctr, Moulin Housse, F-51687 Reims, France; Univ Reims, CReSTIC Ctr Rech STIC EA3804, Dept Comp Sci, Moulin Housse, F-51687 Reims, France

ISBN: (print) 9783319733531; 9783319733524

In this pre-exascale era, the need for computer science courses dedicated to parallel programming on heterogeneous architectures is increasing dramatically. The full hybrid cluster Romeo has long been used for that purpose, to train master's students and cluster users. The main issue for trainees is the cost of accessing and exploiting a production facility in a pedagogic context. The use of specific techniques and software (SSH, workload manager, remote file system, ...) is mandatory, yet these are neither course prerequisites nor pedagogic objectives. The romeoLAB platform we developed at the ROMEO HPC Center is an online interactive pedagogic platform for courses on HPC and GPU technologies. Its main purpose is to simplify resource usage so that trainees can focus on the subjects taught. This paper presents the romeoLAB architecture as well as its motivations, usages and future improvements.

Keywords: programming education; Online education; HPC; GPU; parallel programming; Web application; Teaching and learning

Attached and Detached Closures in Actors

8th ACM SIGPLAN International Workshop on Programming Based on Actors, Agents, and Decentralized Control (AGERE)

Authors: Castegren, Elias; Clarke, Dave; Fernandez-Reyes, Kiko; Wrigstad, Tobias; Yang, Albert Mingkun. Affiliations: KTH Royal Inst Technol, Stockholm, Sweden; Uppsala Univ, Informat Technol, Uppsala, Sweden; Uppsala Univ, Uppsala, Sweden

ISBN: (print) 9781450360661

Expressive actor models combine aspects of functional programming into the pure actor model enriched with futures. Such functional features include first-class closures which can be passed between actors and chained on futures. Combined with mutable objects, this opens the door to race conditions. In some situations, closures may not be evaluated by the actor that created them yet may access fields or objects owned by that actor. In other situations, closures may be safely fired off to run as a separate task. This paper discusses the problem of who can safely evaluate a closure to avoid race conditions, and presents the current solution to the problem adopted by the Encore language. The solution integrates with Encore's capability type system, which influences whether a closure is attached and must be evaluated by the creating actor, or whether it can be detached and evaluated independently of its creator. Encore's current solution to this problem is not final or optimal. We conclude by discussing a number of open problems related to dealing with closures in the actor model.

Keywords: closures; parallel programming; concurrent programming; type systems

Accelerating the RICH Particle Detector Algorithm on Intel Xeon Phi

26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)

Authors: Quast, Christina; Schwemmer, Rainer; Pohl, Angela; Cosenza, Biagio; Juurlink, Ben. Affiliations: CERN, Meyrin, Canton Geneva, Switzerland; Tech Univ Berlin, Berlin, Germany

ISBN: (print) 9781538649756

At the LHC, particles are collided in order to understand how the universe was created. Those collisions are called events and generate large quantities of data, which have to be pre-filtered before they are stored to hard disks. This paper presents a parallel implementation of these algorithms that is specifically designed for the Intel Xeon Phi Knights Landing platform, exploiting its 64 cores and AVX-512 instruction set. It shows that a linear speedup up until approximately 64 threads is attainable when vectorization is used, data is aligned to cache line boundaries, program execution is pinned to MCDRAM, mathematical expressions are transformed to a more efficient equivalent formulation, and OpenMP is used for parallelization. The code was transformed from being compute bound to memory bound. Overall, a speedup of 36.47x was reached while obtaining an error which is smaller than the detector resolution.

Keywords: Intel Xeon Phi; Knights Landing; OpenMP; Vectorization; parallel programming

Automatic runtime calculation of communications for data-parallel expressions with periodic conditions

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, Vol. 31, No. 5

Authors: Moreton-Fernandez, Ana; Gonzalez-Escribano, Arturo. Affiliation: Univ Valladolid, Dept Informat, Valladolid, Spain

Many real-world applications feature data accesses on periodic domains. Manually implementing the synchronizations and communications associated with the data dependences in each case is cumbersome and error-prone. It is increasingly interesting to support these applications in high-level parallel programming languages or parallelizing compilers. In this paper, we present a technique that, for distributed-memory systems, calculates the specific communications derived from data-parallel codes with or without periodic boundary conditions on affine access expressions. It makes the management of aggregated communications for the chosen data partition transparent to the programmer. Our technique moves to runtime part of the compile-time analysis typically used to generate the communication code for affine expressions, introducing a completely new technique that also supports periodic boundary conditions. We present an experimental study to evaluate our proposal using several case studies. Our experimental results show that our approach can automatically obtain communication codes as efficient as those found in MPI reference codes, reducing the development effort.

Keywords: communications; distributed memory; parallel programming; periodic boundary condition
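
To make concrete what "calculating the communications" for a periodic domain involves, the sketch below brute-forces the answer for a one-dimensional array. It does not reproduce the paper's runtime technique. It assumes a block partition of `n` elements over `p` processes and an expression that reads `a[i + d]` for each owned index `i` and each offset `d`. All function names are hypothetical.

```python
def block_bounds(n, p, rank):
    """Half-open range [lo, hi) of global indices owned by `rank` in a block partition."""
    base, extra = divmod(n, p)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi

def owner_of(n, p, index):
    """Rank that owns global `index` under the same block partition."""
    for rank in range(p):
        lo, hi = block_bounds(n, p, rank)
        if lo <= index < hi:
            return rank
    raise ValueError("index out of range")

def receives(n, p, rank, offsets, periodic=True):
    """Map each remote owner to the sorted global indices that `rank` must receive."""
    lo, hi = block_bounds(n, p, rank)
    needed = {}
    for i in range(lo, hi):
        for d in offsets:
            j = i + d
            if periodic:
                j %= n              # wrap around the ends of the domain
            elif not 0 <= j < n:
                continue            # non-periodic domain: no access outside it
            src = owner_of(n, p, j)
            if src != rank:
                needed.setdefault(src, set()).add(j)
    # Aggregate: one message per source rank, carrying every index it provides.
    return {src: sorted(indices) for src, indices in needed.items()}
```

With 12 elements on 3 processes and offsets `(-1, 1)`, rank 0 needs index 4 from rank 1 in both settings. Only the periodic setting adds index 11 from rank 2, which is the extra wrap-around communication the abstract refers to.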

Effects of Latency Jitter on Simulator Sickness in a Search Task

25th IEEE Conference on Virtual Reality and 3D User Interfaces (IEEE VR)

Authors: Stauffert, Jan-Philipp; Niebling, Florian; Latoschik, Marc Erich. Affiliation: Univ Wurzburg, Wurzburg, Germany

ISBN: (print) 9781538633656

Low latency is a fundamental requirement for Virtual Reality (VR) systems to reduce the potential risks of cybersickness and to increase effectiveness, efficiency and user experience. In contrast to the effects of uniform latency degradation, the influence of latency jitter on user experience in VR is not well researched, although today's consumer VR systems are vulnerable in this respect. In this work we report on the impact of latency jitter on cybersickness in HMD-based VR environments. Test subjects are given a search task in Virtual Reality, provoking both head rotation and translation. One group experienced artificially added latency jitter in the tracking data of their head-mounted display. The introduced jitter pattern was a replication of a real-world latency behavior extracted and analyzed from an existing example VR system. The effects of the introduced latency jitter were measured through self-reports using the simulator sickness questionnaire (SSQ) and through physiological measurements. We found a significant increase in self-reported simulator sickness. We therefore argue that measuring and controlling latency based on average values taken at a few time intervals is not enough to assure a required timeliness behavior, and that latency jitter needs to be considered when designing experiences for Virtual Reality.

Keywords: D.1.3 [Programming Techniques]: Concurrent programming - parallel programming; D.4.8 [Operating Systems]: Performance - Measurements; H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems - Artificial, augmented, and virtual realities
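
The abstract's closing argument, that average latency alone can hide jitter, can be illustrated with a small numeric sketch. The two latency traces below are invented for illustration and are not data from the study. The helper name `latency_summary` is likewise hypothetical.

```python
from statistics import mean, pstdev

def latency_summary(samples_ms):
    """Return (mean, population standard deviation, worst case) of latency samples in ms."""
    return mean(samples_ms), pstdev(samples_ms), max(samples_ms)

# Two made-up latency traces in milliseconds with the same average.
steady = [20, 20, 20, 20, 20, 20, 20, 20]
jittery = [12, 12, 12, 12, 12, 12, 12, 76]

steady_stats = latency_summary(steady)
jittery_stats = latency_summary(jittery)

print("steady :", steady_stats)
print("jittery:", jittery_stats)
```

Both traces average 20 ms, so a monitor that reports only the mean cannot tell them apart. The standard deviation and the worst-case sample expose the single 76 ms spike in the second trace.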

Parallel Power Flow based on OpenMP

North American Power Symposium (NAPS)

Authors: Ahmadi, Afshin; Jin, Shuangshuang; Smith, Melissa C.; Collins, E. Randolph; Goudarzi, Arman. Affiliations: Clemson Univ, Holcombe Dept Elect & Comp Engn, Clemson, SC 29631, USA; Clemson Univ, Sch Comp, Clemson, SC 29631, USA; Univ KwaZulu Natal, Discipline Elect Elect & Comp Engn, ZA-4001 Durban, South Africa

ISBN: (print) 9781538671382

Integration of intermittent renewable energy resources into the power system necessitates the development of fast computational methods and tools to enable real-time monitoring, control, and decision making in the power grid. Generally, techniques for increasing computational speed fall into two groups: algorithm improvement and hardware acceleration. In this paper, the serial version of the Newton-Raphson power flow algorithm is transformed into a parallel solution using the OpenMP standard. The parallel implementation is tested on several power systems, and the computational efficiency is compared for varying thread counts. The experimental results show a speedup of more than three times and a significant reduction in computation time.

Keywords: Power Flow Analysis; parallel programming; OpenMP Standard; Newton-Raphson Method
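
The abstract does not say which loops of the Newton-Raphson solver were parallelized. As a sketch only, the code below assumes the per-bus power injection calculation is one of them, since each bus can be evaluated independently. It uses a Python thread pool in place of OpenMP. It therefore shows the decomposition, not the speedup, because CPython threads do not run this arithmetic in parallel. The function names and the two-bus data are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def bus_injection(i, v, theta, g, b):
    """Calculated active and reactive power injection at bus i (per unit)."""
    p = q = 0.0
    for k in range(len(v)):
        angle = theta[i] - theta[k]
        p += v[k] * (g[i][k] * math.cos(angle) + b[i][k] * math.sin(angle))
        q += v[k] * (g[i][k] * math.sin(angle) - b[i][k] * math.cos(angle))
    return v[i] * p, v[i] * q

def all_injections(v, theta, g, b, workers=4):
    """Evaluate every bus as an independent task, like iterations of a parallel for loop."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: bus_injection(i, v, theta, g, b), range(len(v))))

# Made-up two-bus system joined by one line with series admittance 1 - 5j per unit.
# g and b are the real and imaginary parts of the bus admittance matrix.
g = [[1.0, -1.0], [-1.0, 1.0]]
b = [[-5.0, 5.0], [5.0, -5.0]]

# Flat start: equal voltages and angles, so no power flows and all injections are zero.
flat_start = all_injections([1.0, 1.0], [0.0, 0.0], g, b)
```

In a Newton-Raphson iteration these calculated injections are subtracted from the specified ones to form the mismatch vector. That step has the same per-bus independence.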

Unobtrusive Support for Asynchronous GUI Operations with Java Annotations

32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Mehrabi, Mostafa; Giacaman, Nasser; Sinnen, Oliver. Affiliation: Univ Auckland, Dept Elect & Comp Engn, Parallel & Reconfigurable Comp Lab, Auckland, New Zealand

ISBN: (print) 9781538655559

The complexities involved in parallel programming encourage frameworks to detach programmers from these concerns via higher-level abstraction. The high-performance nature of parallel computing drifts the focus of these programming environments towards facilitating and safeguarding faster computations. Therefore, aspects such as asynchronous graphical user interfaces (GUIs) do not see as much emphasis, even though many applications today depend on concurrent human-computer interactions. The significance of this topic is growing such that facilitating the efficient management of asynchronous GUI operations is currently a virtue, but will soon become necessary for parallel-programming frameworks. This paper discusses an unobtrusive and annotation-based approach for managing different types of asynchronous GUI operations within the layout of familiar sequential code. The proposed solution minimizes the restructuring of sequential code, in order to simplify developing, testing and maintaining GUI-based applications. Furthermore, the paper presents an implementation of the concept for @PT, a parallel programming environment based on Java annotations. The evaluation discussed in this paper suggests that the proposed mechanism is valid, and demonstrates timely and efficient handling of asynchronous GUI operations.

Keywords: parallel programming; asynchronous GUI responsiveness; Java annotations; @PT

Teaching Concurrent and Distributed Programming With Concepts Over Mathematical Proofs

Workshop on Education for High Performance Computing (EduHPC)

Authors: David Marchant; Carl-Johannes Johnsen; Brian Vinter; Kenneth Skovhede. Affiliation: University of Copenhagen, Copenhagen, Denmark

ISBN: (print) 9781728159768

This paper describes how a concept-based approach to teaching was used to update how concurrent and distributed systems were taught at the University of Copenhagen. This approach focuses on discussion to drive student engagement whilst fostering a deeper understanding of the presented topics compared to more traditional displays of crude facts. The course is split into three sections: local concurrency, networked concurrency, and concurrency in hardware. This allows for an easier student journey through the course, as they are introduced to all core concepts in the first section, then have them reinforced in greater detail in the subsequent sections. Finally, the experience gained in updating this course is presented so others attempting to do similar may learn from it.

Keywords: Education; Hardware; Concurrent computing; Software; Physics; parallel programming

Activity Based Approach for Teaching Parallel Computing: An Indian Experience

IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW)

Authors: P. Chitra; Sheikh K. Ghafoor. Affiliations: Thiagarajar College of Engineering, Madurai, India; Tennessee Tech University, USA

Due to the rapid growth in multicore and GPU-based computing devices, teaching parallel computing in the CS/CE curriculum has become almost mandatory. A course on Parallel Computing Systems (PCS) has been designed to provide an understanding of the fundamental principles and engineering trade-offs involved in designing modern parallel computing systems, as well as to teach the parallel programming techniques necessary to effectively utilize these machines. An activity based learning approach was adopted for teaching the course, and several parallel programming paradigms and technologies such as OpenMP, MPI, and CUDA were covered. This course was offered as a required course to graduate students. This paper describes the implementation of the course at Thiagarajar College of Engineering. Evaluation of the implementation of the course reveals that for students who have not been exposed to parallel and distributed computing, i) activity based learning results in better knowledge gain compared to the traditional approach, ii) learning OpenMP was much easier than MPI or CUDA, iii) some Parallel and Distributed Computing (PDC) concepts such as false sharing were harder to grasp compared to basic concepts, and iv) it is essential to introduce parallel computing in the undergraduate curriculum.

Keywords: parallel processing; Graphics processing units; Education; parallel programming; Distributed computing
