ISBN (print): 9781728101903
Much progress has been made on integrating parallel programming into the core Computer Science curriculum of top-tier universities in the United States. For example, "COMP 322: Introduction to Parallel Programming" at Rice University is a required course for all undergraduate students pursuing a bachelor's degree. It teaches a wide range of parallel programming paradigms, from task-parallel to SPMD to actor-based programming. However, courses like COMP 322 do little to support members of the Computer Science community who need to develop these skills but are not currently enrolled in a four-year program with parallel programming in the curriculum. This group includes (1) working professionals, (2) students at US universities without parallel programming courses, and (3) students in countries other than the US without access to a parallel programming course. To serve these groups, Rice University launched the "Parallel, Concurrent, and Distributed Programming in Java" Coursera specialization on July 31, 2017. In 2017, the authors of that specialization also wrote an experiences paper about launching it. In this paper, the sequel to our previous publication, we look back at the first year of the Coursera specialization. In particular, we ask the following questions: (1) how did our assumptions about the student body for this course hold up? (2) how has the course changed since launch? and (3) what can we learn about how students are progressing through the specialization from Coursera's built-in analytics?
ISBN (print): 9781538649756
Writing correct and efficient parallel programs is hard. A lack of overview leads to errors in control- and dataflow, e.g., race conditions, which are hard to find due to their nondeterministic nature. In this paper, we present a graphical programming model for parallel stream processing applications, which improves the overview by visualizing high level dataflow together with explicit and concise annotations for concurrency-related dependency information. The key idea of our approach is twofold: First, we present a powerful graphical task editor together with annotations that enable the designer to define stream properties, task dependencies, and routing information. These annotations facilitate fine-granular and correct parallelization. Second, we propose seamless integration with the safe parallel programming language Rust by providing automated code structure generation from the graphical representation, design patterns for common parallel programming constructs like filters, and a scheduling and runtime environment. We demonstrate the applicability of our approach with a network-based processing system as it is typically found in advanced firewalls.
Accelerator clusters are an ongoing trend in high performance computing, continuously gaining traction and forming a ubiquitous hardware resource for domain scientists to run large-scale simulations on. However, there...
The present work shows the parallelization of the algorithmic design technique Divide & Conquer in two different ways: as a Parallel Design Pattern (PDP) through Active Objects and as Composition High-Level Parall...
With the increase of the search for computational models where the expression of parallelism occurs naturally, some paradigms arise as options for the current generation of computers. In this context, dynamic dataflow...
Technology continues to evolve at an exponential rate in terms of performance. Traditional processors have only a single core, and their computational capacity has been outgrown. Increasing the single core's clock frequency is one technique to boost system performance, but raising the frequency brings a multitude of problems, including excessive power dissipation and CPU overheating. This issue was resolved by adding more cores to the same CPU to increase processing speed. Multicore processors are the term used to describe this architectural breakthrough. Each core functions independently. These changes force programmers to write parallel code: we must create programs that leverage several cores at once. The parallel programming paradigm was developed to create such programs, and parallel processing has greatly influenced the software development industry's transition from sequential to parallel execution. Constructs such as OpenMP and MPI may be used with current languages to parallelize code.
Directive-driven programming models, such as OpenMP, are one solution for exploiting the potential parallelism when targeting multicore architectures. Although these approaches significantly help developers, code parallelization is still a non-trivial and time-consuming process, requiring parallel programming skills. Thus, many efforts have been made toward automatic parallelization of existing sequential code. This article presents AutoPar-Clava, an OpenMP-based automatic parallelization compiler which: (1) statically detects parallelizable loops in C applications; (2) classifies variables used inside the target loop based on their access pattern; (3) supports reduction clauses on scalar and array variables whenever applicable; and (4) generates C OpenMP parallel code from the input sequential version. The effectiveness of AutoPar-Clava is evaluated using the NAS and Polyhedral Benchmark suites, targeting an x86-based computing platform. The achieved results are very promising and compare favorably with closely related auto-parallelization compilers, such as the Intel C/C++ Compiler (icc), ROSE, TRACO and CETUS.
Alternative programming models and runtimes are increasing in popularity and maturity. This allows porting and comparing, on competitive grounds, emerging parallel approaches against the traditional MPI+X paradigm. In...
High-performance computing (HPC) is often perceived as a matter of making large-scale systems (e.g., clusters) run as fast as possible, regardless the required programming effort. However, the idea of "bringing H...
This work analyzes the efficiency of four parallel programming technologies applied to the parallelization of the Advanced Encryption Standard block cipher. The obtained results s...