Search Results - Inner Mongolia University Library

arXiv, 2023

Authors: Bruun, Lotte Maria; Larsen, Ulrik Stuhr; Hinnerskov, Nikolaj; Oancea, Cosmin

Affiliation: University of Copenhagen, Denmark

We present and evaluate the Futhark implementation of reverse-mode automatic differentiation (AD) for the basic blocks of parallel programming: reduce, prefix sum (scan), and reduce by index. We first present derivations of general-case algorithms, and then discuss several specializations that result in efficient differentiation of most cases of practical interest. We report an experiment that evaluates the performance of the differentiated code in the context of GPU execution, and highlights the impact of the proposed specializations as well as the strengths and weaknesses of differentiating at high level vs. low level (i.e., "differentiating the memory"). Copyright © 2023, The Authors. All rights reserved.

Keywords: parallel programming

Quantifying OpenMP: Statistical Insights into Usage and Adoption

arXiv, 2023

Authors: Kadosh, Tal; Hasabnis, Niranjan; Mattson, Timothy; Pinter, Yuval; Oren, Gal

Affiliations: Department of Computer Science, Ben-Gurion University, Israel; Israel Atomic Energy Commission; Intel Labs, United States; Scientific Computing Center, Nuclear Research Center Negev, Israel; Department of Computer Science, Technion - Israel Institute of Technology, Israel

In high-performance computing (HPC), the demand for efficient parallel programming models has grown dramatically since the end of Dennard Scaling and the subsequent move to multi-core CPUs. OpenMP stands out as a popular choice due to its simplicity and portability, offering a directive-driven approach for shared-memory parallel programming. Despite its wide adoption, however, there is a lack of comprehensive data on the actual usage of OpenMP constructs, hindering unbiased insights into its popularity and evolution. This paper presents a statistical analysis of OpenMP usage and adoption trends based on a novel and extensive database, HPCORPUS, compiled from GitHub repositories containing C, C++, and Fortran code. The results reveal that OpenMP is the dominant parallel programming model, accounting for 45% of all analyzed parallel APIs. Furthermore, it has demonstrated steady and continuous growth in popularity over the past decade. Analyzing specific OpenMP constructs, the study provides in-depth insights into their usage patterns and preferences across the three languages. Notably, we found that while OpenMP has a strong "common core" of constructs in common usage (while the rest of the API is less used), there are new adoption trends as well, such as simd and target directives for accelerated computing and task for irregular parallelism. Overall, this study sheds light on OpenMP's significance in HPC applications and provides valuable data for researchers and practitioners. It showcases OpenMP's versatility, evolving adoption, and relevance in contemporary parallel programming, underlining its continued role in HPC applications and beyond. These statistical insights are essential for making informed decisions about parallelization strategies and provide a foundation for further advancements in parallel programming models and techniques. HPCORPUS, as well as the analysis scripts and raw results, are available at: https://***/Scientific-Computing-Lab-NRCN/HP

Keywords: parallel programming

Fine-grained parallelism framework with predictable work-stealing for real-time multiprocessor systems

Journal of Systems Architecture, 2022, vol. 124, p. 102393

Authors: Schmid, Michael; Fritz, Florian; Mottok, Juergen

Affiliation: Regensburg Univ Appl Sci, Lab Safe & Secure Syst, Regensburg, Germany

Lately, parallel task models have received much attention in the development of real-time multiprocessor systems, as they allow highly compute-intensive tasks to have shorter deadlines which is very much required in modern reactive systems. However, missing modularity and portability can make parallel programming a cumbersome endeavor. As a consequence, compute-intensive sectors in the desktop and server segment have relied on parallelism frameworks such as Intel Threading Building Blocks, Cilk and OpenMP. These parallelism frameworks, however, are optimized for decent average case performance and consequently, do not meet the strict requirements imposed by real-time systems. In this paper, we present a proof-of-concept parallelism framework which was implemented in particular for soft real-time systems and having tight timing and safety requirements of such critical systems in mind. The proposed runtime system implements static memory allocation in a work-stealing environment that conforms to the strict space and tight probabilistic time bounds of work-stealing schedulers. Furthermore, we evaluate the performance of this framework by conducting multiprogrammed benchmarks on a real-time embedded multicore architecture.

Keywords: Real-time; parallel programming; Work-stealing; Thread pool; Task model

Shared memory parallelism in Modern C++ and HPX

arXiv, 2023

Authors: Diehl, Patrick; Brandt, Steven R.; Kaiser, Hartmut

Affiliations: Center of Computation & Technology, Louisiana State University, Digital Media Center, Baton Rouge, LA 70803, United States; Department of Physics and Astronomy, Louisiana State University, Street, Baton Rouge, LA 70803, United States

Parallel programming remains a daunting challenge, from struggling to express a parallel algorithm without cluttering the underlying synchronous logic to describing which devices to employ to calculate correctness. Over the years, numerous solutions have arisen, requiring new programming languages, extensions to programming languages, or adding pragmas. Support for these various tools and extensions is available to varying degrees. In recent years, the C++ standards committee has worked to refine the language features and libraries needed to support parallel programming on a single computational node. Eventually, all major vendors and compilers will provide robust and performant implementations of these standards. Until then, the HPX library and runtime provide cutting-edge implementations of the standards and proposed standards and extensions. Because of these advances, it is now possible to write high performance parallel code without custom extensions to C++. We provide an overview of modern parallel programming in C++, describing the language and library features and providing brief examples of how to use them. Copyright © 2023, The Authors. All rights reserved.

Keywords: parallel programming

The Italian research on HPC key technologies across EuroHPC

18th ACM International Conference on Computing Frontiers 2021, CF 2021

Authors: Aldinucci, Marco; Agosta, Giovanni; Andreini, Antonio; Ardagna, Claudio A.; Bartolini, Andrea; Cilardo, Alessandro; Cosenza, Biagio; Danelutto, Marco; Esposito, Roberto; Fornaciari, William; Giorgi, Roberto; Lengani, Davide; Montella, Raffaele; Olivieri, Mauro; Saponara, Sergio; Simoni, Daniele; Torquati, Massimo

Affiliations: Di, University of Torino, CINI HPC-KTT Laboratory, Torino, Italy; Deib, Politecnico di Milano, CINI HPC-KTT Laboratory, Milano, Italy; Dief, University of Florence, CINI HPC-KTT Laboratory, Firenze, Italy; Università Degli Studi di Milano, CINI HPC-KTT Laboratory, Milano, Italy; Dei, Università di Bologna, CINI HPC-KTT Laboratory, Bologna, Italy; University of Naples Federico II, CINI HPC-KTT Laboratory, Napoli, Italy; University of Salerno, CINI HPC-KTT Laboratory, Salerno, Italy; University of Pisa, CINI HPC-KTT Laboratory, Pisa, Italy; Diism, University of Siena, CINI HPC-KTT Laboratory, Siena, Italy; Dime, University of Genova, CINI HPC-KTT Laboratory, Genova, Italy; DiST, University of Naples Parthenope, CINI HPC-KTT Laboratory, Napoli, Italy; Sapienza University of Rome, CINI HPC-KTT Laboratory, Roma, Italy; DII-University of Pisa, CINI HPC-KTT Laboratory, Pisa, Italy

ISBN (print): 9781450384049

High-Performance Computing (HPC) is one of the strategic priorities for research and innovation worldwide due to its relevance for industrial and scientific applications. We envision HPC as composed of three pillars: infrastructures, applications, and key technologies and tools. While infrastructures are by construction centralized in large-scale HPC centers, and applications are generally within the purview of domain-specific organizations, key technologies fall in an intermediate case where coordination is needed, but design and development are often decentralized. A large group of Italian researchers has started a dedicated laboratory within the National Interuniversity Consortium for Informatics (CINI) to address this challenge. The laboratory, albeit young, has managed to succeed in its first attempts to propose a coordinated approach to HPC research within the EuroHPC Joint Undertaking, participating in the calls 2019 - 20 to five successful proposals for an aggregate total cost of 95M€. In this paper, we outline the working group's scope and goals and provide an overview of the five funded projects, which become fully operational in March 2021, and cover a selection of key technologies provided by the working group partners, highlighting their usage development within the projects. © 2021 ACM.

Keywords: parallel programming

Multi-prediction metropolis hastings resampling filtering algorithm based on CUDA

Microprocessors and Microsystems, 2022, vol. 93

Authors: Huang, Kaijie; Cao, Jie

Affiliation: Lanzhou Univ Technol, Lanzhou, Peoples R China

To address the loss of speed and estimation accuracy caused by using Metropolis-Hastings (MH) in place of sequential importance resampling in the Metropolis-Hastings resampling particle filter algorithm, this paper proposes a parallel Metropolis-Hastings filter algorithm based on a multi-prediction framework, which shifts the particle-filtering load from resampling to the prediction and update steps. The overhead of the multi-prediction framework can be easily compensated by parallel implementation. This algorithm reduces global sequential operations by adding local parallel computing. Simulation experiments show that the method improves both real-time performance and state estimation accuracy.

Keywords: CUDA parallel architecture; parallel programming; Multi-prediction model; Particle filter

Safety Hints for HTM Capacity Abort Mitigation

IEEE Symposium on High-Performance Computer Architecture

Authors: Anirudh Jain; Divya Kiran Kadiyala; Alexandros Daglis

Affiliations: School of Computer Science, Georgia Institute of Technology, Atlanta, Georgia, USA; School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA

Hardware Transactional Memory (HTM) is a high-performance instantiation of the powerful programming abstraction of transactional memory, which simplifies the daunting— yet critically important—task of parallel programming. While many HTM implementations with variable complexity exist in the literature, commercially available HTMs impose rigid restrictions to transaction and system behavior, limiting their practical use. A key constraint is the limited size of supported transactions, implicitly capped by hardware buffering capacity. We identify the opportunity to expand the effective capacity of these limited hardware structures by being more selective in memory accesses that need to be tracked. We leverage compiler and virtual memory support to identify safe memory accesses, which can never cause a transaction abort, subsequently passed as safety hints to the underlying HTM. With minor extensions over a conventional HTM implementation, HinTM uses these hints to selectively allocate transactional state tracking resources to unsafe accesses only, thus expanding the HTM’s effective capacity, and conversely reducing capacity aborts. We demonstrate that HinTM effectively augments the performance of a range of baseline HTM configurations. When coupled with a POWER8 HTM implementation, HinTM eliminates 64% of transactional capacity aborts, achieving 1.4× average speedup, and up to 8.7×.

Keywords: Couplings; Limiting; parallel programming; Memory management; Performance gain; Hardware; Safety

A New Wave in HDFS Data Security: Merging AES & MapReduce for Efficient Data Encryption

International Carnahan Conference on Security Technology

Authors: Yash Watarkar; Avi Jain; Dipesh Shah; Aliasgar Thanawala; Aparna Kamble

Affiliation: School of CET, Dr. Vishwanath Karad MIT World Peace University, Pune, India

ISBN (digital): 9798350315875

ISBN (print): 9798350315882

This paper proposes a robust encryption strategy for data protection within a Hadoop Distributed File System (HDFS) environment by integrating Advanced Encryption Standard (AES) and MapReduce. Leveraging the speed of the AES-128bit encryption algorithm in conjunction with the MapReduce parallel programming paradigm, the method achieves superior efficiency in the encryption of large amounts of crucial data. Furthermore, the implementation utilizes Phil Rogaway's XEX (Xor-Encrypt-Xor) XTS mode, which provides a robust defense against ciphertext manipulation and copy-and-paste attacks. This approach employs parallel mappers and reducers, known as AES-MR, to encrypt data chunks sequentially and concurrently. The paper demonstrates the efficacy and security of this method, suggesting it as a viable safety measure for safeguarding user-generated data in the HDFS context.

Keywords: parallel programming; File systems; Merging; Data protection; Encryption; Safety; Standards

Degrees of Separation: A Flexible Type System for Data Race Prevention

arXiv, 2023

Authors: Xu, Yichen; Odersky, Martin

Affiliation: EPFL, Switzerland

Data race is a notorious problem in parallel programming. There has been great research interest in type systems that statically prevent data races. Despite the progress in the safety and usability of these systems, lots of existing approaches enforce strict anti-aliasing principles to prevent data races. The adoption of them is often intrusive, in the sense that it invalidates common programming patterns and requires paradigm shifts. We propose Capture Separation Calculus (System CSC), a calculus based on Capture Calculus (System CC), that achieves static data race freedom while being non-intrusive. It allows aliasing in general to permit common programming patterns, but tracks aliasing and controls them when that is necessary to prevent data races. We study the formal properties of System CSC by establishing its type safety and data race freedom. Notably, we establish the data race freedom property by proving the confluence of its reduction semantics. To validate the usability of the calculus, we implement it as an extension to the Scala 3 compiler, and use it to type-check the examples. © 2023, CC BY.

Keywords: parallel programming

A One Year Retrospective on a MOOC in Parallel, Concurrent, and Distributed Programming in Java

IEEE/ACM Workshop on Education for High-Performance Computing (EduHPC)

Authors: Sarkar, Vivek; Grossman, Max; Budimlic, Zoran; Imam, Shams

Affiliations: Georgia Inst Technol, Atlanta, GA 30332, USA; Rice Univ, Houston, TX 77251, USA; Two Sigma, New York, NY, USA

ISBN (print): 9781728101903

Much progress has been made on integrating parallel programming into the core Computer Science curriculum of top-tier universities in the United States. For example, "COMP 322: Introduction to Parallel Programming" at Rice University is a required course for all undergraduate students pursuing a bachelor's degree. It teaches a wide range of parallel programming paradigms, from task-parallel to SPMD to actor-based programming. However, courses like COMP 322 do little to support members of the Computer Science community that need to develop these skills but who are not currently enrolled in a four-year program with parallel programming in the curriculum. This group includes (1) working professionals, (2) students at USA universities without parallel programming courses, or (3) students in countries other than the USA without access to a parallel programming course. To serve these groups, Rice University launched the "Parallel, Concurrent, and Distributed Programming in Java" Coursera specialization on July 31, 2017. In 2017, the authors of that specialization also wrote an experiences paper about launching the specialization. In this paper, the sequel to our previous publication, we look back at the first year of the Coursera specialization. In particular, we ask the following questions: (1) how did our assumptions about the student body for this course hold up?, (2) how has the course changed since launch?, and (3) what can we learn about how students are progressing through the specialization from Coursera's built-in analytics?

Keywords: parallel programming; pedagogy; concurrent; distributed; online; MOOC; Coursera
