Search Results - Inner Mongolia University Library

IEEE International Conference on e-Science and Grid Computing

Authors: J. Gregory Pauloski, Valerie Hayot-Sasson, Maxime Gonthier, Nathaniel Hudson, Haochen Pan, Sicheng Zhou, Ian Foster, Kyle Chard (Department of Computer Science, University of Chicago, Chicago, IL, USA; Data Science and Learning Division, Argonne National Laboratory, Lemont, IL, USA)

ISBN: (digital) 9798350365610

ISBN: (print) 9798350365627

Task-based execution frameworks, such as parallel programming libraries, computational workflow systems, and function-as-a-service platforms, enable the composition of distinct tasks into a single, unified application designed to achieve a computational goal and abstract the parallel and distributed execution of those tasks on arbitrary hardware. Research into these task executors has accelerated as computational sciences increasingly need to take advantage of parallel compute and/or heterogeneous hardware. However, the lack of evaluation standards makes it challenging to compare and contrast novel systems against existing implementations. Here, we introduce TaPS, the Task Performance Suite, to support continued research in distributed task executor frameworks. TaPS provides (1) a unified, modular interface for writing and evaluating applications using arbitrary execution frameworks and data management systems and (2) an initial set of reference synthetic and real-world science applications. We discuss how the design of TaPS supports the reliable evaluation of frameworks and demonstrate TaPS through a survey of benchmarks using the provided reference applications.

Keywords: Surveys; Performance evaluation; Parallel programming; Scalability; Benchmark testing; Writing; Metadata

AOmpLib: An Aspect Library for Large-Scale Multi-Core Parallel Programming

42nd Annual International Conference on Parallel Processing (ICPP)

Authors: Medeiros, Bruno; Sobral, Joao L. (Univ Minho, Dept Informat, CCTC, Braga, Portugal)

ISBN: (print) 9780769551173

This paper introduces an aspect-oriented library that aims to support efficient execution of Java applications on multi-core systems. The library is coded in AspectJ and provides a set of parallel programming abstractions that mimic the OpenMP standard. The library supports the migration of sequential Java code to multi-core machines with minor changes to the base code, intrinsically supports the sequential semantics of OpenMP, and provides improved integration with object-oriented mechanisms. The aspect-oriented nature of the library enables the encapsulation of parallelism-related code into well-defined modules. This approach makes the parallelisation and maintenance of large-scale Java applications more manageable. Furthermore, the library can be used with plain Java annotations and can be easily extended with application-specific mechanisms in order to tune application performance. The library performs competitively with traditional parallel programming in Java and enhances programmability, since it allows parallelism-related code to be developed independently.

Keywords: Java; Aspect-oriented programming; Parallel programming; OpenMP

Development of a Library for Image Processing Using Openmpi and Openmp

Electronics and Sustainable Communication Systems (ICESC), 2020 International Conference on

Authors: Robinson Oliva-Salazar, Wilver Auccahuasi (Universidad Cientifica del Sur, Lima, Perú)

ISBN: (digital) 9798350379945

ISBN: (print) 9798350379952

Across many areas of knowledge, the amount of information that must be processed keeps growing, which has led to many solutions for implementing high-performance computing (HPC). These solutions depend on many factors, including the architectures available. This work presents a method for configuring a low-cost HPC solution using the OpenMP and OpenMPI libraries. It describes the steps needed to implement programs that exploit these two libraries for parallel programming. The study applies the methodology to file compression implemented with Huffman's algorithm. The results demonstrate the optimization gained from parallel work with the OpenMP and OpenMPI libraries, which makes it possible to use all processors available in different computer architectures. The study also indicates how to use and apply the methodology described.

Keywords: Image coding; Parallel programming; Operating systems; Linux; High performance computing; Graphics processing units; Computer architecture; Libraries; Satellite images; Optimization
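
The abstract above names Huffman's algorithm as its compression workload but does not show it. As a rough illustration only, the following is a minimal sequential sketch of Huffman coding in Python. It is not the authors' OpenMP/OpenMPI implementation, and the function names (`huffman_codes`, `encode`, `decode`) are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(data: str) -> dict:
    """Build a prefix-free code table from symbol frequencies."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(weight, next(tie), {sym: ""}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


def encode(data: str, codes: dict) -> str:
    """Concatenate the code word of every symbol."""
    return "".join(codes[ch] for ch in data)


def decode(bits: str, codes: dict) -> str:
    """Walk the bit string, emitting a symbol whenever a code word completes."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

A parallel version would typically split the input into chunks and count frequencies per chunk before merging the counts; that decomposition is an assumption here and is not taken from the paper.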

A Brief Survey of Formal Models of Concurrency


arXiv, 2024

Author: Averill, Charles

The ubiquity of networking infrastructure in modern life necessitates scrutiny into networking fundamentals to ensure the safety and security of that infrastructure. The formalization of concurrent algorithms, a cornerstone of networking, is a longstanding area of research in which models and frameworks describing distributed systems are established. Despite its long history of study, the challenge of concisely representing and verifying concurrent algorithms remains unresolved. Existing formalisms, while powerful, often fail to capture the dynamic nature of real-world concurrency in a manner that is both comprehensive and scalable. This paper explores the evolution of formal models of concurrency over time, investigating their generality and utility for reasoning about real-world networking programs. Four foundational papers on formal concurrency are considered: Hoare's Parallel Programming: An Axiomatic Approach [3], Milner's A Calculus of Mobile Processes [7], O'Hearn's Resources, Concurrency and Local Reasoning [8], and the recent development of Coq's Iris framework [5]. © 2024, CC BY-NC-ND.

Keywords: Parallel programming

Examining the Expert Gap in Parallel Programming

19th International Conference on Euro-Par

Authors: Nanz, Sebastian; West, Scott; da Silveira, Kaue Soares (Swiss Fed Inst Technol, Zurich, Switzerland; Google Inc, Zurich, Switzerland)

ISBN: (print) 9783642400476

Parallel programming is often regarded as one of the hardest programming disciplines. On the one hand, parallel programs are notoriously prone to concurrency errors; on the other, while trying to avoid such errors, achieving program performance becomes a significant challenge. As a result of the multicore revolution, however, parallel programming has ceased to be a task for domain experts only. For this reason, a large variety of languages and libraries have been proposed that promise to ease this task. This paper presents a study to investigate whether such approaches succeed in closing the gap between domain experts and mainstream developers. Four approaches are studied: Chapel, Cilk, Go, and Threading Building Blocks (TBB). Each approach is used to implement a suite of benchmark programs, which are then reviewed by notable experts in the language. By comparing original and revised versions with respect to source code size, coding time, execution time, and speedup, we gain insights into the importance of expert knowledge when using modern parallel programming approaches.

Keywords: Parallel programming

LASSI: An LLM-Based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes

IEEE International Conference on Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS)

Authors: Matthew T. Dearing, Yiheng Tao, Xingfu Wu, Zhiling Lan, Valerie Taylor (University of Illinois Chicago, USA; Argonne National Laboratory, USA)

ISBN: (digital) 9798350383454

ISBN: (print) 9798350383461

This paper addresses the problem of providing a novel approach to sourcing significant training data for LLMs focused on science and engineering. In particular, a crucial challenge is sourcing parallel scientific codes in the ranges of millions to billions of codes. To tackle this problem, we propose an automated pipeline framework called LASSI, designed to translate between parallel programming languages by bootstrapping existing closed- or open-source LLMs. LASSI incorporates autonomous enhancement through self-correcting loops where errors encountered during the compilation and execution of generated code are fed back to the LLM through guided prompting for debugging and refactoring. We highlight the bidirectional translation of existing GPU benchmarks between OpenMP target offload and CUDA to validate LASSI. The results of evaluating LASSI with different application codes across four LLMs demonstrate the effectiveness of LASSI for generating executable parallel codes, with 80% of OpenMP to CUDA translations and 85% of CUDA to OpenMP translations producing the expected output. We also observe approximately 78% of OpenMP to CUDA translations and 62% of CUDA to OpenMP translations execute within 10% of or at a faster runtime than the original benchmark code in the same language.

Keywords: Codes; Runtime; Parallel programming; Conferences; Large language models; Pipelines; Graphics processing units; Training data; Debugging; Benchmark testing

Implementation of Longest Common Subsequence Algorithm Using Thread Parallelization in Java

International Conference on Business and Industrial Research (ICBIR)

Authors: Mark Phil B. Pacot, Gleen A. Dalaorao (Department of Computer Science, Caraga State University, Caraga Region, Philippines; Department of Information Technology, Caraga State University, Caraga Region, Philippines)

ISBN: (digital) 9798350383027

ISBN: (print) 9798350383034

Sequence alignment is a pivotal method in bioinformatics, used to ascertain the degree of similarity between diverse sequences such as DNA, RNA, and amino acids. Among the many techniques used to tackle sequence alignment, the Longest Common Subsequence (LCS) takes center stage. This paper examines how to improve LCS efficiency through thread parallelization. Building on the seminal 1974 work of Wagner and Fischer, both the sequential and the parallel techniques consistently identify the maximum length of the LCS. This research goes a step further by introducing thread parallelization, which leverages multithreading, resource synchronization, and task decomposition within the domain of parallel programming. Integrating these techniques yields a notable improvement in running time compared to the conventional iterative sequential approach. Both the sequential and the parallel approaches were implemented and evaluated using NetBeans, an Integrated Development Environment (IDE) for the Java programming language. The findings show that the thread parallelization strategy performs better, reducing the execution time needed to solve the LCS problem.

Keywords: Java; Parallel programming; Multithreading; Instruction sets; RNA; Synchronization; Bioinformatics; Parallel algorithms; Standards; Optimization
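
To make the LCS recurrence behind the abstract above concrete, here is a minimal Python sketch of the standard dynamic-programming table, followed by the same computation visited one anti-diagonal at a time. The paper's implementation is in Java and its exact decomposition is not given in the abstract, so the wavefront ordering and the function names (`lcs_length`, `lcs_length_wavefront`) are assumptions used for illustration.

```python
def lcs_length(a: str, b: str) -> int:
    """Row-by-row dynamic programming for the LCS length."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def lcs_length_wavefront(a: str, b: str) -> int:
    """Same recurrence, filled one anti-diagonal (i + j == d) at a time.

    Every cell on diagonal d reads only diagonals d - 1 and d - 2, so the
    cells of one diagonal are independent and could be split across
    threads, with a barrier between consecutive diagonals.
    """
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for d in range(2, m + n + 1):
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]
```

Both functions return the same value for any pair of inputs; for example, `lcs_length("AGGTAB", "GXTXAYB")` is 4 (the subsequence `GTAB`). The sketch runs the wavefront sequentially and only marks where a threaded version would divide the work.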

Exploring Fine-grained Task Parallelism on Simultaneous Multithreading Cores


arXiv, 2024

Authors: Los, Denis; Petushkov, Igor (Moscow Institute of Physics and Technology, 9 Institutskiy per., Moscow Region, Dolgoprudny 141700, Russia)

Nowadays, latency-critical, high-performance applications are parallelized even on power-constrained client systems to improve performance. However, the important scenario of fine-grained tasking on simultaneous multithreading CPU cores in such systems has not been well researched in previous work. Hence, in this paper, we conduct a performance analysis of state-of-the-art shared-memory parallel programming frameworks on simultaneous multithreading cores using real-world fine-grained application kernels. We introduce a specialized and simple software-only parallel programming framework called Relic to enable extremely fine-grained tasking on simultaneous multithreading cores. Using the Relic framework, we increase performance speedups over serial implementations of benchmark kernels by 19.1% compared to LLVM OpenMP, by 31.0% compared to GNU OpenMP, by 20.2% compared to Intel OpenMP, by 33.2% compared to X-OpenMP, by 30.1% compared to oneTBB, by 23.0% compared to Taskflow, and by 21.4% compared to OpenCilk. © 2024, CC BY.

Keywords: Parallel programming

Touch2C: A code conversion method from programming language for swarm intelligent building to C language

Chinese Control and Decision Conference, CCDC

Authors: Wenjie Chen, Qiliang Yang, Jianchun Xing, Shuo Zhao, Chenxi Hu, Chao Mou (College of Defense Engineering, Army Engineering University of PLA, Nanjing, China; China Xi’an Satellite Control Center, Xi’an, China)

ISBN: (digital) 9798350387780

ISBN: (print) 9798350387797

The Touch programming language for swarm intelligent building application (APP) development effectively lowers development difficulty and the programming threshold for users, making buildings more intelligent. However, features of the Touch language, such as intuitive modeling of building elements, parallel programming, and the implicit specification of inter-node communication, make it very challenging to compile Touch into the low-level executable object code of swarm intelligent buildings, and APP development efficiency is low. This paper proposes a code conversion method from Touch to C, together with supporting tools, and designs code conversion algorithms for the Touch language elements that describe distributed physical building objects and the parallel computing mode. The method automatically converts high-level Touch code, which is user-oriented and hides the details of the underlying interactions, into C code for low-level execution. This realizes an integrated process from high-level APP development to execution on the low-level hardware platform and improves APP development efficiency.

Keywords: Codes; Parallel programming; Buildings; Semantics; C languages; Distributed databases; Parallel processing

An overview of parallel processing of rectangular determinant calculation

Mediterranean Conference on Embedded Computing (MECO)

Authors: Besnik Duriqi, Halil Snopçe, Armend Salihu, Artan Luma (Faculty of Computer Science, South East European University - SEEU, Republic of North Macedonia; Faculty of Computer Science, Republic of North Macedonia; Department of Computer Science, UNI Universum International College, Prishtina, Republic of Kosovo)

ISBN: (digital) 9798350387568

ISBN: (print) 9798350387575

This paper focuses on developing algorithms for parallel determinant processing, a crucial task in linear algebra and computational mathematics. The aim is to improve efficiency in high-performance computing environments by designing and analyzing algorithms that use parallel processing to speed up determinant computation for a range of matrices. The research explores methods such as Laplace expansion, LU decomposition, eigenvalue decomposition, Gaussian elimination, and cofactor expansion, assessing their efficiency, scalability, and applicability in different computational environments. The study employs advanced parallel programming techniques and architectures on multi-core processors, with a focus on applying Chio’s method to the parallel processing of rectangular determinants. The research also investigates the mathematical underpinnings of parallel determinant algorithms, addressing challenges such as load balancing, data distribution, and synchronization. The results show significant improvements in the efficiency of determinant calculation, reducing computation times for large matrices.

Keywords: Multicore processing; Parallel programming; Scalability; Signal processing algorithms; Computer architecture; Linear algebra; Parallel processing
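
The abstract above centers on Chio’s method. As a hedged illustration, the sketch below implements the classical Chio condensation for square matrices only, using exact rational arithmetic; the rectangular-determinant variant studied in the paper is not reproduced here, and the function name `chio_determinant` is hypothetical. Each step replaces a k×k matrix A with pivot p = a₁₁ by a (k−1)×(k−1) matrix B whose entries are p·a_ij − a_i1·a_1j, and det(A) = det(B) / p^(k−2).

```python
from fractions import Fraction


def chio_determinant(matrix):
    """Determinant of a square matrix by repeated Chio condensation."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    if any(len(row) != n for row in rows):
        raise ValueError("matrix must be square")
    if n == 0:
        return Fraction(1)  # determinant of the empty matrix
    sign = 1
    scale = Fraction(1)  # product of the pivot powers p ** (k - 2)
    while len(rows) > 1:
        k = len(rows)
        # Chio's method needs a non-zero pivot in the top-left corner.
        pivot_row = next((r for r in range(k) if rows[r][0] != 0), None)
        if pivot_row is None:
            return Fraction(0)  # an all-zero first column means det = 0
        if pivot_row != 0:
            rows[0], rows[pivot_row] = rows[pivot_row], rows[0]
            sign = -sign  # a row swap flips the sign of the determinant
        p = rows[0][0]
        # Every entry of the condensed matrix is an independent 2x2 minor,
        # which is where a parallel version could distribute the work.
        rows = [
            [p * rows[i][j] - rows[i][0] * rows[0][j] for j in range(1, k)]
            for i in range(1, k)
        ]
        scale *= p ** (k - 2)
    return sign * rows[0][0] / scale
```

For example, `chio_determinant([[1, 2, 3], [4, 5, 6], [7, 8, 10]])` evaluates to −3. The row swap for a zero pivot is a common safeguard added for this sketch, not a detail taken from the paper.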
