In this paper, new constructs for synchronization in parallel programming languages are presented for shared memory multiprocessors. The motivation behind the design of these new constructs is to relieve programmers f...
ISBN (print): 9781595939746
We present an Adaptive Mesh Refinement benchmark for evaluating the programmability and performance of modern parallel programming languages. The benchmarks employed today by language development teams, originally designed for performance evaluation of computer architectures, do not fully capture the complexity of state-of-the-art computational software systems running on today's parallel machines or on emerging ones, from multi-cores to peta-scale High Productivity Computing Systems. This benchmark, extracted from a real application framework, challenges a programming language in both expressiveness and performance. It consists of an infrastructure for finite difference calculations on block-structured adaptive meshes and a solver for elliptic Partial Differential Equations built on this infrastructure. Adaptive Mesh Refinement algorithms are challenging to implement because of the irregularity introduced by local mesh refinement. We describe the challenges posed by this benchmark through two reference implementations (C++/Fortran/MPI and Titanium) and in the context of three programming models.
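To make the benchmark's building blocks concrete, the following is a minimal sketch (not taken from the benchmark; names and layout are illustrative) of one Jacobi relaxation sweep for the 2D Poisson equation on a single uniform block. An AMR framework applies kernels of exactly this shape block by block across refinement levels.

```cpp
#include <vector>

// One Jacobi sweep for the 2D Poisson equation -laplace(u) = f on one
// nx-by-ny block with grid spacing h. Writes u_new from u_old; boundary
// cells are left untouched (ghost-cell exchange between blocks is omitted).
void jacobi_sweep(std::vector<double>& u_new,
                  const std::vector<double>& u_old,
                  const std::vector<double>& f,
                  int nx, int ny, double h) {
    for (int j = 1; j < ny - 1; ++j)
        for (int i = 1; i < nx - 1; ++i)
            u_new[j * nx + i] = 0.25 * (u_old[j * nx + (i - 1)] +
                                        u_old[j * nx + (i + 1)] +
                                        u_old[(j - 1) * nx + i] +
                                        u_old[(j + 1) * nx + i] +
                                        h * h * f[j * nx + i]);
}
```

The irregularity the abstract refers to enters not in this smooth stencil but in the bookkeeping around it: filling ghost cells from neighbouring blocks that may sit at a different refinement level.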
This special issue aims to present new developments and advances in techniques for assessing the performance portability of high performance computing applications. It contains revised and extended versions of selected p...
This paper discusses a novel approach to implementing OpenMP on clusters. Traditional approaches rely on Software Distributed Shared Memory systems to handle shared data. We discuss these and then introduce an alternative approach that translates OpenMP to Global Arrays (GA), explaining the basic strategy. GA requires a data distribution. We do not expect the user to supply this; rather, we show how we perform data distribution and work distribution according to the user-supplied OpenMP static loop schedules. An inspector-executor strategy is employed for irregular applications in order to gather information on accesses to potentially non-local data, group non-local data transfers, and overlap communications with local computations. Furthermore, a new directive, INVARIANT, is proposed to provide information about the dynamic scope of data access patterns. This directive can help us generate efficient code for irregular applications using the inspector-executor approach. We also illustrate how to deal with some hard cases involving reshaping and strided accesses during the translation. Our experiments show promising results for the corresponding regular and irregular GA codes. (c) 2005 Elsevier B.V. All rights reserved.
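The inspector-executor idea can be sketched in a few lines. In this hypothetical single-array illustration, each process owns the block b[lo, hi) of a distributed array; the inspector scans the index array once and records the non-local accesses, and the resulting schedule is reused on every iteration for as long as the access pattern stays invariant, which is the property the proposed INVARIANT directive asserts. All names here are illustrative; in the actual translation the fetch step is a grouped Global Arrays get.

```cpp
#include <vector>

// Inspector for the irregular access a[i] = b[idx[i]], where this process
// owns b[lo, hi). Run once per dynamic scope in which idx is invariant.
struct Schedule {
    std::vector<int> remote;  // indices of b held by other processes
};

Schedule inspect(const std::vector<int>& idx, int lo, int hi) {
    Schedule s;
    for (int i : idx)
        if (i < lo || i >= hi)    // non-local: must be fetched
            s.remote.push_back(i);
    return s;
}

// Executor (not shown): fetch all elements in s.remote in one grouped
// transfer, then run the loop body on purely local data.
```

Grouping the transfers this way is also what lets the runtime overlap communication with whatever local computation is already runnable, as the abstract describes.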
We survey parallel programming models and languages using six criteria to assess their suitability for realistic portable parallel programming. We argue that an ideal model should be easy to program, should have a software development methodology, should be architecture-independent, should be easy to understand, should guarantee performance, and should provide accurate information about the cost of programs. These criteria reflect our belief that developments in parallelism must be driven by a parallel software industry based on portability and efficiency. We consider programming models in six categories, depending on the level of abstraction they provide. Those that are very abstract conceal even the presence of parallelism at the software level. Such models make software easy to build and port, but efficient and predictable performance is usually hard to achieve. At the other end of the spectrum, low-level models make all of the messy issues of parallel programming explicit (how many threads, how to place them, how to express communication, and how to schedule communication), so that software is hard to build and not very portable, but is usually efficient. Most recent models are near the center of this spectrum, exploring the best tradeoffs between expressiveness and performance. A few models have achieved both abstractness and efficiency. Both kinds of models raise the possibility of parallelism as part of the mainstream of computing.
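The spectrum the survey describes can be seen in miniature in the sketch below (not from the paper): the same reduction written against a very abstract model, where parallelism is implicit, and against a low-level one, where thread count, placement, and combination are all explicit.

```cpp
#include <cstddef>
#include <execution>
#include <numeric>
#include <thread>
#include <vector>

// Abstract end of the spectrum: parallelism is implicit, the code is easy
// to write and port, but its cost is opaque.
double sum_high(const std::vector<double>& v) {
    return std::reduce(std::execution::par, v.begin(), v.end(), 0.0);
}

// Low-level end: thread count, work placement, and the final combination
// are all explicit (assumes nthreads >= 1).
double sum_low(const std::vector<double>& v, unsigned nthreads) {
    std::vector<double> partial(nthreads, 0.0);
    std::vector<std::thread> pool;
    std::size_t chunk = v.size() / nthreads;
    for (unsigned t = 0; t < nthreads; ++t) {
        std::size_t lo = t * chunk;
        std::size_t hi = (t + 1 == nthreads) ? v.size() : lo + chunk;
        pool.emplace_back([&partial, &v, lo, hi, t] {
            for (std::size_t i = lo; i < hi; ++i) partial[t] += v[i];
        });
    }
    for (auto& th : pool) th.join();
    double s = 0.0;
    for (double p : partial) s += p;
    return s;
}
```

The first version is portable and hard to misuse but gives no cost information; the second exposes every messy issue, which is precisely the trade-off the six criteria probe.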
Many parallel algorithms are naturally expressed at a fine level of granularity, often finer than a MIMD parallel system can exploit efficiently. Most builders of parallel systems have looked to either the programmer or a parallelizing compiler to increase the granularity of such algorithms. In this paper, we explore a third approach to the granularity problem by analyzing two strategies for combining parallel tasks dynamically at runtime. We reject the simpler load-based inlining method, where tasks are combined based on dynamic load level, in favor of the safer and more robust lazy task creation method, where tasks are created only retroactively as processing resources become available. These strategies grew out of work on Mul-T [17], an efficient parallel implementation of Scheme, but could be used with other languages as well. We describe our Mul-T implementations of lazy task creation for two contrasting machines, and present performance statistics which show the method's effectiveness. Lazy task creation allows efficient execution of naturally expressed algorithms of a substantially finer grain than possible with previous parallel Lisp systems. Earlier versions of this paper appeared as [20] and [21].
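To make the contrast concrete, here is a minimal sketch (not from the paper; the load heuristic is illustrative) of load-based inlining, the simpler strategy the authors reject: the spawn-or-inline decision is made irrevocably at call time from the current load. Lazy task creation instead always starts the child inline but saves enough state that an idle processor can retroactively steal the continuation, so a load estimate that later proves wrong costs nothing.

```cpp
#include <atomic>
#include <future>
#include <thread>

// Crude load estimate: how many workers look free right now.
static std::atomic<int> spare_workers{
    static_cast<int>(std::thread::hardware_concurrency()) - 1};

long pfib(int n) {
    if (n < 2) return n;
    if (spare_workers.fetch_sub(1) > 0) {          // a worker looked free
        std::future<long> f = std::async(std::launch::async, pfib, n - 1);
        long b = pfib(n - 2);
        long a = f.get();
        spare_workers.fetch_add(1);
        return a + b;
    }
    spare_workers.fetch_add(1);
    // Irrevocably inlined: even if every other processor goes idle a
    // moment later, this call can no longer be parallelized -- the
    // weakness lazy task creation is designed to remove.
    return pfib(n - 1) + pfib(n - 2);
}
```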
This paper proposes extensions of sequential programming languages for parallel programming that have the following features: 1) Dynamic Structures: the process structure is dynamic; processes and variables can be created and deleted. 2) Paradigm Integration: the programming notation supports shared memory and message passing models. 3) Determinism: demonstrating that a program is deterministic (all executions with the same input produce the same output) is straightforward. Programs can be written so that compilers can verify that the programs are deterministic. Nondeterministic constructs can be introduced in a sequence of refinement steps to obtain greater efficiency if required. The ideas have been incorporated in an extension of Fortran, but the underlying sequential imperative language is not central to the ideas described here. A compiler for the Fortran extension, called Fortran M, is available by anonymous ftp from Argonne National Laboratory. Fortran M has been used for a variety of parallel applications.
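A sketch of the kind of construct that makes such determinism checkable: a typed channel with exactly one sender and one receiver, so the sequence of values received cannot depend on scheduling. This C++ rendering is purely illustrative; Fortran M's actual PROCESS/CHANNEL notation is richer.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Single-producer, single-consumer typed channel. With one writer per
// channel, the reader always observes the writer's program-order sequence,
// which is what makes the composed program deterministic.
template <typename T>
class Channel {
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void send(T v) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    T receive() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        T v = std::move(q_.front());
        q_.pop();
        return v;
    }
};
```

Restricting each channel to one sender and one receiver is exactly the kind of structural property a compiler can verify, matching the paper's claim that determinism can be checked before any nondeterministic construct is introduced.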
In this article we discuss our implementation of a polyphase filter for real-time data processing in radio astronomy. The polyphase filter is a standard tool in digital signal processing and as such a well-established algorithm. We describe in detail our implementation of the polyphase filter algorithm and its behaviour on three generations of NVIDIA GPU cards (Fermi, Kepler, Maxwell) and on the Intel Xeon CPU and Xeon Phi (Knights Corner) platforms. All of our implementations aim to exploit the potential for data reuse that the algorithm offers. Our GPU implementations explore two different methods for achieving this: the first makes use of the L1/texture cache, the second uses shared memory. We discuss the usability of each of our implementations along with their behaviours. We measure performance in execution time, which is a critical factor for real-time systems; we also present results in terms of bandwidth (GB/s), compute (GFLOP/s) and type conversions (GTc/s). We include a presentation of our results in terms of the sample rate which can be processed in real time by a chosen platform, which more intuitively describes the expected performance in a signal processing setting. Our findings show that, for the GPUs considered, the performance of our polyphase filter when using lower precision input data is limited by type conversions rather than device bandwidth. We compare these results to an implementation on the Xeon Phi. We show that our Xeon Phi implementation has a performance that is 1.5x to 1.92x greater than our CPU implementation, but is still not sufficient to compete with the performance of the GPUs. We conclude with a comparison of our best performing code to two other implementations of the polyphase filter, showing that our implementation is faster in nearly all cases. This work forms part of the Astro-Accelerate project, a many-core accelerated real-time data processing library for digital signal processing of time-domain radio astronomy data. (C) 2016 Elsevier B.V. All rights reserved.
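For readers unfamiliar with the algorithm, the FIR front end of a polyphase filter bank reduces to a small kernel. The sketch below (illustrative names; the FFT stage that completes the filter bank is omitted) computes one output spectrum and makes the data-reuse opportunity visible: successive spectra advance the input window by only `channels` samples, so most of each window is reused, which is the reuse the GPU implementations capture through either the L1/texture cache or shared memory.

```cpp
// FIR stage of a polyphase filter bank: one output spectrum of `channels`
// samples from a window of `taps * channels` input samples. Each output
// channel c is a dot product of every channels-th input sample with the
// matching filter coefficients.
void ppf_fir(const float* in, const float* coeff, float* out,
             int channels, int taps) {
    for (int c = 0; c < channels; ++c) {
        float acc = 0.0f;
        for (int t = 0; t < taps; ++t)
            acc += in[t * channels + c] * coeff[t * channels + c];
        out[c] = acc;
    }
}
```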
The goal of this paper is to identify and discuss the basic issues of, and solutions to, parallel processing on clusters of workstations (COWs). Firstly, the identification and expression of parallelism in application programs are discussed. The following approaches to finding and expressing parallelism are characterized: parallel programming languages, parallel programming tools, sequential programming supported by distributed shared memory (DSM), and parallelising compilers. Secondly, efficient management of available parallelism is discussed. As parallel execution requires efficient management of processes and computational resources, the parallel execution environment proposed here is to be built on a distributed operating system. This system, in order to allow parallel programs to achieve high performance and transparency, should provide services such as global scheduling, process migration, local and remote process creation, computation coordination, group communication and distributed shared memory. (C) 1999 Elsevier Science B.V. All rights reserved.
We investigate the well-known Parallel Random Access Machine (PRAM) model of parallel computation as a practical parallel programming model. The two components of this project are a general-purpose PRAM programming language, called Fork95, and a library, called PAD, of fundamental, efficiently implemented parallel algorithms and data structures. We outline the main features of Fork95 as they apply to the implementation of PAD, and describe the implementation of library procedures for prefix-sums and sorting. The Fork95 compiler generates code for the SB-PRAM, a hardware emulation of the PRAM, which is currently being completed at the University of Saarbrücken. Both the language and the library can immediately be used with this machine. The project is, however, of independent interest. The programming environment can help the algorithm designer to evaluate the practicality of new parallel algorithms, and can furthermore be used as a tool for teaching and communication of parallel algorithms. (C) 1999 Elsevier Science B.V. All rights reserved.
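As an illustration of the style of primitive PAD provides, here is a minimal sketch (not from the library) of an inclusive prefix sum in the Hillis-Steele form: each of the ceil(log2 n) doubling steps is fully data-parallel, so on a PRAM with n processors the whole scan takes O(log n) time.

```cpp
#include <cstddef>
#include <vector>

// Inclusive prefix sum (Hillis-Steele). Each doubling step reads only the
// previous array, so all n element updates within a step are independent
// and could execute in lockstep on a PRAM.
std::vector<int> prefix_sums(std::vector<int> a) {
    std::vector<int> b(a.size());
    for (std::size_t d = 1; d < a.size(); d *= 2) {
        for (std::size_t i = 0; i < a.size(); ++i)  // parallel on a PRAM
            b[i] = (i >= d) ? a[i] + a[i - d] : a[i];
        a.swap(b);
    }
    return a;
}
```

This form does O(n log n) work; a work-optimal O(n) scan uses the standard up-sweep/down-sweep tree at the cost of a less direct lockstep formulation.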