ISBN (print): 3540297693
This article presents the C++ library vShark, which reduces the intra-node communication overhead of parallel programs on clusters of SMPs. The library is built on top of message-passing libraries like MPI to provide thread-safe communication and, most importantly, to improve the communication between threads within one SMP node. vShark uses a modular but transparent design which makes it independent of specific communication libraries; thus, different subsystems such as MPI, CORBA, or PVM could also be used for low-level communication. We present an implementation of vShark based on MPI and the POSIX thread library, and show that the efficient intra-node communication of vShark improves the performance of parallel algorithms.
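The core idea can be sketched in a few lines: when sender and receiver are threads on the same SMP node, a message can be handed over through shared memory instead of going through the message-passing subsystem. The class and names below are a hypothetical illustration, not the vShark API:

```python
import queue
import threading

# Hypothetical sketch (not the vShark API): messages between threads on the
# same SMP node travel through a shared-memory queue, avoiding the
# message-passing subsystem entirely -- the overhead vShark targets.
class IntraNodeChannel:
    def __init__(self):
        self._q = queue.Queue()   # thread-safe FIFO in shared memory

    def send(self, msg):
        self._q.put(msg)          # shared-memory handoff, no MPI call

    def recv(self):
        return self._q.get()      # blocks until a message arrives

def demo():
    ch = IntraNodeChannel()
    t = threading.Thread(target=ch.send, args=("hello from thread 0",))
    t.start()
    msg = ch.recv()
    t.join()
    return msg
```

A real implementation would additionally route messages whose destination lies on another node through the underlying subsystem (MPI, CORBA, or PVM), which is where the modular design matters.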
We compare the performance of three major programming models on a modern, 64-processor hardware cache-coherent machine, one of the two major types of platforms upon which high-performance computing is converging. We focus on applications that are either regular and predictable, or at least do not require fine-grained dynamic replication of irregularly accessed data. Within this class, we use programs with a range of important communication patterns. We examine whether the basic parallel algorithm and communication structuring approaches needed for best performance are similar or different among the models, whether some models have substantial performance advantages over others as problem size and number of processors change, what the sources of these performance differences are, where the programs spend their time, and whether substantial improvements can be obtained by modifying either the application programming interfaces or the implementations of the programming models on this type of tightly coupled multiprocessor platform.
Technological directions for innovative HPC software environments are discussed in this paper. We focus on industrial user requirements: heterogeneous multidisciplinary applications, performance portability, rapid prototyping and software reuse, and integration and interoperability of standard tools. The various issues are demonstrated with reference to the PQE2000 project and its programming environment, the Skeleton-based Integrated Environment (SkIE). SkIE includes a coordination language, SkIECL, allowing designers to express, in a primitive and structured way, efficient combinations of data parallelism and task parallelism. The goal is achieving fast development and good efficiency for applications in different areas. Modules developed with standard languages and tools are encapsulated into SkIECL structures to form the global application. Performance models associated with the coordination language allow powerful optimizations to be introduced both at run time and at compile time without the direct intervention of the programmer. The paper also discusses the features of the SkIE environment related to debugging, performance analysis tools, visualization, and the graphical user interface. A discussion of the results achieved in some applications developed using the environment concludes the paper. (C) 1999 Elsevier Science B.V. All rights reserved.
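The skeleton idea behind such coordination languages can be illustrated with a minimal "farm" sketch: a sequential worker module is applied to a stream of inputs, and the parallelism lives in the skeleton, not in the programmer's code. This is an illustrative analogy, not SkIECL syntax:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch of a "farm" skeleton in the spirit of skeleton-based
# coordination (not SkIECL): the skeleton supplies the parallel structure,
# the worker stays a plain sequential function.
def farm(worker, inputs, nworkers=4):
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # pool.map preserves input order, so results line up with inputs
        return list(pool.map(worker, inputs))
```

In a skeleton system, compositions of such templates (farms, pipelines, maps) give the compiler and runtime the performance model it needs to optimize without programmer intervention.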
Today, data-parallel programming models are the most successful programming models for parallel computers, both in terms of efficiency of execution and ease of use for the programmer. However, there is no parallel programming model that is conceptually simple and abstract and that can be ported efficiently to the variety of parallel architectures available. The nested data-parallel programming model has some of the desired properties of a parallel programming model. In contrast to flat data-parallel models, with this model it is possible to express irregular data structures and irregular parallel computations directly. In this paper, a collection-oriented approach to nested data parallelism is introduced. The state of the art of related research is presented and open questions are identified.
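The classic trick for executing nested data parallelism on flat hardware is flattening: an irregular nested sequence is stored as one flat array plus segment lengths, so a per-segment reduction becomes a single regular pass. The sketch below is an illustration of this idea (in the spirit of NESL-style segmented operations), not code from the paper:

```python
# Hypothetical illustration: segmented sum over a flattened nested sequence.
# values holds all inner elements back to back; seg_lengths records how many
# elements belong to each inner (possibly empty) sequence.
def segmented_sum(values, seg_lengths):
    sums, i = [], 0
    for length in seg_lengths:          # one result per inner sequence
        sums.append(sum(values[i:i + length]))
        i += length
    return sums
```

For example, the nested sequence [[1, 2], [3], [4, 5, 6]] flattens to values [1, 2, 3, 4, 5, 6] with segment lengths [2, 1, 3], and the segmented sum yields one value per irregular segment.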
We survey parallel programming models and languages using six criteria to assess their suitability for realistic portable parallel programming. We argue that an ideal model should be easy to program, should have a software development methodology, should be architecture-independent, should be easy to understand, should guarantee performance, and should provide accurate information about the cost of programs. These criteria reflect our belief that developments in parallelism must be driven by a parallel software industry based on portability and efficiency. We consider programming models in six categories, depending on the level of abstraction they provide. Those that are very abstract conceal even the presence of parallelism at the software level. Such models make software easy to build and port, but efficient and predictable performance is usually hard to achieve. At the other end of the spectrum, low-level models make all of the messy issues of parallel programming explicit (how many threads, how to place them, how to express communication, and how to schedule communication), so that software is hard to build and not very portable, but is usually efficient. Most recent models are near the center of this spectrum, exploring the best tradeoffs between expressiveness and performance. A few models have achieved both abstractness and efficiency. Both kinds of models raise the possibility of parallelism as part of the mainstream of computing.
Highly parallel machines needed to solve compute-intensive scientific applications are based on the distribution of physical memory across the compute nodes. The drawback of such systems is the necessity to write applications in the message-passing programming model. Therefore, a lot of research is going on into higher-level programming models and supporting hardware, operating-system techniques, and languages. The research direction outlined in this article is based on shared virtual memory (SVM) systems, i.e., scalable parallel systems with a global address space which support an adaptive mapping of global addresses to physical memories. We introduce programming concepts and program optimizations for SVM systems in the context of the SVM-Fortran programming environment, which is based on a shared virtual memory system implemented on the Intel Paragon. Performance results for real applications show that this environment enables users to obtain similar or better performance than by programming in HPF. (C) 1998 Elsevier Science B.V. All rights reserved.
Author: Merigot, A. (Univ Paris 11, CNRS URA 22, Integrated Circuits & Systems Architecture Group, Fundamental Electronics Institute, Orsay, France)
This paper presents a new parallel computing model called Associative Nets. This model relies on basic primitives called associations, which apply an associative operator over connected components of a subgraph of the physical interprocessor connection graph. Associations can be implemented very efficiently (in terms of hardware cost or processing time) thanks to asynchronous computation. This model is quite effective for image analysis and several other fields; as an example, graph processing algorithms are presented. While relying on a much simpler architecture, these algorithms have, in general, a complexity equivalent to that obtained by more expensive computing models, like the PRAM model.
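The essence of an association can be sketched sequentially: reduce each connected component of a graph with an associative operator and broadcast the result back to every node of that component. The names and data below are illustrative, and a union-find pass stands in for the asynchronous hardware propagation the model assumes:

```python
# Hypothetical sketch of an "association": apply an associative operator
# (default: max) over each connected component of an undirected graph and
# give every node its component's result.
def component_reduce(n, edges, values, op=max):
    parent = list(range(n))

    def find(x):                       # path-halving union-find
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:                 # union the endpoints of each edge
        parent[find(u)] = find(v)

    acc = {}                           # reduce values within each component
    for node in range(n):
        r = find(node)
        acc[r] = values[node] if r not in acc else op(acc[r], values[node])
    return [acc[find(node)] for node in range(n)]
```

On the actual model, the same computation would run asynchronously across the interprocessor connection graph, which is what makes associations cheap in hardware.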
Portability, efficiency, and ease of coding are all important considerations in choosing the programming model for a scalable parallel application. The message-passing programming model is widely used because of its portability, yet some applications are too complex to code in it while also trying to maintain a balanced computation load and avoid redundant computations. The shared-memory programming model simplifies coding, but it is not portable and often provides little control over interprocessor data transfer costs. This paper describes an approach, called Global Arrays (GAs), that combines the better features of both other models, leading to both simple coding and efficient execution. The key concept of GAs is that they provide a portable interface through which each process in a MIMD parallel program can asynchronously access logical blocks of physically distributed matrices, with no need for explicit cooperation by other processes. We have implemented the GA library on a variety of computer systems, including the Intel Delta and Paragon, the IBM SP-1 and SP-2 (all message passers), the Kendall Square Research KSR-1/2 and the Convex SPP-1200 (nonuniform-access shared-memory machines), the CRAY T3D (a globally addressable distributed-memory computer), and networks of UNIX workstations. We discuss the design and implementation of these libraries, report their performance, illustrate the use of GAs in the context of computational chemistry applications, and describe the use of a GA performance visualization tool.
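The key concept can be mimicked in a toy sketch: the matrix is physically split into row blocks owned by different "processes", yet any caller can fetch an arbitrary logical range of rows without the owners participating. This is a hypothetical illustration of the idea, not the GA library API:

```python
# Hypothetical sketch of the Global Arrays idea (not the GA API): a matrix
# is stored as row blocks, one per owning "process"; get() addresses logical
# global rows and transparently gathers them across owners.
class GlobalArray:
    def __init__(self, blocks):
        self.blocks = blocks            # blocks[i] = rows owned by process i
        self.offsets = []               # first global row index of each block
        row = 0
        for b in blocks:
            self.offsets.append(row)
            row += len(b)
        self.nrows = row

    def get(self, lo, hi):
        """Fetch global rows lo..hi-1, possibly spanning several owners."""
        out = []
        for off, b in zip(self.offsets, self.blocks):
            for i, r in enumerate(b):
                if lo <= off + i < hi:
                    out.append(list(r))  # one-sided copy; owner not involved
        return out
```

In the real library, the one-sided access maps onto remote memory operations or interrupt-driven messaging depending on the platform, which is what makes the interface portable across the machines listed above.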
A Polymorphic Processor Array (PPA) is a two-dimensional mesh-connected array of processors, in which each processor is equipped with a switch able to interconnect its four NEWS ports. PPA is an abstract architecture based upon the experience acquired in the design and implementation of a VLSI chip, namely the Polymorphic Torus (PT) chip, and, as a consequence, it only includes capabilities that have been proved to be supported by cost-effective hardware structures. The main claims of PPA are that 1) it models a realistic class of parallel computers, 2) it supports the definition of high-level programming models, 3) it supports virtual parallelism, and 4) it supports low-complexity algorithms in a number of application fields. In this paper we present both the PPA computation model and the PPA programming model; we show that the PPA computation model is realistic by relating it to the design of the PT chip, and show that the PPA programming model is scalable by demonstrating that any algorithm having O(p) complexity on a virtual PPA of size √m × √m has O(kp) complexity on a PPA of size √n × √n, with m = kn and k integer. We finally show some application algorithms in the area of numerical analysis and graph processing.
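The O(kp) bound can be read as a simple simulation argument (a sketch, not the paper's proof): with m = kn, each of the n physical processors simulates k virtual processors, so every O(1) step of the virtual machine costs O(k) physical steps:

```latex
T_{\mathrm{phys}} \;=\; k \cdot T_{\mathrm{virt}} \;=\; k \cdot O(p) \;=\; O(kp),
\qquad k = \frac{m}{n}.
```

The slowdown is thus exactly the virtualization ratio, which is what makes the programming model scalable independently of the physical array size.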
Despite rapid growth in workstation and networking technologies, the workstation environment continues to pose challenging problems for shared processing. In this paper, we present a computational model and system for the generation of distributed applications in such an environment. The well-known RPC model is modified by a novel concept known as template attachment. A computation consists of a network of sequential procedures which have been encapsulated in templates. A small selection of templates is available from which a distributed application with the desired communication behavior can be rapidly built. The system generates all the required low-level code for correct synchronization, communication, and scheduling. This results in a system that is easy to use and flexible, and can provide a programmer with the desired amount of control in using idle processing power over a network of workstations. The practical feasibility of the model has been demonstrated by implementing it for Unix-based workstation environments.
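The template-attachment idea can be sketched as follows: a plain sequential procedure is wrapped in a template that supplies the queueing, scheduling, and synchronization code the system would otherwise generate. All names here are invented for illustration; the real system generates equivalent low-level code:

```python
import queue
import threading

# Hypothetical sketch of template attachment: WorkerTemplate encapsulates a
# sequential procedure and provides RPC-style communication behavior around
# it (inbox/outbox queues, a service thread, synchronous call semantics).
class WorkerTemplate:
    def __init__(self, procedure):
        self.procedure = procedure
        self.inbox, self.outbox = queue.Queue(), queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):                    # scheduling loop the template supplies
        while True:
            item = self.inbox.get()
            if item is None:           # sentinel: shut the worker down
                break
            self.outbox.put(self.procedure(item))

    def call(self, arg):               # synchronous, RPC-like invocation
        self.inbox.put(arg)
        return self.outbox.get()

    def stop(self):
        self.inbox.put(None)
        self.thread.join()
```

Composing several such templates (pipelines, master-worker farms) over different hosts would yield the network of sequential procedures the model describes.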