Search Results - Inner Mongolia University Library

MODELS FOR PRACTICAL PARALLEL COMPUTATION

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 1991, Vol. 20, No. 2, pp. 133-158

Authors: SKILLICORN, DB. Department of Computing and Information Science, Queen's University, Kingston, Canada

A major reason for the lack of practical use of parallel computers has been the absence of a suitable model of parallel computation. Many existing models are either theoretical or are tied to a particular architecture. A more general model must be architecture independent, must realistically reflect execution costs, and must reduce the cognitive overhead of managing massive parallelism. A growing number of models meeting some of these goals have been suggested. We discuss their properties and relative strengths and weaknesses. We conclude that data parallelism is a style with much to commend it, and discuss the Bird-Meertens formalism as a coherent approach to data parallel programming.

Keywords: MODELS OF PARALLEL COMPUTATION ARCHITECTURE-INDEPENDENT PROGRAMMING PRAM GRAPH REDUCTION UNITY ACTION SYSTEMS LINDA ACTORS PRAM EXTENSIONS DATA PARALLEL PROGRAMMING BIRD-MEERTENS FORMALISM

Which approach to parallelizing scientific codes - That is the question

PARALLEL COMPUTING, 1997, Vol. 23, No. 1-2, pp. 165-180

Authors: Berthou, JY; Colombet, L. CEA, DI CISI, F-38054 Grenoble 9, France

We present in this paper the strong points and limitations of semi-automatic parallelization, data parallel programming and message passing programming. We apply these to two numerical algorithms, namely a bi-dimensional Fourier transform algorithm and a conjugate gradient program. We implemented these programs with each of the different methods on a Cray T3D. The results of these experiments demonstrate the accuracy of our proposition that when the three methods are combined, efficiency, portability and ease of parallel programming may be achieved.

Keywords: automatic parallelization message passing programming data parallel programming HPF MPI PVM

Support and optimization for parallel sparse programs with array intrinsics of Fortran 90

PARALLEL COMPUTING, 2004, Vol. 30, No. 4, pp. 527-550

Authors: Chang, RG; Chuang, TR; Lee, JK. Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 30043, Taiwan; Acad Sinica, Inst Informat Sci, Taipei 115, Taiwan; Natl Chung Cheng Univ, Dept CSIE, Hsinchu, Taiwan

Fortran 90 provides a rich set of array intrinsic functions that are useful for representing array expressions and data parallel programming. However, the application of these intrinsic functions to sparse data sets in distributed memory environments is currently not supported by vendors of Fortran 90 and HPF compilers. Our recent research work has been aimed at providing parallel processing support for sparse array intrinsics of Fortran 90. Our supporting library uses the following two-level design: (1) in our low-level routines, a sparse input matrix needs to be specified with compression/distribution schemes by programmers, and (2) in the high-level representation, sparse array functions are overloaded for array intrinsic interfaces so that programmers need not be concerned about low-level details. This raises a very interesting optimization problem in the strategies used to transform high-level representations to low-level routines by the automatic selection of distribution and compression schemes for sparse data sets. In this paper, we propose solutions to address this optimization problem, which is shown to be NP-hard. We develop a heuristic algorithm based on annotated program graphs. To the best of our knowledge, our selection scheme is the first to automatically select compression and distribution schemes for sparse data arrays with the array intrinsics of Fortran 90 in a distributed memory environment. Experimental results show that our selection algorithms are consistent with our cost model, and effective in selecting appropriate compression and distribution schemes for improving the performance of application programs that operate on sparse data sets. Our experiments were performed on an IBM SP-2 machine using our parallel sparse array intrinsics for Fortran 90. (C) 2004 Published by Elsevier B.V.

Keywords: parallel sparse compiler Fortran 90 array intrinsics distributed environments optimizing compilers sparse computations data parallel programming

A semantic framework to address data locality in data parallel languages

PARALLEL COMPUTING, 2004, Vol. 30, No. 1, pp. 139-161

Authors: Violard, E. Univ Strasbourg, LSIIT, ICPS, Strasbourg, France

We developed a theory in order to address crucial questions of program design methodology. This theory deals with data locality, which is a main issue in parallel programming. In this article, we regard this theory and its model as a minimum semantic domain for data parallel languages. The introduction of a semantic domain is justified because the classical data parallel languages (HPF and C*) have different intuitive semantics: they use different concepts to express data locality. These concepts are alignment in HPF and shape in C*. Consequently, these two languages define their own balance between compiler and programmer investments in order to reach program efficiency. We present our theory as a foundation for defining a better balance. (C) 2003 Elsevier B.V. All rights reserved.

Keywords: data parallel programming equational languages semantics parallel programs design data locality

BENCHMARK OF APPLICATION SOFTWARE KERNELS ON THE SUPERNODE SN1000 USING THE 3P PARLIB

FUTURE GENERATION COMPUTER SYSTEMS, 1995, Vol. 11, No. 1, pp. 87-109

Authors: CORNUBERT, R; GRUEZ, G; STEINFELD, P; ZNATY, E. BERTIN et Cie, 59 rue Pierre Curie, BP 3, 78373 Plaisir Cedex, France

This article presents the benchmarking by BERTIN (F) of the SUPERNODE SN1000 parallel architecture from PARSYS within the framework of the BECAUSE Project. This evaluation of a Distributed Memory parallel architecture was carried out by means of the BECAUSE Benchmark Set (BBS). The guiding idea was to specify parallelisation methodologies and to develop parallel software that is machine independent and therefore portable. This approach was possible and realistic because the principle of parallelism involved is data parallel programming. As a consequence, the hardware features of the target architecture are transparent to the industrial user and are managed through a communication library called 3P PARLIB. In this paper, the principles of parallelisation that were used are presented. Practical implementation of these parallelisation principles is illustrated with various significant Test Programs from the BBS. The corresponding results are presented. Specifications for the 3P PARLIB (Portable parallel programming library) are also given.

Keywords: BECAUSE PROJECT BENCHMARKING DISTRIBUTED MEMORY MIMD PARALLEL ARCHITECTURES DATA PARALLEL PROGRAMMING DATA MAPPING DOMAIN DECOMPOSITION METHOD (DDM)

DATA-PARALLEL GEOMETRIC OPERATIONS ON LISTS

PARALLEL COMPUTING, 1995, Vol. 21, No. 3, pp. 447-459

Authors: KUMAR, KG; SKILLICORN, DB. QUEENS UNIV, DEPT COMP & INFORMAT SCI, KINGSTON, ON K7L 3N6, CANADA

We describe data parallel list operations that exploit pair structure on lists and an algebra that relates them. Equations from the algebra are used as transformation rules, so that development is done in a calculational way. We illustrate their use in applications such as FFTs and sorting, and show that optimal or near-optimal algorithms can result from a systematic calculational process. The operations have a natural, direct implementation on hypercubes.

Keywords: DATA PARALLEL PROGRAMMING PROGRAM TRANSFORMATION FFT SORTING HYPERCUBES

OPTIMAL EVALUATION OF ARRAY EXPRESSIONS ON MASSIVELY-PARALLEL MACHINES

ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 1995, Vol. 17, No. 1, pp. 123-156

Authors: CHATTERJEE, S; GILBERT, JR; SCHREIBER, R; TENG, SH. XEROX CORP, PALO ALTO RES CTR, PALO ALTO, CA 94304, USA; NASA, AMES RES CTR, RIACS, MOFFETT FIELD, CA 94035, USA; UNIV MINNESOTA, DEPT COMP SCI, MINNEAPOLIS, MN 55455, USA

We investigate the problem of evaluating Fortran 90-style array expressions on massively parallel distributed-memory machines. On such a machine, an elementwise operation can be performed in constant time for arrays whose corresponding elements are in the same processor. If the arrays are not aligned in this manner, the cost of aligning them is part of the cost of evaluating the expression tree. The choice of where to perform the operation then affects this cost. We describe the communication cost of the parallel machine theoretically as a metric space; we model the alignment problem as that of finding a minimum-cost embedding of the expression tree into this space. We present algorithms based on dynamic programming that solve the embedding problem optimally for several communication cost metrics: multidimensional grids and rings, hypercubes, fat-trees, and the discrete metric. We also extend our approach to handle operations that change the shape of the arrays.

Keywords: ALGORITHMS LANGUAGES THEORY ARRAY ALIGNMENT COMPACT DYNAMIC PROGRAMMING DATA PARALLEL PROGRAMMING DISTRIBUTED MEMORY PARALLEL PROCESSORS FIXED TOPOLOGY STEINER TREE FORTRAN 90

SAC - A functional array language for efficient multi-threaded execution

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2006, Vol. 34, No. 4, pp. 383-427

Authors: Grelck, Clemens; Scholz, Sven-Bodo. Univ Lubeck, Inst Software Technol & Programming Languages, D-23538 Lubeck, Germany; Univ Hertfordshire, Dept Comp Sci, Hatfield AL10 9AB, Herts, England

We give an in-depth introduction to the design of our functional array programming language SAC, the main aspects of its compilation into host machine code, and its parallelisation based on multi-threading. The language design of SAC aims at combining high-level, compositional array programming with fully automatic resource management for highly productive code development and maintenance. We outline the compilation process that maps SAC programs to computing machinery. Here, our focus is on optimisation techniques that aim at restructuring entire applications from nested compositions of general fine-grained operations into specialised coarse-grained operations. We present our implicit parallelisation technology for shared memory architectures based on multi-threading and discuss further optimisation opportunities on this level of code generation. Both optimisation and parallelisation rigorously exploit the absence of side-effects and the explicit data flow characteristic of a functional setting.

Keywords: compiler optimisation data parallel programming multi-threading Single Assignment C

DATACENTER-SCALE COMPUTING Introduction

IEEE MICRO, 2010, Vol. 30, No. 4, pp. 6-7

Authors: Barroso, Luiz Andre; Ranganathan, Parthasarathy. Google, Mountain View, CA 94043, USA; HP Labs, Mississauga, ON, Canada

Although the field of datacenter computing is arguably still in its relative infancy, a sizable body of work from both academia and industry is already available and some consistent technological trends have begun to emerge. This special issue presents a small sample of the work underway by researchers and professionals in this new field. The selection of articles presented reflects the key role that hardware-software codesign plays in the development of effective datacenter-scale computer systems.

Keywords: Datacenter Computing Hardware Multicore Data Parallel Programming Networking Storage Energy Efficiency

C++ components describing parallel domain decomposition and communication

INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2009, Vol. 24, No. 6, pp. 467-477

Authors: Blatt, Markus; Bastian, Peter. Ruprechts Karls Univ Heidelberg, Interdisziplinares Zentrum Wissensch Rech IWR, Neuenheimer Feld 368, D-69120 Heidelberg, Germany

Large-scale parallel codes require the data to be decomposed between the set of processes active in the computation. This data decomposition implies recurring communication schemes. The paper introduces generic template classes in C++ for describing the data decomposition. The aim is to store the data in arbitrary existing efficient sequential data structures. Each entry in the sequential data structure corresponds to an entry in the virtual global view of the container. Once the decomposition is set up, the needed communication schemes can be created automatically and can be used to communicate values from containers of various types. Even containers with a varying number of values associated with an entry are possible. The framework abstracts the decomposition information and the communication in the client code from the eventual parallel paradigm choice. A prototype based on the Message Passing Interface standard is presented. It relieves the user from specifying information that is already known at compile time.

Keywords: domain decomposition methods data parallel programming object-oriented programming generic programming finite element methods iterative solvers
