The SCOOPP (Scalable Object-Oriented Parallel Programming) system efficiently adapts, at run-time, an object-oriented parallel application to any distributed-memory system. It extracts as much parallelism as possible ...
This paper proposes a performance tools interface for OpenMP, similar in spirit to the MPI profiling interface in its intent to define a clear and portable API that makes OpenMP execution events visible to runtime performance tools. We present our design using a source-level instrumentation approach based on OpenMP directive rewriting. Rules to instrument each directive and their combination are applied to generate calls to the interface consistent with directive semantics and to pass context information (e.g., source code locations) in a portable and efficient way. Our proposed OpenMP performance API further allows user functions and arbitrary code regions to be marked and performance measurement to be controlled using new OpenMP directives. To prototype the proposed OpenMP performance interface, we have developed compatible performance libraries for the Expert automatic event trace analyzer [17, 18] and the TAU performance analysis framework [13]. The directive instrumentation transformations we define are implemented in a source-to-source translation tool called OPARI. Application examples are presented for both Expert and TAU to show the OpenMP performance interface and OPARI instrumentation tool in operation. When used together with the MPI profiling interface (as the examples also demonstrate), our proposed approach provides a portable and robust solution to performance analysis of OpenMP and mixed-mode (OpenMP+MPI) applications.
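To illustrate the directive-rewriting idea, the following minimal C sketch shows a parallel region bracketed by performance-interface calls in the way a source-to-source tool such as OPARI transforms it; the hook names, the region-descriptor layout, and the printf-based stubs are illustrative assumptions rather than the exact API defined in the paper (a real library for Expert or TAU would implement the hooks instead of printing).

    #include <stdio.h>
    #include <omp.h>

    /* Illustrative region descriptor carrying source-code context. */
    typedef struct { const char *file; int line; } pomp_region;

    /* Stub hooks; the names are assumed placeholders for the proposed interface.
     * A real performance library would record events here instead of printing. */
    static void pomp_parallel_fork(pomp_region *r)  { printf("fork  %s:%d\n", r->file, r->line); }
    static void pomp_parallel_begin(pomp_region *r) { printf("begin %s:%d thread %d\n", r->file, r->line, omp_get_thread_num()); }
    static void pomp_parallel_end(pomp_region *r)   { printf("end   %s:%d thread %d\n", r->file, r->line, omp_get_thread_num()); }
    static void pomp_parallel_join(pomp_region *r)  { printf("join  %s:%d\n", r->file, r->line); }

    int main(void) {
        static pomp_region r = { "example.c", 42 };

        /* The original user code was simply:  #pragma omp parallel { work(); }
         * After source-level rewriting, the region is bracketed by hooks: */
        pomp_parallel_fork(&r);
        #pragma omp parallel
        {
            pomp_parallel_begin(&r);
            printf("work on thread %d\n", omp_get_thread_num());
            pomp_parallel_end(&r);
        }
        pomp_parallel_join(&r);
        return 0;
    }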
OpenMP has become the de facto standard for shared-memory parallel programming. The directive-based nature of OpenMP allows incremental and portable development of parallel applications for a wide range of platforms. ...
ISBN (print): 3540664432
This paper presents a system to produce efficient implementations of parallel array-based algorithms from high-level specifications. It is structured as a transformation through a series of progressively more detailed representations. This allows the use of high-level programming features without losing the fine control of low-level languages. During the transformation process, parallel implementation decisions are introduced. Finally, a representation is reached which can be translated to C+MPI.
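As a rough illustration of the kind of C+MPI code such a final representation could map to, the sketch below block-distributes an element-wise array operation across processes and gathers the result; the operation, the even block distribution, and all names are assumptions for illustration, not output of the described system.

    #include <mpi.h>
    #include <stdio.h>

    #define N 16   /* global array length; assumed divisible by the process count */

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int local_n = N / size;
        double local[N], result[N];

        /* Each process computes its own block of f(i) = 2*i. */
        for (int i = 0; i < local_n; i++)
            local[i] = 2.0 * (rank * local_n + i);

        /* Collect the distributed blocks on rank 0. */
        MPI_Gather(local, local_n, MPI_DOUBLE,
                   result, local_n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        if (rank == 0)
            for (int i = 0; i < N; i++)
                printf("result[%d] = %g\n", i, result[i]);

        MPI_Finalize();
        return 0;
    }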
We address the challenging problem of algorithm and program design for the Computational Grid by providing the application user with a set of high-level, parameterised components called skeletons. We describe a Java-based Grid programming system in which algorithms are composed of skeletons and the computational resources for executing individual skeletons are chosen using performance prediction. The advantage of our approach is that skeletons are reusable across different applications and that skeleton implementations can be tuned to particular machines. The focus of this paper is on predicting performance for Grid applications constructed using skeletons.
We demonstrate that the run time of implicitly parallel programs can be statically predicted with considerable accuracy when expressed within the constraints of a skeletal, shapely parallel programming language. Our work constitutes the first completely static system to account for both computation and communication in such a context. We present details of our language and its BSP implementation strategy together with an account of the analysis mechanism. We examine the accuracy of our predictions against the performance of real parallel programs.
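For context, a static predictor built on a BSP implementation strategy typically charges each superstep using the standard BSP cost model (a textbook formula, not quoted from the paper): if superstep s performs at most w_s local operations on any processor, exchanges at most h_s words per processor, and the machine has per-word communication cost g and barrier synchronization cost l, the predicted time over S supersteps is

    T = \sum_{s=1}^{S} ( w_s + g \cdot h_s + l )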
In this paper, we describe an undergraduate parallel programming course based upon networked workstations. The course is offered on the North Carolina Research and Education Network (NC-REN), a private telecommunications network which interconnects universities in North Carolina and provides multiway, face-to-face video and audio communications. Course materials are described and made available in a new textbook. Topics are divided into basic techniques and applications. In addition, extensive home page materials are described.
Parallel computing on interconnected workstations is becoming a viable and attractive proposition due to the rapid growth in speeds of interconnection networks and processors. In the case of workstation clusters, there is always a considerable amount of unused computing capacity available in the network. However, heterogeneity in architectures and operating systems, load variations on machines, variations in machine availability, and failure susceptibility of networks and workstations complicate the situation for the programmer. In this context, new programming paradigms that reduce the burden involved in programming for distribution, load adaptability, heterogeneity, and fault tolerance gain importance. This paper identifies the issues involved in parallel computing on a network of workstations. The Anonymous Remote Computing (ARC) paradigm is proposed to address the issues specific to parallel programming on workstation systems. ARC differs from the conventional communicating process model by treating a program as one single entity consisting of several loosely coupled remote instruction blocks instead of treating it as a collection of processes. The ARC approach results in distribution transparency and heterogeneity transparency. At the same time, it provides fault tolerance and load adaptability to parallel programs on workstations. ARC is developed in a two-tiered architecture consisting of high-level language constructs and low-level ARC primitives. The paper describes an implementation of the ARC kernel supporting ARC primitives.
The performance of a parallel simulation system depends very much on partitioning the simulation workload evenly among the set of processors in the computing environment to ensure load balance between processors. Most parallel simulation systems employ user-defined static partitioning. However, static partitioning requires in-depth domain knowledge of the specific simulation model under study. It is not effective if the workload of a simulation model cannot be quantified accurately or changes over time during a simulation run. Dynamic load-balancing allows the simulation system to automatically balance the workload of different simulation models without the user's input. In this paper, the use of dynamic load-balancing in the context of the BSP Time Warp optimistic protocol is examined. Based on the BSP cost model, a dynamic load-balancing algorithm for the BSP Time Warp protocol is developed. Using different simulation models, the paper shows that to achieve consistent performance, the dynamic load-balancing algorithm for BSP Time Warp needs to consider both computation and communication workload, as well as lookaheads between processors.
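As a purely hypothetical sketch of how such an algorithm might fold the three factors into a single per-processor load score, the C fragment below combines computation, a BSP-style communication charge, and a penalty for small lookahead; the function, its weights, and the penalty form are assumptions for illustration only, not the algorithm developed in the paper.

    /* Hypothetical per-processor load score; all weights are illustrative
     * assumptions, not taken from the paper. */
    double load_score(double events_executed,   /* computation workload     */
                      double words_sent,        /* communication volume     */
                      double g,                 /* BSP per-word comm cost   */
                      double lookahead)         /* lookahead to other LPs   */
    {
        double comm = g * words_sent;
        double la_penalty = (lookahead > 0.0) ? 1.0 / lookahead : 1e6;
        return events_executed + comm + la_penalty;   /* higher = more loaded */
    }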
In recent years, cluster computing has been widely accepted as a parallel platform because of its high performance at an affordable cost. To make the best use of cluster computing resources, a resource monitoring program is needed. The information collected can be used by any parallel application, e.g., parallel motion estimation, for handling load variation on typical time-sharing computers. Therefore, the parallel workload can be distributed properly among the n processors. In this paper, we present the development of resource monitoring for cluster computing using the MPI programming model and its application to parallel motion estimation. Results show the effectiveness of our method, with which a faster parallel execution time can be achieved.
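A minimal sketch of the underlying idea, assuming each process measures a scalar "available capacity" (faked here by a placeholder probe) that the root gathers with MPI and turns into proportional work shares; the measurement itself and the share formula are assumptions for illustration, not the paper's monitoring system.

    #include <mpi.h>
    #include <stdio.h>

    /* Placeholder for a real capacity probe (e.g. derived from CPU load). */
    static double measure_capacity(int rank) { return 1.0 + rank % 3; }

    int main(int argc, char **argv) {
        int rank, size, total_work = 1200;   /* e.g. macroblocks to estimate */
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double capacity = measure_capacity(rank);
        double caps[256];                    /* assume at most 256 processes */

        /* The root collects every process's capacity measurement. */
        MPI_Gather(&capacity, 1, MPI_DOUBLE, caps, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            double sum = 0.0;
            for (int i = 0; i < size; i++) sum += caps[i];
            /* Work share proportional to measured capacity. */
            for (int i = 0; i < size; i++)
                printf("process %d gets %d units\n", i,
                       (int)(total_work * caps[i] / sum));
        }
        MPI_Finalize();
        return 0;
    }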