In recent years, CPU manufacturers have not been able to substantially increase the instructions per cycle (IPC) of CPU cores. To overcome this situation, manufacturers have increased the raw performance of HPC systems by simultaneously increasing the number of processors, multiplying the number of cores in each processor, and integrating specialized accelerators such as GPGPUs, FPGAs, and other ASICs with specialized instruction sets. To exploit the new hardware capabilities, applications have to be written with explicit parallelism, to deal with the increasing number of cores available, and also need parts of their source code written in specialized languages to make use of the integrated accelerators. This creates a major paradigm shift from compute-centric to communication-centric execution, to which most programming models are not yet properly aligned: classical models are geared towards optimizing computational operations, assuming data access is almost free. Languages like C, for example, assume that variables are immediately available and are accessed synchronously. The new situation implies that data is distributed across the system, and communication latency has a large impact. A poor data distribution results in continual data exchange while processing units idle, waiting for the data they need to compute. Most programming models do not convey the necessary dependency information to the compiler, which must therefore be careful not to make wrong assumptions. There are successful attempts, such as OpenMP, to exploit parallelism by introducing structural information about the application. Research projects such as POLCA have developed means to introduce functional-like semantics, in the form of directives, to procedural code. These directives describe the structural behavior of the application, with the aim of allowing compilers to perform aggressive code transformations that increase performance and enable portability across different architectures.
Efficient, scalable, and productive parallel programming is a major challenge for exploiting future multiprocessor SoC platforms. This article presents the MultiFlex programming environment, which has been developed to address this challenge. It is targeted for use on Platform 2012, a scalable multiprocessor fabric. The MultiFlex environment supports high-level simulation and iterative platform mapping, and includes tools for programming-model-aware debug, trace, visualization, and analysis. This article focuses on the two classes of programming abstractions supported in MultiFlex. The first is a set of Parallel Programming Patterns (PPP), which offer a rich set of programming abstractions for implementing efficient data- and task-level parallel applications. The second is a Reactive Task Management (RTM) abstraction, which offers a lightweight C-based API to support dynamic dispatching of small-grain tasks on tightly coupled parallel processing resources. The use of the MultiFlex native programming model is illustrated through the capture and mapping of two representative video applications. The first is a high-quality rescaling (HQR) application on a multiprocessor platform. We present the details of the optimization process required for mapping the HQR application, whose reference code requires 350 GIPS (giga instructions per second), onto a 16-processor cluster. Our results show that the parallel implementation using the PPP model offers almost linear acceleration with respect to the number of processing elements. The second application is a high-definition VC-1 decoder. For this application, we illustrate two different parallel programming model variants, one using PPPs, the other based on RTM. These two versions are mapped onto two variants of a homogeneous version of the Platform 2012 multi-core fabric.
ISBN (print): 9781605584980
Massive computing systems will be needed to maintain competitiveness in all areas of science, engineering, and business, providing both management efficiency and computing capability. From a systems-management perspective, massive installations offer an efficient platform for resource sharing and service-oriented cloud computing; from a capability perspective, they allow unprecedented performance for supercomputing applications. With top supercomputing systems reaching the PetaFlop barrier, the next challenge is to devise technology to reach, and applications to take advantage of, ExaFlop performance. Multicore chips are already here but will grow over the next decade to several hundred cores. Although these chips will be used for general-purpose computing, they will be the tera-device components of future exascale systems.

Europe is aware of the importance of having a well-structured supercomputing infrastructure, as well as the need to exchange experiences and know-how across the Union. Infrastructure projects such as the current DEISA (Distributed European Infrastructure for Supercomputing Applications) or the future PRACE (Partnership for Advanced Computing in Europe) aim at setting up such coordinated resources. PRACE, in particular, will create a world-class pan-European high-performance computing service and infrastructure, managed as a single European entity. The service will include five superior supercomputing centers, strengthened by regional and national centers, working in collaboration through grid technologies. The BSC-CNS is the Spanish representative, one of five principal partners in the project (the others being Germany, France, the UK, and the Netherlands). The principal partner countries have agreed to contribute more to the PRACE budget and to host the tier-0 machines which will form part of the distributed infrastructure.

Having hit the power wall, the computing market is now undergoing a shaky era of dispersion, where many kinds of multicore alternatives...
ABSTRACT: This study analyzes planning under deterministic and stochastic inflows for the Mayurakshi project in India. Models are developed to indicate the optimal storage of reservoir water, the transfer of water to ...
We describe here the design and performance of OdinMP/CCp, a portable compiler for C programs using the OpenMP directives for parallel processing with shared memory. OdinMP/CCp was written in Java for portabi...