Search Results - Inner Mongolia University Library

P-SOCRATES: A parallel software framework for time-critical many-core systems

MICROPROCESSORS AND MICROSYSTEMS, 2015, Vol. 39, No. 8, pp. 1190-1203

Authors: Pinho, Luis Miguel; Nelis, Vincent; Yomsi, Patrick Meumeu; Quinones, Eduardo; Bertogna, Marko; Burgio, Paolo; Marongiu, Andrea; Scordino, Claudio; Gai, Paolo; Ramponi, Michele; Mardiak, Michal
Affiliations: ISEP, Oporto, Portugal; Barcelona Supercomp Ctr, Dept Comp Sci, Barcelona, Spain; Univ Modena, I-41100 Modena, Italy; ETH Zurich, Switzerland; Evidence Srl, Florence, Italy; Act Technol Srl, Ferrara, Italy

The current generation of computing platforms is embracing multi-core and many-core processors to improve the overall performance of the system, meeting at the same time the stringent energy budgets requested by the market. Parallel programming languages are nowadays paramount to extracting the tremendous potential offered by these platforms: parallel computing is no longer a niche in the high performance computing (HPC) field, but an essential ingredient in all domains of computer science. The advent of next-generation many-core embedded platforms has the chance of intercepting a converging need for predictable high performance coming from both the High-Performance Computing (HPC) and Embedded Computing (EC) domains. On one side, new kinds of HPC applications are being required by markets needing huge amounts of information to be processed within a bounded amount of time. On the other side, EC systems are increasingly concerned with providing higher performance in real time, challenging the performance capabilities of current architectures. This converging demand raises the problem of how to guarantee timing requirements in the presence of parallel execution. The paper presents how the time-criticality and parallelisation challenges are addressed by merging techniques coming from both HPC and EC domains, and provides an overview of the proposed framework to achieve these objectives. (c) 2015 Elsevier B.V. All rights reserved.

Keywords: Many-core systems; Real-time systems; Embedded systems; WCET analysis; Real-time scheduling; Parallel programming models

PyCOMPSs: parallel computational workflows in Python

INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2017, Vol. 31, No. 1, pp. 66-82

Authors: Tejedor, Enric; Becerra, Yolanda; Alomar, Guillem; Queralt, Anna; Badia, Rosa M.; Torres, Jordi; Cortes, Toni; Labarta, Jesus
Affiliations: Barcelona Supercomp Ctr BSC CNS, Dept Comp Sci, Barcelona, Spain; Spanish Council Sci Res CSIC, Artificial Intelligence Res Inst IIIA, Barcelona, Spain

The use of the Python programming language for scientific computing has been gaining momentum in the last years. The fact that it is compact and readable and its complete set of scientific libraries are two important characteristics that favour its adoption. Nevertheless, Python still lacks a solution for easily parallelizing generic scripts on distributed infrastructures, since the current alternatives mostly require the use of APIs for message passing or are restricted to embarrassingly parallel computations. In that sense, this paper presents PyCOMPSs, a framework that facilitates the development of parallel computational workflows in Python. In this approach, the user programs her script in a sequential fashion and decorates the functions to be run as asynchronous parallel tasks. A runtime system is in charge of exploiting the inherent concurrency of the script, detecting the data dependencies between tasks and spawning them to the available resources. Furthermore, we show how this programming model can be built on top of a Big Data storage architecture, where the data stored in the backend is abstracted and accessed from the application in the form of persistent objects.

Keywords: Scientific computing; Parallel programming models; Python; Big Data storage
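
The task-based model summarized in the abstract above (a sequential script whose decorated functions run as asynchronous tasks, with the runtime tracking data dependencies) can be illustrated with a minimal sketch. This is not the PyCOMPSs API: it uses only Python's standard `concurrent.futures` module, and the names `task`, `wait_on`, `square` and `add` are hypothetical, chosen for illustration. It also assumes all tasks are submitted from the main thread.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

# Hypothetical stand-in for the runtime's pool of available resources.
_pool = ThreadPoolExecutor(max_workers=4)


def task(func):
    """Turn a plain function into an asynchronous task that returns a Future."""
    @wraps(func)
    def submit(*args):
        def run():
            # A Future argument is a data dependency: wait for its value first.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*resolved)
        return _pool.submit(run)
    return submit


@task
def square(x):
    return x * x


@task
def add(a, b):
    return a + b


def wait_on(future):
    """Block until a task's result is available (hypothetical helper)."""
    return future.result()


# The script reads sequentially; the two square() calls are independent
# and may run concurrently, while add() waits for both of them.
a = square(3)
b = square(4)
total = wait_on(add(a, b))
```

In this sketch a dependency is detected simply by noticing that an argument is a `Future`; a production runtime such as the one the abstract describes would also handle distributed resources and data transfer, which are out of scope here.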

Easy Dataflow Programming in Clusters with UPC++ DepSpawn

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2019, Vol. 30, No. 6, pp. 1267-1282

Authors: Fraguela, Basilio B.; Andrade, Diego
Affiliations: Univ A Coruna, Fac Informat, Grp Arquitectura Comp, Campus Elvina S-N, La Coruna 15071, Spain

The Partitioned Global Address Space (PGAS) programming model is one of the most relevant proposals to improve the ability of developers to exploit distributed memory systems. However, despite its important advantages with respect to the traditional message-passing paradigm, PGAS has not yet been widely adopted. We think that PGAS libraries are more promising than languages because they avoid the requirement to (re)write the applications using them, with the implied uncertainties related to portability and interoperability with the vast amount of APIs and libraries that exist for widespread languages. Nevertheless, the need to embed these libraries within a host language can limit their expressiveness, and very useful features can be missing. This paper contributes to the advance of PGAS by enabling the simple development of arbitrarily complex task-parallel codes following a dataflow approach on top of the PGAS UPC++ library, implemented in C++. In addition, our proposal, called UPC++ DepSpawn, relies on an optimized multithreaded runtime that provides very competitive performance, as our experimental evaluation shows.

Keywords: Libraries; parallel programming models; distributed memory; multithreading; programmability; dataflow

POLYMORPHIC PROCESSOR ARRAYS

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1993, Vol. 4, No. 5, pp. 490-506

Author: MARESCA, M
Affiliation: INT COMP SCI INST, I-16145 GENOA, ITALY

A Polymorphic Processor Array (PPA) is a two-dimensional mesh-connected array of processors, in which each processor is equipped with a switch able to interconnect its four NEWS ports. PPA is an abstract architecture based upon the experience acquired in the design and in the implementation of a VLSI chip, namely the Polymorphic Torus (PT) chip, and, as a consequence, it only includes capabilities that have been proved to be supported by cost-effective hardware structures. The main claims of PPA are that 1) it models a realistic class of parallel computers, 2) it supports the definition of high level programming models, 3) it supports virtual parallelism and 4) it supports low complexity algorithms in a number of application fields. In this paper we present both the PPA computation model and the PPA programming model; we show that the PPA computation model is realistic by relating it to the design of the PT chip and show that the PPA programming model is scalable by demonstrating that any algorithm having $O(p)$ complexity on a virtual PPA of size $\sqrt{m} \times \sqrt{m}$ has $O(kp)$ complexity on a PPA of size $\sqrt{n} \times \sqrt{n}$, with $m = kn$ and $k$ integer. We finally show some application algorithms in the area of numerical analysis and graph processing.

Keywords: DATA PARALLEL ALGORITHMS; MASSIVELY PARALLEL COMPUTERS; MESH CONNECTED COMPUTERS; PARALLEL COMPUTATION MODELS; PARALLEL PROGRAMMING MODELS; RECONFIGURABLE PARALLEL COMPUTERS; SIMD ARCHITECTURES

SkIE: A heterogeneous environment for HPC applications

PARALLEL COMPUTING, 1999, Vol. 25, No. 13-14, pp. 1827-1852

Authors: Bacci, B; Danelutto, M; Pelagatti, S; Vanneschi, M
Affiliations: Univ Pisa, Dipartimento Informat, I-56125 Pisa, Italy; Quadr Supercomp World Ltd, I-56125 Pisa, Italy

Technological directions for innovative HPC software environments are discussed in this paper. We focus on industrial user requirements of heterogeneous multidisciplinary applications, performance portability, rapid prototyping and software reuse, integration and interoperability of standard tools. The various issues are demonstrated with reference to the PQE2000 project and its programming environment, the Skeleton-based Integrated Environment (SkIE). SkIE includes a coordination language, SkIECL, allowing the designers to express, in a primitive and structured way, efficient combinations of data parallelism and task parallelism. The goal is achieving fast development and good efficiency for applications in different areas. Modules developed with standard languages and tools are encapsulated into SkIECL structures to form the global application. Performance models associated to the coordination language allow powerful optimizations to be introduced both at run time and at compile time without the direct intervention of the programmer. The paper also discusses the features of the SkIE environment related to debugging, performance analysis tools, visualization and graphical user interface. A discussion of the results achieved in some applications developed using the environment concludes the paper. (C) 1999 Elsevier Science B.V. All rights reserved.

Keywords: parallel programming environments; parallel programming models; structured parallel programming

Runtime Support for Multiple Offload-Based Programming Models on Clustered Manycore Accelerators

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2018, Vol. 6, No. 3, pp. 330-342

Authors: Capotondi, Alessandro; Marongiu, Andrea; Benini, Luca
Affiliations: Univ Bologna, Dept Elect Elect & Informat Engn Guglielmo Marcon, I-40126 Bologna, Italy; Swiss Fed Inst Technol, Swiss Fed Inst Technol Zurich, Dept Informat Technol & Elect Engn, CH-8092 Zurich, Switzerland

Heterogeneous systems coupling a main host processor with one or more manycore accelerators are being adopted virtually at every scale to achieve ever-increasing GOps/Watt targets. The increased hardware complexity of such systems is paired at the application level by a growing number of applications concurrently running on the system. Techniques that enable efficient accelerator resources sharing, supporting multiple programming models will thus be increasingly important for future heterogeneous SoCs. In this paper we present a runtime system for a cluster-based manycore accelerator, optimized for the concurrent execution of offloaded computation kernels from different programming models. The runtime supports spatial partitioning, where clusters can be grouped into several virtual accelerator instances. Our runtime design is modular and relies on a generic component for resource (cluster) scheduling, plus specialized components which deploy generic offload requests into the target programming model semantics. We evaluate the proposed runtime system on two real heterogeneous systems, focusing on two concrete use cases: i) single-user, multi-application high-end embedded systems and ii) multi-user, multi-workload low-power microservers. In the first case, our approach achieves 93 percent efficiency in terms of available accelerator resource exploitation. In the second case, our support allows 47 percent performance improvement compared to single-programming model systems.

Keywords: parallel programming models; heterogeneous computing; clustered manycores; OpenMP; OpenCL

Employing nested OpenMP for the parallelization of multi-zone computational fluid dynamics applications

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2006, Vol. 66, No. 5, pp. 686-697

Authors: Ayguade, E; Gonzalez, M; Martorell, X; Jost, G
Affiliations: UPC, Ctr Europeu Parallelisme Barcelona, Barcelona 08034, Spain; NASA, Ames Res Ctr, NAS Div, Moffett Field, CA 94035, USA

In this paper we describe the parallelization of the multi-zone code versions of the NAS Parallel Benchmarks employing multi-level OpenMP parallelism. For our study, we use the NanosCompiler that supports nesting of OpenMP directives and provides clauses to control the grouping of threads, load balancing, and synchronization. We report the benchmark results, compare the timings with those of different hybrid parallelization paradigms (MPI+OpenMP and PLP) and discuss OpenMP implementation issues that affect the performance of multi-level parallel applications. (c) 2005 Elsevier Inc. All rights reserved.

Keywords: OpenMP; parallel programming models; nested parallelism; NAS benchmarks

A software stack for next-generation automotive systems on many-core heterogeneous platforms

MICROPROCESSORS AND MICROSYSTEMS, 2017, Vol. 52 (Jul.), pp. 299-311

Authors: Burgio, Paolo; Bertogna, Marko; Capodieci, Nicola; Cavicchioli, Roberto; Sojka, Michal; Houdek, Premysl; Marongiu, Andrea; Gai, Paolo; Scordino, Claudio; Morelli, Bruno
Affiliations: Univ Modena & Reggio Emilia, Modena, Italy; Czech Tech Univ, Prague, Czech Republic; Swiss Fed Inst Technol, Zurich, Switzerland; Evidence Srl, Pisa, Italy

The next generation of partially and fully autonomous cars will be powered by embedded many-core platforms. Technologies for Advanced Driver Assistance Systems (ADAS) need to process an unprecedented amount of data within tight power budgets, making those platforms the ideal candidate architecture. Integrating tens to hundreds of computing elements that run at lower frequencies allows obtaining impressive performance capabilities at a reduced power consumption that meets the size, weight and power (SWaP) budget of automotive systems. Unfortunately, the inherent architectural complexity of many-core platforms makes it almost impossible to derive real-time guarantees using "traditional" state-of-the-art techniques, ultimately preventing their adoption in real industrial settings. Having impressive average performance with no guaranteed bounds on the response times of the critical computing activities is of little, if any, use in safety-critical applications. Project Hercules will address this issue, and provide the required technological infrastructure to exploit the tremendous potential of embedded many-cores for the next generation of automotive systems. This work gives an overview of the integrated Hercules software framework, which allows achieving an order-of-magnitude of predictable performance on top of cutting-edge Commercial-Off-The-Shelf (COTS) components. The proposed software stack will let both real-time and non-real-time applications coexist on next-generation, power-efficient embedded platforms, with preserved timing guarantees. (C) 2017 Elsevier B.V. All rights reserved.

Keywords: Autonomous Driving Assistance Systems; Many-core embedded systems; Predictable Execution Models; Real-time systems; Parallel programming models

High-level programming of massively parallel computers based on shared virtual memory

PARALLEL COMPUTING, 1998, Vol. 24, No. 3-4, pp. 383-400

Author: Gerndt, M
Affiliation: Res Ctr Julich, Cent Inst Appl Math, D-52425 Julich, Germany

Highly parallel machines needed to solve compute-intensive scientific applications are based on the distribution of physical memory across the compute nodes. The drawback of such systems is the necessity to write applications in the message passing programming model. Therefore, a lot of research is going on in higher-level programming models and supportive hardware, operating system techniques, and languages. The research direction outlined in this article is based on shared virtual memory systems, i.e., scalable parallel systems with a global address space which support an adaptive mapping of global addresses to physical memories. We introduce programming concepts and program optimizations for SVM systems in the context of the SVM-Fortran programming environment, which is based on a shared virtual memory system implemented on Intel Paragon. The performance results for real applications proved that this environment enables users to obtain a similar or better performance than by programming in HPF. (C) 1998 Elsevier Science B.V. All rights reserved.

Keywords: distributed memory computers; scientific computing; shared virtual memory; parallel programming models; language constructs for data locality optimization; performance analysis tools

SOFTWARE CHALLENGES FOR EXTREME SCALE COMPUTING: GOING FROM PETASCALE TO EXASCALE SYSTEMS

INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2009, Vol. 23, No. 4, pp. 437-439

Author: Heroux, Michael A.
Affiliation: Sandia Natl Labs, Albuquerque, NM 87185, USA

Preparing applications for a transition from petascale to exascale systems will require a very large investment in several areas of software research and development. The introduction of manycore nodes, the abundance of parallelism, an increase in system faults (including soft errors) and a complicated, multi-component software environment are some of the most challenging issues we face. In this paper we address four topics we believe to be the most challenging issues and therefore the greatest opportunities for making effective next-generation scalable applications. First and foremost is the need to transform existing applications to run on manycore platforms and properly design new applications. This is particularly challenging in the absence of a standard, portable manycore programming environment, but we can make progress in this direction while manycore programming models are developed. Second is promoting advanced modeling and simulation capabilities such as embedded optimization and uncertainty quantification that lead to higher quality results and orders of magnitude more parallelism. Third is progress toward fault resilience in applications, a critical need as system reliability degrades. Fourth and finally is a qualitative improvement in software design, including the social aspects, as exascale software systems will be increasingly multi-team and multi-faceted efforts.

Keywords: exascale computing; parallel programming models; advanced modeling and simulation; fault resilient applications; software engineering for computational science and engineering
