Search Results - Inner Mongolia University Library


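Help

Field codes:

T = title (book title, article title); A = author (responsible party); K = subject term; P = publication name; PU = publisher name; O = organization (author affiliation, degree-granting institution, patent applicant); L = Chinese Library Classification number; C = subject classification number; U = all fields; Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be upper case, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016 (subject term "library science" or "information science", author Fan Bingsi, years 1982 to 2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016 (publication "Computer Applications and Software", any field containing C++ or Basic, subject term not Visual, years 2011 to 2016)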
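
The field codes and operator rules above can be assembled programmatically. What follows is a minimal sketch in Python, not part of the library's system: the helper names `term`, `year_range`, and `combine` are hypothetical, and the code only builds a query string in the documented syntax without submitting it anywhere. How the search engine orders AND, OR, and NOT is not stated in the help text, so the sketch simply reproduces the left-to-right shape of the second example.

```python
# Hypothetical helpers for composing advanced-search expressions.
# Only the field codes and the AND/OR/NOT rules come from the help text;
# the function names and the validation behaviour are assumptions.

FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term, e.g. K=openacc."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def year_range(start: int, end: int) -> str:
    """Return a year-range term in the form used by the examples (Y=2011-2016)."""
    if start > end:
        raise ValueError("start year must not exceed end year")
    return term("Y", f"{start}-{end}")


def combine(operator: str, *parts: str, group: bool = False) -> str:
    """Join parts with an upper-case operator that has one space on each side."""
    op = operator.upper()
    if op not in OPERATORS:
        raise ValueError(f"unknown operator: {operator!r}")
    joined = f" {op} ".join(parts)
    return f"({joined})" if group else joined


# Same structure as the second example, with illustrative search terms.
either_model = combine("OR", term("U", "OpenACC"), term("U", "OpenMP"), group=True)
in_workshop = combine("AND", term("P", "WACCPD"), either_model)
without_cuda = combine("NOT", in_workshop, term("K", "CUDA"))
query = combine("AND", without_cuda, year_range(2016, 2022))

print(query)
```

The search terms used here (`WACCPD`, `OpenACC`, `OpenMP`, `CUDA`) are illustrative only; any value accepted by the catalogue could be substituted.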



Search query: "Any field = 3rd Workshop on Accelerator Programming using Directives, WACCPD 2016"

8 records in total


Proceedings of WACCPD 2016: 3rd Workshop on Accelerator Programming using Directives - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis


3rd Workshop on Accelerator Programming using Directives, WACCPD 2016

ISBN: (print) 9781509061525

The proceedings contain 8 papers. The topics discussed include: acceleration of element-by-element kernel in unstructured implicit low-order finite-element earthquake simulation using OpenACC on Pascal GPUs; towards achieving performance portability using directives for accelerators; a modern memory management system for OpenMP; an extension of OpenACC directives for out-of-core stencil computation with temporal blocking; OpenACC cache directive: opportunities and optimizations; identifying and scheduling loop chains using directives; exploring compiler optimization opportunities for the OpenMP 4.x accelerator model on a POWER8+GPU platform; and a portable, high-level graph analytics framework targeting distributed, heterogeneous systems.


Can Fortran's 'do concurrent' Replace Directives for Accelerated Computing?


8th International Workshop on Accelerator Programming using Directives (WACCPD)

Authors: Stulajter, Miko M.; Caplan, Ronald M.; Linker, Jon A.
Affiliation: Predict Sci Inc 9990 Mesa Rim Rd Suite 170 San Diego CA 92121 USA

ISBN: (print) 9783030977597; 9783030977580

Recently, there has been growing interest in using standard language constructs (e.g. C++'s Parallel Algorithms and Fortran's do concurrent) for accelerated computing as an alternative to directive-based APIs (e.g. OpenMP and OpenACC). These constructs have the potential to be more portable, and some compilers already (or have plans to) support such standards. Here, we look at the current capabilities, portability, and performance of replacing directives with Fortran's do concurrent using a mini-app that currently implements OpenACC for GPU-acceleration and OpenMP for multi-core CPU parallelism. We replace as many directives as possible with do concurrent, testing various configurations and compiler options within three major compilers: GNU's gfortran, NVIDIA's nvfortran, and Intel's ifort. We find that with the right compiler versions and flags, many directives can be replaced without loss of performance or portability, and, in the case of nvfortran, they can all be replaced. We discuss limitations that may apply to more complicated codes and future language additions that may mitigate them. The software and Singularity/Apptainer containers are publicly provided to allow the results to be reproduced.

Keywords: accelerated computing; OpenMP; OpenACC; do concurrent; standard language parallelism


Heterogeneous Programming and Optimization of Gyrokinetic Toroidal Code using Directives


5th International Workshop on Accelerator Programming using Directives (WACCPD)

Authors: Zhang, Wenlu; Joubert, Wayne; Wang, Peng; Wang, Bei; Tang, William; Niemerg, Matthew; Shi, Lei; Taimourzadeh, Sam; Bao, Jian; Lin, Zhihong
Affiliations: Univ Calif Irvine Dept Phys & Astron Irvine CA 92697 USA; Chinese Acad Sci Inst Phys Beijing Peoples R China; Oak Ridge Natl Lab Oak Ridge TN USA; NVidia Santa Clara CA USA; Princeton Univ Princeton NJ 08544 USA; IBM Corp New York NY USA

ISBN: (print) 9783030122744; 9783030122737

The latest production version of the fusion particle simulation code, Gyrokinetic Toroidal Code (GTC), has been ported to and optimized for the next generation exascale GPU supercomputing platform. Heterogeneous programming using directives has been utilized to balance the continuously implemented physical capabilities and rapidly evolving software/hardware systems. The original code has been refactored to a set of unified functions/calls to enable the acceleration for all the species of particles. Extensive GPU optimization has been performed on GTC to boost the performance of the particle push and shift operations. In order to identify the hotspots, the code was first benchmarked on up to 8000 nodes of the Titan supercomputer, which showed about 2-3 times overall speedup comparing NVidia M2050 GPUs to Intel Xeon X5670 CPUs. This Phase I optimization was followed by further optimizations in Phase II, where single-node tests show an overall speedup of about 34 times on SummitDev and 7.9 times on Titan. The real physics tests on the Summit machine showed impressive scaling properties, reaching roughly 50% efficiency on 928 nodes of Summit. The GPU + CPU speedup over purely CPU execution is over 20 times, leading to an unprecedented speed.

Keywords: Massively parallel computing; Heterogeneous programming; Directives; GPU; OpenACC; Fusion plasma; Particle in cell


Enabling GPU Support for the COMPSs-Mobile Framework


4th International Workshop on Accelerator Programming using Directives (WACCPD)

Authors: Lordan, Francesc; Badia, Rosa M.; Hwu, Wen-Mei
Affiliations: Barcelona Supercomp Ctr BSC CNS Dept Comp Sci Barcelona Spain; UPC Dept Comp Architecture Barcelona Spain; CSIC Artificial Intelligence Res Inst Barcelona Spain; UIUC Coordinated Sci Lab Urbana IL USA

ISBN: (print) 9783319748962; 9783319748955

Using the GPUs embedded in mobile devices allows for increasing the performance of the applications running on them while reducing the energy consumption of their execution. This article presents a task-based solution for adaptive, collaborative heterogeneous computing on mobile cloud environments. To implement our proposal, we extend the COMPSs-Mobile framework - an implementation of the COMPSs programming model for building mobile applications that offload part of the computation to the Cloud - to support offloading computation to GPUs through OpenCL. To evaluate our solution, we subject the prototype to three benchmark applications representing different application patterns.

Keywords: Programming model; Heterogeneous computing; Collaborative computing; GPGPU; OpenCL; Mobile cloud computing; Android


A Modern Memory Management System for OpenMP


3rd Workshop on Accelerator Programming using Directives (WACCPD)

Authors: Sewall, J. D.; Pennycook, S. J.; Duran, A.; Tian, X.; Narayanaswamy, R.
Affiliation: Intel Corp 3600 Mission Coll Blvd Santa Clara CA 95050 USA

ISBN: (print) 9781509061525

Modern computers with multi-/many-core processors and accelerators feature a sophisticated and deep memory hierarchy, potentially including distinct main memory, high-bandwidth memory, texture memory and scratchpad memory. The performance characteristics of these memories are varied, and studies have demonstrated the importance of using them effectively. In this paper, we propose an extension of the OpenMP API to address the needs of programmers to efficiently optimize their applications to use new memory technologies in a platform-agnostic and portable fashion. Our proposal separately exposes the characteristics of memory resources (such as kind) and the characteristics of allocations (such as alignment), and is fully compatible with existing OpenMP constructs.

Keywords: Application programming interfaces (API)


Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator Model on a POWER8+GPU Platform


3rd Workshop on Accelerator Programming using Directives (WACCPD)

Authors: Hayashi, Akihiro; Shirako, Jun; Tiotto, Ettore; Ho, Robert; Sarkar, Vivek
Affiliations: Rice Univ Houston TX 77251 USA; IBM Canada Markham ON Canada

ISBN: (print) 9781509061525

While GPUs are increasingly popular for high-performance computing, optimizing the performance of GPU programs is a time-consuming and non-trivial process in general. This complexity stems from the low abstraction level of standard GPU programming models such as CUDA and OpenCL: programmers are required to orchestrate low-level operations in order to exploit the full capability of GPUs. In terms of software productivity and portability, a more attractive approach would be to facilitate GPU programming by providing high-level abstractions for expressing parallel algorithms. OpenMP is a directive-based shared memory parallel programming model and has been widely used for many years. From OpenMP 4.0 onwards, GPU platforms are supported by extending OpenMP's high-level parallel abstractions with accelerator programming. This extension allows programmers to write GPU programs in standard C/C++ or Fortran languages, without exposing too many details of GPU architectures. However, such high-level parallel programming strategies generally impose additional program optimizations on compilers, which could result in lower performance than fully hand-tuned code with low-level programming models. To study potential performance improvements by compiling and optimizing high-level GPU programs, in this paper, we 1) evaluate a set of OpenMP 4.x benchmarks on an IBM POWER8 and NVIDIA Tesla GPU platform and 2) conduct a comparable performance analysis among hand-written CUDA and automatically-generated GPU programs by the IBM XL and clang/LLVM compilers.

Keywords: Parallel programming


Towards Achieving Performance Portability Using Directives for Accelerators


3rd Workshop on Accelerator Programming using Directives (WACCPD)

Authors: Lopez, M. Graham; Larrea, Veronica Vergara; Joubert, Wayne; Hernandez, Oscar; Haidar, Azzam; Tomov, Stanimire; Dongarra, Jack
Affiliations: Oak Ridge Natl Lab Comp Sci & Math Div Oak Ridge TN 37830 USA; Oak Ridge Natl Lab Natl Ctr Computat Sci Oak Ridge TN 37830 USA; Univ Tennessee Innovat Comp Lab Knoxville TN USA

ISBN: (print) 9781509061525

In this paper we explore the performance portability of directives provided by OpenMP 4 and OpenACC to program various types of node architectures with attached accelerators, both self-hosted multicore and offload multicore/GPU. Our goal is to examine how successful OpenACC and the newer offload features of OpenMP 4.5 are for moving codes between architectures, how much tuning might be required and what lessons we can learn from this experience. To do this, we use examples of algorithms with varying computational intensities for our evaluation, as both compute and data access efficiency are important considerations for overall application performance. We implement these kernels using various methods provided by newer OpenACC and OpenMP implementations, and we evaluate their performance on various platforms including X86_64 with attached NVIDIA GPUs, self-hosted Intel Xeon Phi KNL, and an X86_64 host system with Intel Xeon Phi coprocessors. In this paper, we explain what factors affected the performance portability, such as how to pick the right programming model, its programming style, its availability on different platforms, and how well compilers can optimize and target multiple platforms.

Keywords: Application programming interfaces (API)


2016: Third Workshop on Accelerator Programming using Directives (WACCPD)


Proceedings of WACCPD 2016: 3rd Workshop on Accelerator Programming using Directives - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis, 2016, pp. iv-v


Address: No. 235 Daxue West Street (大学西街), Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
