检索结果-内蒙古大学图书馆

您好，读者！请登录

内蒙古大学图书馆

首页
概况
党建
资源
服务
科研支持
- 论文收录引用证明
- 科技查新
知识产权
档案馆
帮助

咨询与建议

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

您的常用邮箱：*

您的手机号码：*

问题描述：

当前已输入0个字，您还可以输入200个字

全部搜索
期刊论文
图书
学位论文
标准
纸本馆藏
外文资源发现
数据库导航
超星发现

高级检索

时间限定

出版年份：

文献类型

图书期刊文献学位论文多媒体

馆藏选择

电子馆藏纸本馆藏

核心期刊

全部期刊 SCI 收录期刊 SSCI 收录期刊 EI 收录期刊 CSCD 收录期刊 CSSCI 收录期刊

语言

中文英文

文献类型

期刊文献图书学位论文标准纸本馆藏

帮助

文字说明：

T=题名（书名、题名），A=作者（责任者），K=主题词，P=出版物名称，PU=出版社名称，O=机构（作者单位、学位授予单位、专利申请人），L=中图分类号，C=学科分类号，U=全部字段，Y=年（出版发行年、学位年度、标准发布年）

检索规则说明：

AND代表“并且”；OR代表“或者”；NOT代表“不包含”；(注意必须大写,运算符两边需空一格)

检索范例：

范例一：(K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
范例二：P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016

分类表

所选分类

>> <<

限定检索结果

文献类型

6 篇 会议
1 册 图书

馆藏范围

7 篇 电子文献
0 种 纸本馆藏

日期分布

学科分类号

4 篇 工学
- 4 篇 计算机科学与技术...
- 3 篇 软件工程

主题

3 篇 performance port...
2 篇 openacc
2 篇 v100
2 篇 gpu
1 篇 emerging archite...
1 篇 mi50
1 篇 sparse solvers
1 篇 thunderx2
1 篇 cfd
1 篇 collaborative co...
1 篇 gpgpu
1 篇 control structur...
1 篇 directive-based ...
1 篇 openmp 4.5
1 篇 performance opti...
1 篇 heterogeneous sy...
1 篇 a100
1 篇 cuda
1 篇 directive based ...
1 篇 xeon phi

机构

1 篇 lawrence berkele...
1 篇 barcelona superc...
1 篇 csic artificial ...
1 篇 rwth aachen univ...
1 篇 lawrence berkele...
1 篇 michigan state u...
1 篇 university of de...
1 篇 oak ridge natl l...
1 篇 upc dept comp ar...
1 篇 old dominion uni...
1 篇 nasa langley res...
1 篇 helmholtz-zentru...
1 篇 uiuc coordinated...
1 篇 lawrence berkele...
1 篇 univ delaware ne...

作者

2 篇 wright nicholas ...
1 篇 daley christophe...
1 篇 sunita chandrase...
1 篇 daley christophe...
1 篇 pophale swaroop
1 篇 sridutt bhalacha...
1 篇 badia rosa m.
1 篇 hwu wen-mei
1 篇 huber thomas
1 篇 guido juckeland
1 篇 rabbi fazlay
1 篇 davis joshua hok...
1 篇 aktulga hasan me...
1 篇 sandra wienke
1 篇 lordan francesc
1 篇 zubair mohammad
1 篇 nielsen eric j.
1 篇 chandrasekaran s...
1 篇 walden aaron c.

语言

7 篇 英文

检索条件"任意字段=7th International Workshop on Accelerator Programming using Directives, WACCPD 2020"

共 7 条记录，以下是1-10 订阅

全选清除本页清除全部题录导出标记到"检索档案"

详细简洁

排序：

7th international workshop on accelerator programming using directives, waccpd 2020

7th International Workshop on Accelerator Programming using ...

引用

7th international workshop on accelerator programming using directives, waccpd 2020

ISBN: (纸本)9783030742232

the proceedings contain 5 papers. the special focus in this conference is on accelerator programming using directives. the topics include: GPU Acceleration of the FINE/FR CFD Solver in a Heterogeneous Environment with OpenACC directives;performance and Portability of a Linear Solver Across Emerging Architectures;ADELUS: A Performance-Portable Dense LU Solver for Distributed-Memory Hardware-Accelerated Systems.

关键词：

来源：评论

学校读者我要写书评

暂无评论

6th international workshop on accelerator programming using directives, waccpd 2019

6th International Workshop on Accelerator Programming Using ...

引用

6th international workshop on accelerator programming using directives, waccpd 2019

ISBN: (纸本)9783030499426

the proceedings contain 7 papers. the special focus in this conference is on accelerator programming using directives. the topics include: Accelerating the Performance of Modal Aerosol Module of E3SM using OpenACC;Evaluation of Directive-Based GPU programming Models on a Block Eigensolver with Consideration of Large Sparse Matrices;Performance of the RI-MP2 Fortran Kernel of GAMESS on GPUs via Directive-Based Offloading with Math Libraries;performance Portable Implementation of a Kinetic Plasma Simulation Mini-App;A Portable SIMD Primitive using Kokkos for Heterogeneous Architectures.

关键词：

来源：评论

学校读者我要写书评

暂无评论

Performance Assessment of OpenMP Compilers Targeting NVIDIA V100 GPUs 7th

Performance Assessment of OpenMP Compilers Targeting NVIDIA ...

引用

7th international workshop on accelerator programming using directives (waccpd)

作者： Davis, Joshua Hoke Daley, Christopher Pophale, Swaroop Huber, thomas Chandrasekaran, Sunita Wright, Nicholas J. Univ Delaware Newark DE 19716 USA Lawrence Berkeley Natl Lab Natl Energy Res Sci Comp Ctr Berkeley CA 94720 USA Oak Ridge Natl Lab Oak Ridge TN 37830 USA

ISBN: (纸本)9783030742232;9783030742249

Heterogeneous systems are becoming increasingly prevalent. In order to exploit the rich compute resources of such systems, robust programming models are needed for application developers to seamlessly migrate legacy code from today's systems to tomorrow's. Over the past decade and more, directives have been established as one of the promising paths to tackle programmatic challenges on emerging systems. this work focuses on applying and demonstrating OpenMP offloading directives on five proxy applications. We observe that the performance varies widely from one compiler to the other;a crucial aspect of our work is reporting best practices to application developers who use OpenMP offloading compilers. While some issues can be worked around by the developer, there are other issues that must be reported to the compiler vendors. By restructuring OpenMP offloading directives, we gain an 18x speedup for the su3 proxy application on NERSC's Cori system when using the Clang compiler, and a 15.7x speedup by switching max reductions to add reductions in the laplace mini-app when using the Cray-llvm compiler on Cori.

关键词： Directive-based programming Performance portability Heterogeneous systems OpenMP GPU NVIDIA V100

来源：评论

学校读者我要写书评

暂无评论

Performance and Portability of a Linear Solver Across Emerging Architectures 7th

Performance and Portability of a Linear Solver Across Emergi...

引用

7th international workshop on accelerator programming using directives (waccpd)

作者： Walden, Aaron C. Zubair, Mohammad Nielsen, Eric J. NASA Langley Res Ctr Hampton VA 23665 USA Old Dominion Univ Norfolk VA USA

ISBN: (纸本)9783030742232;9783030742249

A linear solver algorithm used by a large-scale unstructured-grid computational fluid dynamics application is examined for a broad range of familiar and emerging architectures. Efficient implementation of a linear solver is challenging on recent CPUs offering vector architectures. Vector loads and stores are essential to effectively utilize available memory bandwidth on CPUs, and maintaining performance across different CPUs can be difficult in the face of varying vector lengths offered by each. A similar challenge occurs on GPU architectures, where it is essential to have coalesced memory accesses to utilize memory bandwidth effectively. In this work, we demonstrate that restructuring a computation, and possibly data layout, with regard to architecture is essential to achieve optimal performance by establishing a performance benchmark for each target architecture in a low level language such as vector intrinsics or CUDA. In doing so, we demonstrate how a linear solver kernel can be mapped to Intel((R)) Xeon (TM) and Xeon Phi (TM), Marvell((R)) thunderX2((R)), NEC (R) SX-Aurora (TM) TSUBASA Vector Engine, and NVIDIA((R)) and AMD((R)) GPUs. We further demonstrate that the required code restructuring can be achieved in higher level programming environments such as OpenACC, OCCA, and Intel((R)) OneAPI (TM)/SYCL, and that each generally results in optimal performance on the target architecture. Relative performance metrics for all implementations are shown, and subjective ratings for ease of implementation and optimization are suggested.

关键词： programming models Performance portability Emerging architecture CFD HPC CUDA OpenACC OCCA AVX-512 intrinsics Neon intrinsics Arm GPU V100 A100 MI50 Xeon Phi SX-Aurora thunderX2

来源：评论

学校读者我要写书评

暂无评论

Evaluation of Directive-Based GPU programming Models on a Block Eigensolver with Consideration of Large Sparse Matrices 6th

Evaluation of Directive-Based GPU Programming Models on a Bl...

引用

6th international workshop on accelerator programming using directives (waccpd)

作者： Rabbi, Fazlay Daley, Christopher S. Aktulga, Hasan Metin Wright, Nicholas J. Michigan State Univ E Lansing MI 48823 USA Lawrence Berkeley Natl Lab Berkeley CA 94720 USA

ISBN: (纸本)9783030499426;9783030499433

Achieving high performance and performance portability for large-scale scientific applications is a major challenge on heterogeneous computing systems such as many-core CPUs and accelerators like GPUs. In this work, we implement a widely used block eigensolver, Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG), using two popular directive based programming models (OpenMP and OpenACC) for GPU-accelerated systems. Our work differs from existing work in that it adopts a holistic approach that optimizes the full solver performance rather than narrowing the problem into small kernels (e.g., SpMM, SpMV). Our LOPBCG GPU implementation achieves a 2.8x-4.3x speedup over an optimized CPU implementation when tested with four different input matrices. the evaluated configuration compared one Skylake CPU to one Skylake CPU and one NVIDIA V100 GPU. Our OpenMP and OpenACC LOBPCG GPU implementations gave nearly identical performance. We also consider how to create an efficient LOBPCG solver that can solve problems larger than GPU memory capacity. To this end, we create microbenchmarks representing the two dominant kernels (inner product and SpMM kernel) in LOBPCG and then evaluate performance when using two different programming approaches: tiling the kernels, and using Unified Memory with the original kernels. Our tiled SpMM implementation achieves a 2.9x and 48.2x speedup over the Unified Memory implementation on supercomputers with PCIe Gen3 and NVLink 2.0 CPU to GPU interconnects, respectively.

关键词： Sparse solvers Performance optimization Performance portability Directive based programming models OpenMP 4.5 OpenACC

来源：评论

学校读者我要写书评

暂无评论

Enabling GPU Support for the COMPSs-Mobile Framework 4th

Enabling GPU Support for the COMPSs-Mobile Framework

引用

4th international workshop on accelerator programming using directives (waccpd)

作者： Lordan, Francesc Badia, Rosa M. Hwu, Wen-Mei Barcelona Supercomp Ctr BSC CNS Dept Comp Sci Barcelona Spain UPC Dept Comp Architecture Barcelona Spain CSIC Artificial Intelligence Res Inst Barcelona Spain UIUC Coordinated Sci Lab Urbana IL USA

ISBN: (纸本)9783319748962;9783319748955

using the GPUs embedded in mobile devices allows for increasing the performance of the applications running on them while reducing the energy consumption of their execution. this article presents a task-based solution for adaptative, collaborative heterogeneous computing on mobile cloud environments. To implement our proposal, we extend the COMPSs-Mobile framework - an implementation of the COMPSs programming model for building mobile applications that offload part of the computation to the Cloud - to support offloading computation to GPUs through OpenCL. To evaluate our solution, we subject the prototype to three benchmark applications representing different application patterns.

关键词： programming model Heterogeneous computing Collaborative computing GPGPU OpenCL Mobile cloud computing Android

来源：评论

学校读者我要写书评

暂无评论

accelerator programming using directives 1

引用

丛书名： Lecture Notes in Computer Science

1000年

作者： Sridutt Bhalachandra Sandra Wienke Sunita Chandrasekaran Guido Juckeland

this book constitutes the proceedings of the 7th international workshop on accelerator programming using directives, waccpd 2020, which took place on November 20, 2021. the workshop was initially planned to take ... 详细信息

ISBN: (数字)9783030742249

ISBN: (纸本)9783030742232

关键词： programming Languages, Compilers, Interpreters Computer Systems Organization and Communication Networks Control Structures and Microprogramming programming Techniques Numeric Computing

来源：评论

学校读者我要写书评

暂无评论

没有更多数据了...

全选清除本页清除全部题录导出标记到“检索档案”

共1页 << < 1 > >>

检索报告对象比较合并检索0

隐藏清空

合并搜索

回到顶部

执行限定条件

内容：

评分：

请选择保存的检索档案：

请选择收藏分类：

订阅名称：

通借通还

温馨提示：

图书名称：

借书校区：

取书校区：

手机号码：

邮箱地址：

一卡通帐号：

电话和邮箱必须正确填写，我们会与您联系确认。

联系人：

所在院系：

联系邮箱：

联系电话：

内蒙古自治区呼和浩特市赛罕区大学西街235号邮编: 010021

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：