
pocl: A Performance-Portable OpenCL Implementation


Authors: Jaaskelainen, Pekka; Sanchez de La Lama, Carlos; Schnetter, Erik; Raiskila, Kalle; Takala, Jarmo; Berg, Heikki

Author affiliations: Tampere Univ Technol, FIN-33101 Tampere, Finland; Knowledge Dev POF, Madrid, Spain; Perimeter Inst Theoret Phys, Waterloo, ON, Canada; Univ Guelph, Dept Phys, Guelph, ON N1G 2W1, Canada; Louisiana State Univ, Ctr Computat & Technol, Baton Rouge, LA 70803, USA; Nokia Res Ctr, Espoo, Finland

Publication: International Journal of Parallel Programming

Year, volume, issue: 2015, Vol. 43, No. 5

Pages: 752-785


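Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]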

Funding: Academy of Finland; Finnish Funding Agency for Technology and Innovation (Project "Parallel Acceleration") [40115/13]; ARTEMIS; NSF [0905046, 0941653, 1212401]; NSERC [2012-RGPIN-1505]; Division Of Physics, Direct For Mathematical & Physical Scien (Funding Source: National Science Foundation); Office of Advanced Cyberinfrastructure (OAC), Direct For Computer & Info Scie & Enginr (Funding Source: National Science Foundation)

Subjects: OpenCL; LLVM; GPGPU; VLIW; SIMD; Parallel programming; Heterogeneous platforms; Performance portability

Abstract: OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear; multiple vendors can provide support for application descriptions written according to the standard, thus reducing the program porting effort. While the standard brings the obvious benefits of platform portability, the performance portability aspects are largely left to the programmer. The situation is made worse by multiple proprietary vendor implementations with different characteristics and, thus, different required optimization strategies. In this paper, we propose an OpenCL implementation that is both portable and performance portable. At its core is a kernel compiler that can be used to exploit the data parallelism of OpenCL programs on multiple platforms with different parallel hardware styles. The kernel compiler is modularized to perform target-independent parallel region formation separately from the target-specific parallel mapping of the regions, to enable support for various styles of fine-grained parallel resources such as subword SIMD extensions, SIMD datapaths and static multi-issue. Unlike previous similar techniques that work on the source level, the parallel region formation retains the information of the data parallelism using the LLVM IR and its metadata infrastructure. This data can be exploited by the later generic compiler passes for efficient parallelization. The proposed open source implementation of OpenCL is also platform portable, enabling OpenCL on a wide range of architectures, both those already commercialized and those still under research. The paper describes how the portability of the implementation is achieved. We test the two aspects of portability by utilizing the kernel compiler and the OpenCL implementation to run OpenCL applications on various platforms with different styles of parallel resources.
The results show that most of the benchmarked applications when compiled using pocl were faster or close
