Reviewing GPU architectures to build efficient back projection for parallel geometries

Authors: Chilingaryan, Suren; Ametova, Evelina; Kopmann, Andreas; Mirone, Alessandro

Affiliations: Karlsruhe Inst Technol, Karlsruhe, Germany; Katholieke Univ Leuven, Leuven, Belgium; Univ Manchester, Manchester, Lancs, England; ESRF, Data Anal Unit, Grenoble, France; Karlsruhe Inst Technol, Inst Data Proc & Elect, Data Proc Grp, Karlsruhe, Germany

Publication: Journal of Real-Time Image Processing

Year, volume, issue: 2020, Vol. 17, No. 5

Pages: 1331-1373

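Subject classification: 0808 [Engineering: Electrical Engineering]; 1002 [Medicine: Clinical Medicine]; 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (degrees in engineering or science may be conferred)]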

Funding: FWO-SBO (S004217N); German-Russian BMBF (05K10CKB, 05K10VKE); Engineering and Physical Sciences Research Council, EPSRC (EP/P02226X/1)

Keywords: Parallel algorithms; Hardware architecture; GPU computing; Synchrotron tomography; Back-projection; CUDA; OpenCL

Abstract: Back-Projection is the major algorithm in Computed Tomography to reconstruct images from a set of recorded projections. It is used for both fast analytical methods and high-quality iterative techniques. X-ray imaging facilities rely on Back-Projection to reconstruct internal structures in material samples and living organisms with high spatial and temporal resolution. Fast image reconstruction is also essential to track and control processes under study in real time. In this article, we present efficient implementations of the Back-Projection algorithm for parallel hardware. We survey a range of parallel architectures presented by the major hardware vendors during the last 10 years. Similarities and differences between these architectures are analyzed, and we highlight how specific features can be used to enhance the reconstruction performance. In particular, we build a performance model to find hardware hotspots and propose several optimizations to balance the load between the texture engine, computational and special function units, as well as different types of memory, maximizing the utilization of all GPU subsystems in parallel. We further show that targeting architecture-specific features allows one to boost the performance 2-7 times compared to the current state-of-the-art algorithms used in standard reconstruction codes. The suggested load-balancing approach is not limited to Back-Projection but can be used as a general optimization strategy for implementing parallel algorithms.
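
For readers unfamiliar with the operation the abstract refers to, the following is a minimal sketch of unfiltered, pixel-driven back-projection for parallel-beam geometry. It is not the authors' implementation and contains none of the GPU optimizations described in the article. The function name `back_project`, its parameters, and the choice of a centered rotation axis are assumptions made for illustration. The linear interpolation between neighboring detector bins is the step that GPU codes commonly delegate to the texture engine.

```python
import math


def back_project(sinogram, angles, size):
    """Unfiltered pixel-driven back-projection (illustrative sketch).

    sinogram: list of projections, each a list of at least 2 detector bins
    angles:   projection angles in radians, one per projection
    size:     width and height of the square output image
    Assumes the rotation axis passes through the image and detector centers.
    """
    n_bins = len(sinogram[0])
    center_img = (size - 1) / 2.0
    center_det = (n_bins - 1) / 2.0
    image = [[0.0] * size for _ in range(size)]

    for proj, theta in zip(sinogram, angles):
        c, s = math.cos(theta), math.sin(theta)
        for y in range(size):
            for x in range(size):
                # Detector coordinate of the ray passing through pixel (x, y).
                t = (x - center_img) * c + (y - center_img) * s + center_det
                if 0.0 <= t <= n_bins - 1:
                    # Linear interpolation between the two nearest bins.
                    i = min(int(t), n_bins - 2)
                    w = t - i
                    image[y][x] += (1.0 - w) * proj[i] + w * proj[i + 1]
    return image


# Two projections of a single bright point, taken at 0 and 90 degrees.
img = back_project([[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [0.0, math.pi / 2], 3)
```

In this toy case both projections contribute to the central pixel, which therefore accumulates the largest value. The triple loop performs one interpolated read per pixel per projection, which gives a sense of why the balance between memory, texture, and arithmetic units matters for performance.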
