
Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs


Authors: Oh, Yunho; Kim, Keunsoo; Yoon, Myung Kuk; Park, Jong Hyun; Park, Yongjun; Annavaram, Murali; Ro, Won Woo

Affiliations: Yonsei Univ, Sch Elect & Elect Engn, Seoul 03722, South Korea; Hanyang Univ, Div Comp Sci & Engn, Seoul 04763, South Korea; Univ Southern Calif, Ming Hsieh Dept Elect Engn, Los Angeles, CA 90007, USA

Publication: IEEE TRANSACTIONS ON COMPUTERS

Year/Volume/Issue: 2019, Vol. 68, No. 4

Pages: 609-616


Subject Classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (Engineering or Science degree)]

Funding: National Research Foundation of Korea (NRF) - Korea government (MSIP) [NRF-2018R1A2A2A05018941]; Technology Innovation Program - Ministry of Trade, Industry & Energy (MOTIE, Korea); Korea Semiconductor Research Consortium (KSRC); Graduate School of YONSEI University Research Scholarship Grants in 2017

Keywords: GPU; cache; warp scheduling; data prefetching; performance

Abstract: This paper proposes a new architecture, called Adaptive PREfetching and Scheduling (APRES), which improves cache efficiency of GPUs. APRES relies on the observation that GPU loads tend to have either high locality or strided access patterns across warps. APRES schedules warps so that as many cache hits as possible are generated before the generation of any cache miss. Without directly predicting future cache hits/misses for each warp, APRES creates a warp group that will execute the same static load shortly and prioritizes the grouped warps. If the first executed warp in the group hits the cache, the grouped warps are likely to access the same cache lines. Otherwise, APRES considers the load to be of a strided type and generates prefetch requests for the grouped warps. In addition, APRES includes a new dynamic L1 prefetch and data cache partitioning scheme to reduce contention between demand-fetched and prefetched lines. In our evaluation, APRES achieves a 27.8 percent performance improvement.
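The abstract describes APRES's core per-load decision: group warps that will soon execute the same static load, run one warp first, then either prioritize the rest of the group (leader hit, high-locality load) or issue stride prefetches for them (leader miss, strided load). The following is a minimal illustrative sketch of that decision logic; all function and field names, and the simple per-warp stride model, are assumptions for exposition, not the paper's implementation.

```python
# Sketch of the APRES grouping decision for one static load instruction.
# A "warp group" is modeled as a list of (warp_id, base_address) pairs;
# the cache is modeled as a set of resident addresses.

def schedule_group(warps, cache, stride):
    """Decide how to handle a group of warps about to execute the same load.

    warps:  list of (warp_id, base_address), leader first
    cache:  set of addresses currently resident in the L1 data cache
    stride: observed per-warp address stride for this static load
    """
    leader_id, leader_addr = warps[0]
    if leader_addr in cache:
        # Leader hit: treat the load as high-locality. The grouped warps
        # likely touch the same cache lines, so just prioritize them.
        return {"action": "prioritize", "warps": [w for w, _ in warps]}
    # Leader miss: treat the load as strided and prefetch the addresses
    # the remaining grouped warps are expected to access.
    prefetches = [leader_addr + stride * i for i in range(1, len(warps))]
    return {"action": "prefetch", "addresses": prefetches}
```

For example, with an empty cache and a 64-byte stride, a three-warp group starting at address 256 yields prefetch requests for 320 and 384; if 256 is already cached, the group is simply prioritized.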
