版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
作者机构:Univ Malaga Dept Comp Architecture Malaga 29080 Spain Univ Illinois Dept Comp Sci Digital Comp Lab 3318 Urbana IL 61801 USA
出 版 物:《IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS》 (IEEE Trans Parallel Distrib Syst)
年 卷 期:2003年第14卷第6期
页 面:545-562页
核心收录:
学科分类:0808[工学-电气工程] 08[工学] 0812[工学-计算机科学与技术(可授工学、理学学位)]
基 金:Ministry of Education of Spain, (TIC2000-1658) European Commission, EC, (1FD97-2103) European Commission, EC
主 题:parallelizing compiler locality analysis load balancing communication pattern mixed integer nonlinear programming
摘 要:This paper presents a new method that can be applied by a parallelizing compiler to find, without user intervention, the iteration and data decompositions that minimize communication and load imbalance overheads in parallel programs targeted at NUMA architectures. One of the key ingredients in our approach is the representation of locality as a Locality-Communication Graph (LCG) and the formulation of the compiler technique as a Mixed Integer Nonlinear Programming (MINLP) optimization problem on this graph. The objective function and constraints of the optimization problem model communication costs and load imbalance. The solution to this optimization problem is a decomposition that minimizes the parallel execution overhead. This paper summarizes the process of how the compiler extracts the locality information from a nonannotated code and focuses on how this compiler can derive the optimization problem, solve it, and generate the parallel code with the automatically selected iteration and data distributions. In addition, we include a discussion about our model and the solutions-the decompositions-that it provides. The approach presented in the paper is evaluated using several benchmarks. The experimental results demonstrate that the MINLP formulation does not increase compilation time significantly and that our framework generates very efficient iteration/data distributions for a variety of NUMA machines.