Search Results - Inner Mongolia University Library

Predecessor/successor approach for high-performance run-time wavefront scheduling

INFORMATION SCIENCES, 2006, Vol. 176, No. 7, pp. 845-860

Authors: Huang, TC; Hsu, PH. Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 804, Taiwan; Cheng Shiu Inst Technol, Dept Elect Engn, Kaohsiung 833, Taiwan

Most scientific applications rely on parallel multiprocessor computing to enhance performance. However, the irregular loops within these applications obstruct the parallelism analysis at compile-time. Rauchwerger et al. presented a run-time method to extract the hidden parallelism in a program using dependence chains. The relative overhead degrades this approach's performance due to the mass storage requirement and huge array reference processing. In this study, a new predecessor/successor approach is developed in which high-level predecessor/successor information is recorded and processed efficiently. A predecessor/successor table is constructed first in the inspector phase so that only the successor iterations in the current wavefront need to be examined, instead of the entire loop iterations during wavefront scheduling. Usually, the performance of the dependence chain approach degrades dramatically for a hot-spot access pattern, but our scheme works very efficiently in this case. The experimental results using synthetic code and real programs are presented to prove the superiority of the proposed approach. (c) 2005 Elsevier Inc. All rights reserved.

Keywords: parallelizing compiler; dependence chain; loop parallelization; inspector/executor; wavefront scheduling
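
The abstract above describes an inspector that records predecessor/successor information so that the scheduler only has to examine the successors of the current wavefront. The following is a minimal sketch of that general idea, not the authors' implementation: the function names, the `accesses` input format (one pair of read and write index sets per iteration), and the handling of flow, anti, and output dependences are all assumptions made for illustration.

```python
from collections import defaultdict


def build_successor_table(accesses):
    """Inspector phase (illustrative): record successors and predecessor counts.

    accesses[i] is a (reads, writes) pair of sets of array indices touched
    by iteration i. This input format is an assumption of the sketch.
    """
    successors = defaultdict(set)
    pred_count = [0] * len(accesses)
    last_writer = {}                          # array index -> last writing iteration
    readers_since_write = defaultdict(list)   # array index -> readers since that write

    for i, (reads, writes) in enumerate(accesses):
        preds = set()
        for x in reads:
            if x in last_writer:
                preds.add(last_writer[x])          # flow dependence
        for x in writes:
            if x in last_writer:
                preds.add(last_writer[x])          # output dependence
            preds.update(readers_since_write[x])   # anti dependence
        preds.discard(i)

        for p in preds:
            successors[p].add(i)
        pred_count[i] = len(preds)

        for x in reads:
            readers_since_write[x].append(i)
        for x in writes:
            last_writer[x] = i
            readers_since_write[x] = []
    return successors, pred_count


def wavefronts(successors, pred_count):
    """Scheduling phase: only successors of the current wavefront are examined."""
    remaining = list(pred_count)
    current = [i for i, c in enumerate(remaining) if c == 0]
    result = []
    while current:
        result.append(sorted(current))
        nxt = []
        for i in current:
            for s in successors.get(i, ()):
                remaining[s] -= 1
                if remaining[s] == 0:   # all predecessors have been scheduled
                    nxt.append(s)
        current = nxt
    return result


# Iterations 0 and 2 are independent; 1 reads what 0 wrote; 3 reads from 1 and 2.
example = [({0}, {1}), ({1}, {2}), ({0}, {3}), ({2, 3}, {4})]
print(wavefronts(*build_successor_table(example)))
```

In this sketch a hot-spot pattern (every iteration writing the same element) yields a simple chain in the table, so each scheduling step touches one successor rather than rescanning the whole loop.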

A communication scheme for the distributed execution of loop nests with while loops

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 1995, Vol. 23, No. 5, pp. 471-496

Authors: Griebl, M; Lengauer, C. Fakultät für Mathematik und Informatik, Universität Passau, Passau, Germany

The mathematical model for the parallelization, or "space-time mapping," of loop nests is the polyhedron model. The presence of while loops in the nest complicates matters for two reasons: (1) the parallelized loop nest does not correspond to a polyhedron but instead to a subset that resembles a (multidimensional) comb and (2) it is not clear when the entire loop nest has terminated. We describe a communication scheme which can deal with both problems and which can be added to the parallel target loop nest by a compiler.

Keywords: loop parallelization; parallelizing compilation; space-time mapping; while loop

ODCHP: a new effective mechanism to maximize parallelism of nested loops with non-uniform dependences

JOURNAL OF SYSTEMS AND SOFTWARE, 2001, Vol. 56, No. 3, pp. 279-297

Authors: Pean, DL; Chen, C. Natl Chiao Tung Univ, Dept Comp Sci & Informat Engn, Hsinchu 30050, Taiwan

There are many methods for nested loop partitioning. However, most of them perform poorly when partitioning loops with non-uniform dependences. This paper proposes a generalized and optimized loop partitioning mechanism to exploit parallelism from nested loops with non-uniform dependences. Our approach, based on dependence convex theory, divides the loop into variable-size partitions. Furthermore, the proposed algorithm partitions a nested loop by using the copy-renaming and the optimized partitioning techniques to minimize the number of parallel regions of the iteration space. Consequently, it outperforms the previous partitioning mechanisms of nested loops with non-uniform dependences. Many optimization techniques are used to reduce the complexity of the algorithm. Compared with other popular techniques, our scheme shows a dramatic improvement in the preliminary performance results. (C) 2001 Elsevier Science Inc. All rights reserved.

Keywords: compilers; non-uniform dependence; loop parallelization; parallel compiler; parallel processing; dependence convex hull

Termination detection in parallel loop nests with while loops

PARALLEL COMPUTING, 1999, Vol. 25, No. 12, pp. 1489-1510

Authors: Geigl, M; Griebl, M; Lengauer, C. Univ Passau, Fak Math & Informat, D-94030 Passau, Germany

One central problem in the execution of parallel nested loops with non-affine bounds is the precise scanning (i.e., enumeration) of the points in their iteration space and the detection of their termination. Scanning schemes have been proposed for both shared-memory and distributed-memory implementations. However, these schemes work only for perfectly nested while loops. We propose a scheme which also works for not perfectly nested while loops on shared memory. This scheme has been incorporated in our loop parallelizer LooPo. (C) 1999 Elsevier Science B.V. All rights reserved.

Keywords: loop parallelization; for loops; while loops; termination detection; non-speculative execution; code generation

Loop Parallelism Maximization for Multimedia Data Processing in Mobile Vehicular Clouds

IEEE TRANSACTIONS ON CLOUD COMPUTING, 2019, Vol. 7, No. 1, pp. 250-258

Authors: Qiu, Meikang; Dai, Wenyun; Vasilakos, Athanasios V. Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China; Pace Univ, Seidenberg Sch Comp Sci & Informat Syst, New York, NY 10038, USA; Univ Western Macedonia, Kozani 50100, Greece

Mobile vehicular cloud has become popular with the rapid development of cloud computing and mobile computing. Nested loops are usually the most critical part in multimedia and high performance Digital Signal Processing (DSP) systems, which are widely used in vehicular applications and systems. In order to further explore the parallelism in nested loops, we study how to maximize system performance while taking energy reduction into account for applications on Chip Multiprocessor (CMP) architectures. We propose an algorithm, Energy-Aware Loop Parallelism Maximization (EALPM), to maximize system performance with the consideration of energy reduction for applications with multidimensional nested loops. Our experiment shows that using the EALPM algorithm significantly improves both performance and energy consumption on average in comparison to other algorithms.

Keywords: loop parallelization; mobile vehicular cloud; energy-aware; nested loops; multimedia data processing

A multi-dimensional version of the I test

PARALLEL COMPUTING, 2001, Vol. 27, No. 13, pp. 1783-1799

Authors: Chang, WL; Chu, CP; Wu, JH. Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan; Kung Shan Inst Technol, Dept Informat Management, Tainan 701, Taiwan

Two-dimensional arrays with linear subscripts occur quite frequently in real programs. In general, for multi-dimensional linear arrays under constant bounds the Lambda test is an efficient data dependence method to check whether there exist real solutions. In this paper, we propose a multi-dimensional version of the I test, the multi-dimensional I test, which can be applied to testing whether there are integer solutions for multi-dimensional linear arrays under constant limits. Experiments with benchmarks showing the effects of the multi-dimensional I test on testing precision and testing efficiency are also presented. (C) 2001 Elsevier Science B.V. All rights reserved.

Keywords: parallelizing/vectorizing compilers; data dependence analysis; loop parallelization; loop vectorization

MapReduce inspired loop mapping for coarse-grained reconfigurable architecture

Science China (Information Sciences), 2014, Vol. 57, No. 12, pp. 184-197

Authors: YIN ShouYi; SHAO ShengJia; LIU LeiBo; WEI ShaoJun. Institute of Microelectronics, Tsinghua University

Our work investigates how to map loops efficiently onto Coarse-Grained Reconfigurable Architecture (CGRA). This paper examines the properties of CGRA and builds MapReduce inspired models for the loop parallelization *** proposed model has a more detailed performance metric and a more flexible unrolling scheme that can unroll different loop levels with different factors. A Geometric Programming based approach is proposed to resolve the optimization problem of loop parallelization *** proposed approach can find the optimal unrolling factor for each level loop, resulting in better parallelization of *** results show that the proposed approach achieved up to 44% performance gain compared to the state-of-the-art loop mapping scheme.

Keywords: reconfigurable computing; coarse-grained reconfigurable architecture (CGRA); application mapping; loop parallelization; MapReduce

A precise dependence analysis for multi-dimensional arrays under specific dependence direction

JOURNAL OF SYSTEMS AND SOFTWARE, 2002, Vol. 63, No. 2, pp. 99-112

Authors: Chang, WL; Chu, CP; Wu, JH. So Taiwan Univ Technol, Dept Informat Management, Tainan 701, Taiwan; Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan

In the process of automatically parallelizing/vectorizing constant-bound loops with multi-dimensional arrays under specific dependence direction, the Lambda test is claimed to be an efficient and precise data dependence analysis method that can check whether there exist generally inexact 'real-valued' solutions to the derived dependence equations. In this paper, we propose a precise data dependence analysis method - the multi-dimensional direction vector I test. The multi-dimensional direction vector I test can be applied towards testing whether there exist generally accurate 'integer-valued' solutions to the dependence equations derived from multi-dimensional arrays under specific dependence direction in constant-bound loops. Experiments with benchmarks showed that the accuracy rate and the improvement rate for the proposed method are approximately 33.3% and 21.6%, respectively. (C) 2001 Elsevier Science Inc. All rights reserved.

Keywords: parallelizing/vectorizing compilers; data dependence analysis; loop parallelization; supercomputing

Parallel loop generation and scheduling

JOURNAL OF SUPERCOMPUTING, 2009, Vol. 50, No. 3, pp. 289-306

Authors: Lotfi, Shahriar; Parsa, Saeed. Univ Tabriz, Dept Comp Sci, Tabriz, Iran; Iran Univ Sci & Technol, Fac Comp Engn, Tehran, Iran

Loop tiling is an efficient loop transformation, mainly applied to detect coarse-grained parallelism in loops. It is a difficult task to apply n-dimensional non-rectangular tiles to generate parallel loops. This paper offers an efficient scheme to apply non-rectangular n-dimensional tiles in non-rectangular iteration spaces, to generate parallel loops. In order to exploit wavefront parallelism efficiently, all the tiles with equal sum of coordinates are assumed to reside on the same wavefront. Also, in order to assign parallelepiped tiles on each wavefront to different processors, an improved block scheduling strategy is offered in this paper.

Keywords: loop parallelization; Wave-front; Code generation; loop scheduling
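
The abstract above places all tiles with an equal sum of coordinates on the same wavefront. A minimal sketch of that grouping follows, assuming a rectangular grid of tile coordinates for simplicity (the paper itself targets non-rectangular tiles and iteration spaces). The names `tile_wavefronts` and `block_schedule` are hypothetical, and `block_schedule` is a plain contiguous block split, not the improved strategy the paper proposes.

```python
from collections import defaultdict
from itertools import product


def tile_wavefronts(tiles_per_dim):
    """Group tile coordinates by their coordinate sum.

    tiles_per_dim gives the number of tiles along each dimension; a
    rectangular tile grid is an assumption of this sketch.
    """
    fronts = defaultdict(list)
    for tile in product(*(range(n) for n in tiles_per_dim)):
        fronts[sum(tile)].append(tile)
    # Wavefronts execute in increasing order of coordinate sum.
    return [fronts[k] for k in sorted(fronts)]


def block_schedule(front, num_procs):
    """Split one wavefront into contiguous blocks, one block per processor."""
    chunk = -(-len(front) // num_procs)   # ceiling division
    return [front[p * chunk:(p + 1) * chunk] for p in range(num_procs)]


for step, front in enumerate(tile_wavefronts((2, 3))):
    print(step, block_schedule(front, 2))
```

Tiles within one wavefront are treated as independent here, which holds only when every inter-tile dependence points toward a larger coordinate sum.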

The Accuracy of the Non-continuous I Test for One-Dimensional Arrays with References Created by Induction Variables

JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2014, Vol. 10, No. 4, pp. 523-542

Authors: Zhang, Qing. Dalian Shipping Coll, Econ & Technol Dev Zone, Dalian 116052, Liaoning, Peoples R China

One-dimensional arrays with subscripts formed by induction variables appear quite frequently in real programs. For most well-known data dependence testing methods, checking whether integer-valued solutions exist for one-dimensional arrays with references created by induction variables is very difficult. The I test, which is a refined combination of the GCD and Banerjee tests, is an efficient and precise data dependence testing technique to compute whether integer-valued solutions exist for one-dimensional arrays with constant bounds and single increments. In this paper, the non-continuous I test, which is an extension of the I test, is proposed to determine whether there are integer-valued solutions for one-dimensional arrays with constant bounds and non-singular increments. Experiments with benchmarks taken from Livermore and Vector Loop reveal that there are definitive results for 67 pairs of one-dimensional arrays that were tested.

Keywords: Data Dependence Analysis; Loop Parallelization; Loop Vectorization; Parallelization/Vectorization Compilers
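
The abstract above describes the I test as a refined combination of the GCD and Banerjee tests. The sketch below shows only those two classical building blocks for a linear dependence equation a1*x1 + ... + an*xn = c with constant bounds on each variable; it is not the I test or the non-continuous I test, and the function names are hypothetical.

```python
from functools import reduce
from math import gcd


def gcd_test(coeffs, c):
    """An integer solution (ignoring bounds) exists iff gcd(coeffs) divides c."""
    g = reduce(gcd, (abs(a) for a in coeffs), 0)
    return c == 0 if g == 0 else c % g == 0


def banerjee_test(coeffs, bounds, c):
    """A real solution within the bounds exists iff c lies between the extremes.

    bounds[k] is the (lower, upper) pair for variable k.
    """
    lo = sum(a * (l if a >= 0 else u) for a, (l, u) in zip(coeffs, bounds))
    hi = sum(a * (u if a >= 0 else l) for a, (l, u) in zip(coeffs, bounds))
    return lo <= c <= hi


def may_depend(coeffs, bounds, c):
    """Conservative answer: False proves independence, True means 'maybe'."""
    return gcd_test(coeffs, c) and banerjee_test(coeffs, bounds, c)


# A[2*i] versus A[2*j + 1]: 2*i - 2*j = 1 has no integer solution.
print(may_depend([2, -2], [(1, 10), (1, 10)], 1))
# A[i] versus A[j + 100] with 1 <= i, j <= 10: i - j = 100 is out of range.
print(may_depend([1, -1], [(1, 10), (1, 10)], 100))
```

Either test alone can miss cases: the GCD test ignores the loop bounds and the Banerjee bounds only establish real-valued solutions, which is the gap that integer-precise tests such as the I test are meant to close.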
