Search results - Inner Mongolia University Library

Improving the execution efficiency of barrier synchronization in software DSM through static analysis

International Journal of High Speed Computing, 2000, vol. 11, no. 3, pp. 167-188

Authors: Lee, JB; Jhon, CS. Pusan Natl Univ, Res Inst Comp Informat & Commun, Pusan 609735, South Korea; Seoul Natl Univ, Sch Comp Sci & Engn, Seoul 151742, South Korea

In software distributed shared memory (SDSM) systems, the large coherence granularity imposed by the virtual memory page size tends to induce false sharing, which may lead to heavy network traffic or useless page misses on barrier operations. In this paper, we propose a method to alleviate the coherence overhead of barrier synchronization in SDSM systems. It performs static analysis on a shared-memory program to examine data dependencies between processors across global barriers, and then inserts special primitives into the program to exploit the dependency information at run time. If the data modified before a barrier will be accessed by only some of the other processors after the barrier, the inserted primitives transfer coherence messages only to those processors. Furthermore, if the modified data will not be used by any other processor, the primitives ensure that the coherence messages are delivered only to the master process after the parallel execution of the program completes. We implemented the static analysis with the SUIF parallelizing compiler and evaluated the execution performance of the modified programs on a 16-node SDSM system supporting the AURC protocol. The experimental results show that our method is very effective at reducing useless coherence messages and can also improve execution time substantially by reducing false sharing misses.

Keywords: software distributed shared memory; data coherence; relaxed memory models; Automatic Update Release Consistency; data dependency analysis; run-time support
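The barrier optimization described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `plan_barrier_messages`, the page-number sets, and the dictionary layout are all hypothetical, and the dependency information that the paper obtains through static analysis is simply passed in as input here.

```python
from typing import Dict, Set, Tuple


def plan_barrier_messages(
    writes: Dict[int, Set[int]],
    reads_after: Dict[int, Set[int]],
) -> Tuple[Dict[int, Set[int]], Set[int]]:
    """Decide who receives coherence messages at a barrier.

    writes[p]      -- pages processor p modified before the barrier (assumed input)
    reads_after[p] -- pages processor p will access after the barrier (assumed input)

    Returns (at_barrier, deferred):
      at_barrier[w] -- processors that must be notified of w's writes at the barrier
      deferred      -- writers whose data no other processor uses; their coherence
                       messages can wait and go to the master only at program end
    """
    at_barrier: Dict[int, Set[int]] = {}
    deferred: Set[int] = set()
    for writer, pages in writes.items():
        # A processor is a consumer if it reads at least one page the writer modified.
        consumers = {
            proc
            for proc, needed in reads_after.items()
            if proc != writer and pages & needed
        }
        if consumers:
            at_barrier[writer] = consumers
        else:
            deferred.add(writer)
    return at_barrier, deferred


def message_count(at_barrier: Dict[int, Set[int]]) -> int:
    """Number of point-to-point coherence messages sent at the barrier."""
    return sum(len(targets) for targets in at_barrier.values())
```

For example, with four writers where `writes = {0: {10}, 1: {11}, 2: {12}, 3: {13}}` and `reads_after = {0: {11}, 1: {10, 12}, 2: set(), 3: set()}`, the plan sends three targeted messages at the barrier and defers processor 3's update, whereas notifying every other participant would take twelve.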


OpenMP for networks of SMPs

Journal of Parallel and Distributed Computing, 2000, vol. 60, no. 12, pp. 1512-1530

Authors: Hu, YC; Lu, HH; Cox, AL; Zwaenepoel, W. Rice Univ, Dept Comp Sci, Houston, TX 77005, USA; Rice Univ, Dept Elect & Comp Engn, Houston, TX 77005, USA

In this paper, we present the first system that implements OpenMP on a network of shared-memory multiprocessors. This system enables the programmer to rely on a single, standard, shared-memory API for parallelization within a multiprocessor and between multiprocessors. It is implemented via a translator that converts OpenMP directives to appropriate calls to a modified version of the TreadMarks software distributed shared-memory (SDSM) system. In contrast to previous SDSM systems for SMPs, the modified TreadMarks system uses POSIX threads for parallelism within an SMP node. This approach greatly simplifies the changes required to the SDSM in order to exploit the intranode hardware shared memory. We present performance results for seven applications (Barnes-Hut, CLU, and Water from SPLASH-2, 3D-FFT from NAS, Red-Black SOR, TSP, and MGS) running on an SP2 with four four-processor SMP nodes. A comparison between the thread implementation and the original implementation of TreadMarks shows that using the hardware shared memory within an SMP node significantly reduces the amount of data and the number of messages transmitted between nodes, and consequently achieves speedups that are up to 30% better than the original versions. We also compare SDSM against message passing. Overall, the speedups of multithreaded TreadMarks programs are within 7-30% of the MPI versions. (C) 2000 Academic Press.

Keywords: OpenMP; shared memory programming; POSIX threads; networks of SMPs; software distributed shared memory


Reducing coherence overhead of barrier synchronization in software DSMs 98

Proceedings of the 1998 ACM/IEEE conference on Supercomputing

Authors: Jae Bum Lee; Chu Shik Jhon. Seoul National University, Seoul 151-742, Korea

ISBN: (print) 9780897919845

Software distributed shared memory (SDSM) systems usually have a large coherence granularity that is imposed by the underlying virtual memory page size. To alleviate coherence overheads, such as the network traffic needed to preserve coherence or the page misses caused by false sharing, relaxed memory models are widely accepted for SDSM systems. In the relaxed memory models, when a shared page is modified, invalidation requests to other copies are deferred until a synchronization point and, in addition, the requests are transferred only to the processor acquiring the synchronization variable. On a barrier, however, the invalidation requests must be transferred to all the processors that participate in the barrier. As a result, a barrier tends to induce heavy network traffic and may also lead to useless page misses caused by false sharing. In this paper, we propose a method to alleviate the coherence overheads of barrier synchronization in shared-memory parallel programs. It performs static analysis to examine data dependencies between processors across global barriers, and then inserts special primitives into the program to exploit the dependency information at run time. The static analysis finds code regions where a processor modifies data that will be used by only some of the other processors. At run time, the coherence messages for the data are transferred only to those processors with the help of the inserted primitives. In particular, if the modified data will not be used by any other processor, the primitives ensure that the coherence messages are delivered only to the master processor when the parallel execution of the program is completed. We evaluated the performance of this method in a 16-node software DSM system supporting the AURC protocol. Program-driven simulation was performed with five benchmark programs: Jacobi, Red-black SOR, Expl, LU, and Water-nsquared. For the applications, the experimental results show that our method can reduce the coherence messages by

Keywords: software distributed shared memory; run-time support; automatic-update; relaxed memory models; data dependency analysis; data coherence
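The false sharing that this record attributes to page-sized coherence granularity can be made concrete with a small sketch. It assumes a 4096-byte page and a hypothetical helper, `falsely_shared_pages`, that takes the byte range each processor writes. None of these names or values come from the paper. The ranges are disjoint at the byte level, yet processors still conflict whenever two ranges land on the same page.

```python
from typing import Dict, Set, Tuple

PAGE_SIZE = 4096  # assumed page size in bytes; real systems may differ


def pages_touched(start: int, length: int, page_size: int = PAGE_SIZE) -> Set[int]:
    """Page numbers covered by the byte range [start, start + length), length > 0."""
    first = start // page_size
    last = (start + length - 1) // page_size
    return set(range(first, last + 1))


def falsely_shared_pages(
    ranges: Dict[int, Tuple[int, int]], page_size: int = PAGE_SIZE
) -> Dict[int, Set[int]]:
    """Map each page written by more than one processor to the set of its writers.

    ranges[p] is the (start, length) byte range written by processor p. The ranges
    are assumed not to overlap, so any page reported here is shared only because
    the coherence unit is a whole page, not because the data itself is shared.
    """
    writers: Dict[int, Set[int]] = {}
    for proc, (start, length) in ranges.items():
        for page in pages_touched(start, length, page_size):
            writers.setdefault(page, set()).add(proc)
    return {page: procs for page, procs in writers.items() if len(procs) > 1}
```

With two processors writing adjacent 1024-byte slices, `{0: (0, 1024), 1: (1024, 1024)}`, both slices fall in page 0 and the page is reported as falsely shared. With page-aligned 4096-byte slices, no page is reported.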
