A Channel Memory based fault tolerance for MPI applications

Authors: Selikhov, A; Germain, C

Affiliations: RAS SB ICMMG, Supercomp Software Dept, Novosibirsk 630090, Russia; Univ Paris 11, LAL, CNRS, F-91898 Orsay, France; LRI, PCRI, F-91405 Orsay, France

Publication: FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE

Year, volume, issue: 2005, Vol. 21, No. 5

Pages: 709-715

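Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]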

Keywords: Channel Memory; message passing interface; fault tolerance; Global Computing; grid

Abstract: Fault tolerant message passing environments protect parallel applications against node failures. Very large scale computing systems, ranging from large clusters to worldwide Global Computing systems, require a high level of fault tolerance in order to efficiently run parallel applications. The Channel Memory approach provides the infrastructure for scalable tolerance to simultaneous faults. Along with a specially designed checkpointing system and recovery protocol, this approach has resulted in the MPICH-V architecture. In this paper, we describe CMDE - a stand-alone distributed program system based on MPICH-V architecture and implementing an approach to tolerate faults of Channel Memories. (c) 2004 Elsevier B.V. All rights reserved.
