检索结果-内蒙古大学图书馆

Consistent global checkpoints that contain a given set of local checkpoints

IEEE TRANSACTIONS ON COMPUTERS 1997年第4期46卷 456-468页

作者： Wang, YM AT&T Bell Labs. Murray Hill NJ USA Abstract Authors References Cited By Keywords Metrics Similar Download Citation Email Print Request Permissions

In this paper, we consider the problem of constructing consistent global checkpoints that contain a given set of checkpoints. We address three important issues related to this problem. First, we define the maximum and minimum consistent global checkpoints containing a set S, and give algorithms to construct them. These algorithms are based on reachability analysis on a rollback-dependency graph. Second, we introduce a concept called ''rollback-dependency trackability'' that enables this analysis to be performed efficiently for a certain class of checkpoint and communication models. We define the least stringent of these models (''FDAS''), and put it in context with other models defined in the literature. Significant in this is a way to use FDAS to provide efficient rollback recovery for applications that do not satisfy perfect piecewise determinism. Finally, we describe several applications of the theorems and algorithms derived in this paper to demonstrate the capability of our approach to unify, generalize, and extend many previous works.

关键词： algorithms distributed systems consistent global states distributed debugging deadlock recovery fault tolerance checkpointing rollback recovery message logging vector timestamps

来源：评论

学校读者我要写书评

暂无评论

Implementing record and refinement for debugging timing-dependent communication

引用

SCIENCE OF COMPUTER PROGRAMMING 2017年 134卷 4-18页

作者： Felgentreff, T. Perscheid, M. Hirschfeld, R. Univ Potsdam Software Architecture Grp Hasso Plattner Inst Prof Dr Helmert Str 2-3 Potsdam Germany

distributed applications are hard to debug because timing-dependent network communication is a source of non-deterministic behavior. Current approaches to debug non deterministic failures include post-mortem debugging as well as record and replay. However, the first impairs system performance to gather data, whereas the latter requires developers to understand the timing-dependent communication at a lower level of abstraction than they develop at. Furthermore, both approaches require intrusive core library modifications to gather data from live systems. In this paper, we present the Peek-At-Talk debugger for investigating non-deterministic failures with low overhead in a systematic, top-down method, with a particular focus on tool-building issues in the following areas: First, we show how our debugging framework Path Tools guides developers from failures to their root causes and gathers run-time data with low overhead. Second, we present Peek-At-Talk, an extension to our Path Tools framework to record non-deterministic communication and refine behavioral data that connects source code with network events. Finally, we scope changes to the core library to record network communication without impacting other network applications. (C) 2015 Elsevier B.V. All rights reserved.

关键词： distributed debugging Record and replay Dynamic analysis Record and refinement

来源：评论

学校读者我要写书评

暂无评论

Efficient online algorithm for identifying useless states in distributed systems

引用

distributed COMPUTING 2011年第5-6期23卷 359-372页

作者： Chen, Lung-Pin Sun, Der-Johng Chu, William Tunghai Univ Dept Comp Sci & Informat Engn Taichung 40704 Taiwan Natl Chiao Tung Univ Dept Comp Sci & Informat Engn Hsinchu 30050 Taiwan

In a distributed system, detecting whether a given logical predicate is true on the global states is fundamental for testing and debugging the program. Detecting predicates by examining all global states is intractable due to the combinatorial nature of the problem. This work designs an efficient online algorithm that identifies the consistent and useless states each time a new state is reported. This paper formulates the optimality of detecting algorithms in terms of pseudo states, which are employed to represent unknown states to the monitor process. Based on this technique, memory space of the debugger can be minimized by removing the useless states without affecting the debugging results. While minimizing memory space, the proposed algorithm requires only O(p (2) M) time in total, where p is the number of processes, and M is the number of reported states.

关键词： Checkpoint distributed debugging Global predicate detection

来源：评论

学校读者我要写书评

暂无评论

debugging MULTIAGENT SYSTEMS

引用

INFORMATION AND SOFTWARE TECHNOLOGY 1995年第2期37卷 103-112页

作者： VANLIEDEKERKE, MH AVOURIS, NM UNIV PATRAS DEPT ELECT ENGNGR-26500 PATRAIGREECE COMMISS EUROPEAN COMMUNITIES JOINT RES CTRINST ENVIRONMI-21020 ISPRAITALY

Development of multi-agent systems is a complex process during which use of adequate development tools is essential. This paper focuses on debugging of multi-agent systems. A survey of existing multi-agent systems development tools is presented first, Subsequently a new tool is introduced, based on the innovative approach of the developer's conceptual models, These are multiple complementary abstractions of the multi-agent system concentrating on specific aspects of the system. In relation to these abstractions, several system views are defined that allow a graphical representation of the selected aspects of the system state and its dynamic behaviour, It is argued that these views support effectively understanding of system behaviour and thus efficient debugging. Examples of an implementation of this approach in the frame of the Esprit Project ARCHON are also discussed.

关键词： distributed ARTIFICIAL INTELLIGENCE distributed COMPUTING MULTIAGENT SYSTEMS VISUALIZATION distributed debugging DEVELOPMENT ENVIRONMENTS

来源：评论

学校读者我要写书评

暂无评论

Deadlock-free incremental replay of message-passing programs

引用

JOURNAL OF PARALLEL AND distributed COMPUTING 2001年第5期61卷 667-678页

作者： Zambonelli, F Netzer, RHB Univ Modena Dipartimento Sci Ingn I-41100 Modena Italy IP Performance Inc N Scituate RI 02857 USA

To support incremental replay of message-passing applications. processes must periodically checkpoint and the content of some messages must be logged, to break dependencies of the current slate of the execution on past events. This paper shows that known adaptive logging algorithms are likely to introduce deadlocks in replay, and we introduce a new algorithm that: (i) prevents deadlocks in replay and (ii) enables the tuning of its behavior to meet specific user needs. (C) 2001 Academic Press.

关键词： distributed debugging checkpointing incremental replay adaptive logging deadlock

来源：评论

学校读者我要写书评

暂无评论

Predicate control: synchronization in distributed computations with look-ahead

引用

JOURNAL OF PARALLEL AND distributed COMPUTING 2004年第2期64卷 219-237页

作者： Tarafdar, A Garg, VK Akami Technol Cambridge MA 02142 USA Univ Texas Dept Elect & Comp Engn Austin TX 78712 USA

The predicate control problem involves synchronizing a distributed computation to maintain a given global predicate. In contrast with many popular distributed synchronization problems such as mutual exclusion, readers writers, and dining philosophers, predicate control assumes a look-ahead, so that the computation is an off-line rather than an on-line input. Predicate control is targeted towards applications such as rollback recovery, debugging, and optimistic computing, in which such computation look-ahead is natural. We define predicate control formally and show that, in its full generality, the problem is NP-complete. We find efficient solutions for some important classes of predicates including "disjunctive predicates", "mutual exclusion predicates", and "readers writers predicates". For each class of predicates, we determine the necessary and sufficient conditions for solving predicate control and describe an efficient algorithm for determining a synchronization strategy. In the case of "independent mutual exclusion predicates", we determine that predicate control is NP-complete and describe an efficient algorithm that finds a solution under certain constraints. (C) 2003 Elsevier Inc. All rights reserved.

关键词： predicate control synchronization distributed computation predicate detection mutual exclusion rollback recovery distributed debugging

来源：评论

学校读者我要写书评

暂无评论

Intractability results in predicate detection

引用

INFORMATION PROCESSING LETTERS 2005年第6期94卷 277-282页

作者： Kashyap, S Garg, VK Univ Texas Dept Elect & Comp Engn Austin TX 78712 USA

It has been shown that global predicate detection in a distributed computation is an NP-complete problem in general. However, efficient predicate detection algorithms exist for some subclasses of predicates, such as stable predicates, observer-independent predicates, conjunctions of local predicates, channel predicates, etc. We show here that the problem of deciding whether a given predicate is a member of any of these tractable subclasses is NP-hard in general. We also explore the tractability of linear and regular predicates. In particular, we show that, unless RP = NP, there is no polynomial-time algorithm to detect possibly: B for linear and regular predicates B. (c) 2005 Elsevier B.V. All rights reserved.

关键词： distributed computing computational complexity predicate detection distributed debugging

来源：评论

学校读者我要写书评

暂无评论

SOME OPTIMAL-ALGORITHMS FOR DECOMPOSED PARTIALLY ORDERED SETS

引用

INFORMATION PROCESSING LETTERS 1992年第1期44卷 39-43页

作者： GARG, VK Department of Electrical and Computer Engineering University of Texas Austin TX 78712-1084 USA

We describe two problems and their optimal solutions for partially ordered sets. We first describe an optimal algorithm for computing the largest anti-chain of a partially ordered set given its decomposition into its chains. Our algorithm requires O(n2m) comparisons where n is the number of chains and m is the maximum number of elements in any chain. We also give an adversary argument to prove that this is a lower bound. Our second problem requires us to find if the given poset is a total order. Our optimal algorithm requires 0(mn log n) comparisons. These algorithms have applications in distributed debugging and recovery in distributed systems.

关键词： PARTIALLY-ORDERED SETS distributed debugging distributed COMPUTING COMBINATORIAL PROBLEMS

来源：评论

学校读者我要写书评

暂无评论

Enhancing distributed event predicate detection algorithms

引用

IEEE TRANSACTIONS ON PARALLEL AND distributed SYSTEMS 1996年第7期7卷 673-676页

作者： Chiou, HK Korfhage, W DE SHAW & CO LP CAMBRIDGEMA 02139

Recently published algorithms for matching concurrent sets of events have the problem of unbounded message queue growth if events arrive in an undesirable order. This paper presents some algorithms that mitigate this ... 详细信息

关键词： event predicate concurrent events distributed debugging distributed algorithm causal dependency

来源：评论

学校读者我要写书评

暂无评论

Detecting global predicates in distributed systems with clocks

引用

distributed COMPUTING 2000年第2期13卷 85-98页

作者： Stoller, SD Indiana Univ Dept Comp Sci Bloomington IN 47405 USA

This paper proposes a framework for detecting global state predicates in systems of processes with approximately-synchronized real-time clocks. Timestamps from these clocks are used to define two orderings on events: "definitely occurred before" and "possibly occurred before". These orderings lead naturally to definitions of 3 distinct detection modalities, i.e., 3 meanings of "predicate Phi held during a computation", namely: Poss (db) Phi ("Phi, possibly held"), Def (db) Phi, ("Phi definitely held"), and Inst Phi ("Phi definitely held in a specific global state"). This paper defines these modalities and gives efficient algorithms for detecting them. The algorithms an: based on algorithms of Garg and Waldecker, Alagar and Venkatesan, Cooper and Marzullo, and Fromentin and Raynal. Complexity analysis shows that under reasonable assumptions, these real-time-clock-based detection algorithms are less expensive than detection algorithms based on Lamport's happened-before ordering. Sample applications are given to illustrate the benefits of this approach.

关键词： global predicate detection consistent global states distributed debugging real-time monitoring

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：