Nowadays, there are many protocols able to cope with process crashes, but, unfortunately, a process crash represents only a particular faulty behavior. Handling tougher failures (e.g. sending omission failures, receiv...
详细信息
Nowadays, there are many protocols able to cope with process crashes, but, unfortunately, a process crash represents only a particular faulty behavior. Handling tougher failures (e.g. sending omission failures, receive omission failures, arbitrary failures) is a real practical challenge due to malicious attacks or unexpected software errors. This is usually achieved either by changing, in an ad hoc manner, the code of a crash resilient protocol or by devising a new protocol from scratch. This paper proposes an alternative methodology to detect processes experiencing arbitrary failures. On this basis, it introduces the notions of liveness failure detector and safety failure detector as two independent software components. With this approach, the nature of failures experienced by processes becomes transparent to the protocol using the components. This methodology brings a few advantages: it makes possible to increase the resilience of a protocol designed in a crash failure context without changing its code by concentrating only on the design of a few well-specified components, and second, it clearly separates the task of designing the protocol from the task of detecting faulty processes, a methodological improvement. Finally, the feasibility of this approach is shown, by providing an implementation of liveness failure detectors and of safety failure detectors for two protocols: one solving the consensus, and the second solving the problem of globaldatacomputation. (c) 2007 Elsevier B.V. All rights reserved.
暂无评论