ISBN: 0818606908 (print)
An analysis is made of a number of interesting actual or potential parallels between the problems and techniques associated with achieving high reliability, and those associated with the provision of security, in distributed computing systems.
Before the heralded potential of distributed computer systems can be realized, the system must be made robust in the face of processor failures. Reassigning the work of a failed processor so that system performance degrades gracefully is one of the most important problems in designing reliable distributed systems. The authors present an algorithm for reassigning the work of a failed processor that attempts to minimize the increased cost caused by the redistribution. This algorithm is based on a technique known as clustering. The authors also present a comprehensive cost function, and discuss its applicability to 'real' systems.
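As a rough illustration of the reassignment problem, the sketch below moves each task of a failed processor to the surviving processor where it adds the least cost. It is a plain greedy heuristic, not the authors' clustering algorithm, which the abstract does not detail. The cost model (processing load plus a penalty for communicating with tasks on other processors) and every name in the code are assumptions made for illustration.

```python
def reassign(assignment, failed, survivors, load, comm):
    """Greedily reassign the tasks of a failed processor (illustrative only).

    assignment: dict task -> processor before the failure
    survivors:  list of processors that are still running
    load:       dict task -> processing cost (hypothetical units)
    comm:       dict (task_a, task_b) -> cost paid when the two tasks
                run on different processors (hypothetical units)
    """
    # Tasks on surviving processors stay where they are.
    placed = {t: p for t, p in assignment.items() if p != failed}
    # Place the heaviest orphaned tasks first.
    orphans = sorted((t for t, p in assignment.items() if p == failed),
                     key=lambda t: load[t], reverse=True)

    def added_cost(task, proc):
        # Load already on the candidate processor plus the task itself.
        proc_load = sum(load[t] for t, p in placed.items() if p == proc)
        # Communication with already-placed partners on other processors.
        remote = sum(c for (a, b), c in comm.items()
                     if (a == task and b in placed and placed[b] != proc)
                     or (b == task and a in placed and placed[a] != proc))
        return proc_load + load[task] + remote

    for task in orphans:
        placed[task] = min(survivors, key=lambda proc: added_cost(task, proc))
    return placed
```

A heuristic like this only approximates a minimum-cost redistribution; it is meant to show the shape of the cost trade-off rather than the paper's method.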
A replicated database system is a distributed database system in which some data objects are stored redundantly at multiple sites to improve the reliability of the system. Without proper control mechanisms, the consistency of a replicated database system might be violated. A scheme to increase the reliability as well as the degree of concurrency is described. It allows transactions to operate on a data object if more than one token copy is available. The scheme also exploits the fact that, for recovery reasons, there are two values for one data object. Proof that the proposed scheme guarantees consistency is provided. Some variations of the scheme are discussed.
This study is concerned with the establishment of a global time base in a distributed real-time system. It is shown that at least two different time references, an approximate global time and an approximate political time, must be available in each node. The granularity of the global time is determined by the achievable synchronism of the local real-time clocks. The decisive factor for the accuracy of clock synchronization is the variability of the message delay. A quantitative analysis of the achievable accuracy of clock synchronization in systems with and without a layered communication architecture is presented. Finally, three functions on the approximate global event times are introduced in order to support the causal analysis of events.
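Why the variability of the message delay limits synchronization accuracy can be seen in a small sketch. It assumes a single one-way timestamped message whose delay is known only to lie between `d_min` and `d_max`. The function name and parameters are hypothetical and are not taken from the study.

```python
def estimate_offset(sender_ts, receiver_ts, d_min, d_max):
    """Estimate (receiver clock - sender clock) from one timestamped message.

    The receiver observes receiver_ts = sender_ts + delay + offset, where the
    true delay is unknown but assumed to lie in [d_min, d_max].
    Returns (estimate, max_error).
    """
    if d_min > d_max:
        raise ValueError("d_min must not exceed d_max")
    # Guessing the midpoint of the delay interval minimizes the worst case.
    midpoint_delay = (d_min + d_max) / 2
    estimate = receiver_ts - sender_ts - midpoint_delay
    # The worst-case error is half the delay variability, not the delay itself.
    max_error = (d_max - d_min) / 2
    return estimate, max_error
```

Under these assumptions a long but constant delay costs no accuracy, whereas a short but variable one does.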
A controversial point in designing a distributed system is whether the user or the system should be responsible for taking actions as a consequence of system failures. The author proposes a dynamic configuration scheme for runtime reconfiguration of application software that is more flexible than existing proposals. A description is given of a reconfigurable scheme implemented by the operating system which only requires that software components be virtually connected. The authors demonstrate that the scheme increases the reliability and availability of distributed systems, and compare and contrast this scheme with other similar proposals.
A two-step structure is proposed for the decision-making process that is needed when multiple versions of software are utilized to combat the effects of design errors. In addition to providing a simple framework for implementing a variety of adjudication strategies, the structure makes it possible to give a uniform description which encompasses the range of published solutions.
N-modular redundancy (NMR) protects against arbitrary types of hardware or software failures in a minority of system components, thereby yielding the highest degree of reliability. A study is made of the application of NMR, specifically triple modular redundancy (TMR), to general-purpose database processing. The authors discuss the structure and implementation tradeoffs of a TMR system that is 'synchronized' at the transaction level, i.e., in which complete transactions are distributed to all nodes, where they are processed independently, and only the majority output is accepted. The inherent cost of such a TMR database system is examined on the basis of preliminary performance results from a version implemented on three SUN-2/120 workstations.
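Transaction-level voting of this kind can be sketched in a few lines. The sketch assumes each node is a callable that processes a complete transaction and returns a hashable result. The names `majority_output` and `run_transaction` are hypothetical and do not come from the paper's implementation.

```python
from collections import Counter


def majority_output(outputs):
    """Return the value produced by a strict majority of nodes, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def run_transaction(transaction, nodes):
    """Send the complete transaction to every node and vote on the results."""
    # Each node processes the transaction independently of the others.
    outputs = [node(transaction) for node in nodes]
    return majority_output(outputs)
```

With three nodes, one faulty result is outvoted by the two that agree. If all three disagree, no majority exists and the sketch returns `None`.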
Fault-tolerant distributed algorithms that are designed to reach agreement have been the subject of a great deal of recent study, primarily focussed on the Byzantine agreement paradigm. The author explores new paradigms and problems that arise in the context of maintaining agreement, rather than reaching agreement in an isolated instance. The emphasis is on open problem areas rather than on specific solutions.
An approach is presented that will allow database applications to increase availability in the face of network partitions and other communications failures, by permitting a controlled amount of nonserializable database activity. The underlying replicated database substrate ensures mutual consistency, without serializability, by timestamping all updates issued by database interactions. Compensating actions, triggered by exception conditions in the database, attempt to correct problems arising from nonserializable execution or notify human agents to investigate and correct the problem. Probabilistic concurrency control uses a controlled amount of inter-site synchronization to reduce the likelihood of nonserializable execution and the burden of compensation, at the cost of slightly reduced availability. This approach, illustrated by means of examples, allows application designers to tailor the system to achieve any desired balance between availability and consistency.
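One way timestamps can give mutual consistency without serializability is sketched below. The abstract says only that all updates are timestamped. The "newest timestamp wins" merge rule, the site-identifier tie-break, and all names are assumptions made for illustration.

```python
def merge_replicas(local, remote):
    """Merge two replicas, keeping the newest version of every key.

    Each replica maps key -> (timestamp, site_id, value). Versions are
    ordered by (timestamp, site_id), so ties between sites are broken the
    same way everywhere. This rule is an assumption of the sketch.
    """
    merged = dict(local)
    for key, version in remote.items():
        if key not in merged or version[:2] > merged[key][:2]:
            merged[key] = version
    return merged
```

Because the outcome does not depend on the order in which replicas exchange updates, sites that were partitioned converge to the same state once they reconnect. The converged state need not match any serial execution, which is the gap that the compensating actions described in the abstract address.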
One property that makes failures difficult to handle in programs is that the actions of a failed component may occur asynchronously with respect to execution of the program. In this study, an approach to dealing with this asynchrony is presented. It is based on treating a failure as an event in a concurrent system of processes, and then integrating failure handling mechanisms into distributed programming languages. The technique is illustrated by considering the class of failures suffered by fail-stop processors, and proposing extensions of the Synchronizing Resources (SR) distributed programming language to handle such failures. Two SR programs using these mechanisms are presented.