ISBN: 0818608757 (print)
The proceedings contain 21 papers. The following topics are dealt with: recovery in distributed systems; managing replication and network partition; fault-tolerance techniques; fault-tolerant protocols; voting and fault diagnosis; experimental systems; and consistency maintenance.
ISBN: 0818606908 (print)
Fault-tolerant distributed algorithms that are designed to reach agreement have been the subject of a great deal of recent study, primarily focussed on the Byzantine agreement paradigm. The author explores new paradigms and problems that arise in the context of maintaining agreement, rather than reaching agreement in an isolated instance. The emphasis is on open problem areas rather than on specific solutions.
ISBN: 0818606908 (print)
A two-step structure is proposed for the decision-making process that is needed when multiple versions of software are utilized to combat the effects of design errors. In addition to providing a simple framework for implementing a variety of adjudication strategies, the structure makes it possible to give a uniform description which encompasses the range of published solutions.
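As a rough illustration of what a two-step decision structure could look like, the Python sketch below first groups the version outputs that agree and then applies a pluggable selection rule to those groups. The abstract does not define the two steps, so this particular split, the tolerance `tol`, and the names `group_results`, `majority`, `plurality`, and `adjudicate` are all assumptions made for illustration, not the author's design.

```python
from typing import Callable, List, Optional, Sequence

Groups = List[List[float]]
Rule = Callable[[Groups, int], Optional[float]]


def group_results(results: Sequence[float], tol: float = 1e-6) -> Groups:
    """Step 1 (assumed): sort the outputs and chain neighbours within tol."""
    groups: Groups = []
    for value in sorted(results):
        if groups and abs(value - groups[-1][-1]) <= tol:
            groups[-1].append(value)
        else:
            groups.append([value])
    return groups


def majority(groups: Groups, n: int) -> Optional[float]:
    """Accept the largest group only if it holds more than half of n outputs."""
    best = max(groups, key=len)
    return best[0] if 2 * len(best) > n else None


def plurality(groups: Groups, n: int) -> Optional[float]:
    """Accept the largest group only if no other group ties with it."""
    sizes = sorted((len(g) for g in groups), reverse=True)
    if len(sizes) > 1 and sizes[0] == sizes[1]:
        return None
    return max(groups, key=len)[0]


def adjudicate(results: Sequence[float], rule: Rule,
               tol: float = 1e-6) -> Optional[float]:
    """Step 2 (assumed): apply the chosen rule; None means no decision."""
    if not results:
        return None
    return rule(group_results(results, tol), len(results))
```

Swapping `majority` for `plurality` changes the adjudication strategy without touching the grouping step, which is the kind of uniform description the abstract alludes to.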
ISBN: 0769502911 (print)
There is an increasing demand for off-the-shelf (OTS) software components to facilitate the development of software systems. Distributed software systems are often too complex to develop from scratch. Therefore, distributed system designers are motivated to deploy trusted software components, resulting in a component-based system. Using OTS components could indicate more reliable software. However, the sensitivity of the system reliability to component reliabilities needs further investigation based on reliability analysis models and techniques that are suitable for distributed component-based software. The distributed nature of these systems further forces the analysis technique to incorporate link and delivery channel reliabilities. This paper proposes a reliability analysis technique for distributed software systems. The technique is based on scenarios that are modeled as sequence diagrams. Using scenarios, we construct Component-Dependency Graphs (CDGs). CDGs have been introduced for reliability analysis of component-based systems. They are extended here to serve the complex nature of distributed systems by applying nesting and hierarchy. CDGs include component and link reliabilities, which are treated as first-class elements of the model. Based on CDGs, we present an algorithm to analyze the sensitivity of system reliability to the reliabilities of its components, subsystems, and links. The proposed analysis technique is useful in identifying critical components and critical component links. An example based on a medical informatics standard is presented to illustrate our methodology.
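The idea of propagating component and link reliabilities through a dependency graph can be sketched in a few lines of Python. This is a deliberately simplified model and not the paper's algorithm: it assumes an acyclic graph, ignores nesting and hierarchy, and estimates sensitivity by a finite difference. The graph encoding and the names `system_reliability` and `sensitivity` are hypothetical.

```python
from typing import Dict, List, Tuple

# node -> list of (next node, transition probability, link reliability)
Graph = Dict[str, List[Tuple[str, float, float]]]


def system_reliability(graph: Graph, comp_rel: Dict[str, float],
                       node: str) -> float:
    """Reliability of an execution starting at node (assumes a DAG)."""
    edges = graph.get(node, [])
    if not edges:
        # Terminal component: only its own reliability matters.
        return comp_rel[node]
    downstream = sum(
        prob * link_rel * system_reliability(graph, comp_rel, nxt)
        for nxt, prob, link_rel in edges
    )
    return comp_rel[node] * downstream


def sensitivity(graph: Graph, comp_rel: Dict[str, float], start: str,
                component: str, delta: float = 1e-4) -> float:
    """Backward-difference estimate of d(system reliability)/d(component)."""
    lowered = dict(comp_rel)
    lowered[component] = comp_rel[component] - delta
    base = system_reliability(graph, comp_rel, start)
    return (base - system_reliability(graph, lowered, start)) / delta


# Illustrative numbers only; they are not taken from the paper.
graph: Graph = {"A": [("B", 0.6, 0.99), ("C", 0.4, 0.98)], "B": [], "C": []}
comp_rel = {"A": 0.99, "B": 0.95, "C": 0.90}
```

Ranking components by `sensitivity` gives a crude way to spot the critical ones: in this toy graph, component `A` lies on every path and therefore dominates.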
In this paper we describe an infrastructure that provides increased reliability for three-tier applications, transparently, using commercial off-the-shelf application servers and database systems. In this infrastructure the application servers are actively replicated to protect the business logic processing. Replicating the transaction coordinator renders the two-phase commit protocol non-blocking and thus avoids potentially long service disruptions caused by coordinator failure. A thin interpositioning library provides client-side automatic failover, so that clients know the outcome of their requests. The interaction between the application servers and the database servers is handled through replicated gateways that prevent duplicate requests from reaching the database servers. Aborted transactions, caused by process or communication faults, are automatically retried on the client's behalf.
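A minimal single-process sketch of two of these ideas, duplicate suppression at a gateway and automatic retry of aborted transactions, might look as follows in Python. The abstract does not describe the gateway's interface, so `DedupGateway`, `TransactionAborted`, and `run_with_retry` are hypothetical names, and the in-memory dictionary stands in for state that a real deployment would have to replicate across gateway instances.

```python
from typing import Callable, Dict, Hashable, TypeVar

T = TypeVar("T")


class TransactionAborted(Exception):
    """Hypothetical signal that a transaction aborted and may be retried."""


class DedupGateway:
    """Forwards each request ID to the database operation at most once."""

    def __init__(self) -> None:
        self._results: Dict[Hashable, object] = {}

    def submit(self, request_id: Hashable, operation: Callable[[], T]) -> T:
        if request_id not in self._results:
            # Only a successful run is recorded; an exception leaves no entry.
            self._results[request_id] = operation()
        return self._results[request_id]  # type: ignore[return-value]


def run_with_retry(gateway: DedupGateway, request_id: Hashable,
                   operation: Callable[[], T], attempts: int = 3) -> T:
    """Retry an aborted transaction on the client's behalf."""
    for attempt in range(attempts):
        try:
            return gateway.submit(request_id, operation)
        except TransactionAborted:
            if attempt == attempts - 1:
                raise
    raise ValueError("attempts must be at least 1")
```

Because only completed operations are cached, a retry after an abort re-executes the work, while a duplicate of an already completed request returns the stored outcome without touching the database again.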
ISBN: 076951300X (print)
This paper describes a case study in the testing of distributed systems. The software under test is a middleware system developed in Java. The full test life cycle is examined, including unit testing, integration testing, and system testing. Where possible, traditional tools and techniques are used to carry out the testing. One aspect where this is not possible is the testing of the low-level concurrency, which is often overlooked when testing commercial distributed systems, since the middleware or application server is already developed by a third party and is assumed to operate correctly. This paper examines testing the middleware system itself and, therefore, a method for testing the concurrency properties of the system is used. The testing revealed a number of faults and design weaknesses, and showed that, with some adaptation, traditional tools and techniques go a long way in the testing of distributed applications.
ISBN: 0780366158 (print)
This paper presents four models to demonstrate our techniques for optimizing software and hardware reliability for fault-tolerant distributed systems. The models help us find the optimal system structure while considering basic information on reliability and cost of the available software and hardware components. Each model is suitable for a distinct set of conditions or situations. All four models maximize reliability while meeting cost constraints. The Simulated Annealing optimization algorithm is selected to demonstrate system reliability optimization techniques for distributed systems because of its flexibility in applying to various problem types with various constraints, as well as its efficiency in computation time. It provides satisfactory reliability results while meeting the constraints.
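To make the optimization concrete, here is a small Python sketch of simulated annealing applied to a redundancy-allocation problem: choose how many redundant copies to run in each subsystem so that reliability is maximized within a cost budget. The abstract does not specify the four models, so the series-of-parallel-subsystems model, the cooling schedule, and the names `reliability`, `cost`, and `anneal` are assumptions for illustration only.

```python
import math
import random
from typing import List, Sequence


def reliability(levels: Sequence[int], comp_rel: Sequence[float]) -> float:
    """Series system of subsystems, each with n identical parallel copies."""
    result = 1.0
    for n, p in zip(levels, comp_rel):
        result *= 1.0 - (1.0 - p) ** n
    return result


def cost(levels: Sequence[int], comp_cost: Sequence[float]) -> float:
    return sum(n * c for n, c in zip(levels, comp_cost))


def anneal(comp_rel: Sequence[float], comp_cost: Sequence[float],
           budget: float, steps: int = 5000, t0: float = 1.0,
           cooling: float = 0.999, seed: int = 0) -> List[int]:
    """Return the best feasible redundancy levels found."""
    rng = random.Random(seed)
    current = [1] * len(comp_rel)
    if cost(current, comp_cost) > budget:
        raise ValueError("budget too small for one copy per subsystem")
    best = current[:]
    temp = t0
    for _ in range(steps):
        # Neighbour: add or remove one copy in a random subsystem.
        cand = current[:]
        i = rng.randrange(len(cand))
        cand[i] = max(1, cand[i] + rng.choice((-1, 1)))
        if cost(cand, comp_cost) <= budget:
            delta = reliability(cand, comp_rel) - reliability(current, comp_rel)
            # Always accept improvements; accept worse moves with
            # probability exp(delta / temp), which shrinks as temp cools.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current = cand
                if reliability(current, comp_rel) > reliability(best, comp_rel):
                    best = current[:]
        temp *= cooling
    return best
```

Infeasible neighbours are simply rejected here; a penalty term in the objective is a common alternative when the feasible region is hard to reach.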
ISBN: 0818606908 (print)
An approach is presented that will allow database applications to increase availability in the face of network partitions and other communications failures, by permitting a controlled amount of nonserializable database activity. The underlying replicated database substrate ensures mutual consistency, without serializability, by timestamping all updates issued by database interactions. Compensating actions, triggered by exception conditions in the database, attempt to correct problems arising from nonserializable execution or notify human agents to investigate and correct the problem. Probabilistic concurrency control uses a controlled amount of inter-site synchronization to reduce the likelihood of nonserializable execution and the burden of compensation, at the cost of slightly reduced availability. This approach, illustrated by means of examples, allows application designers to tailor the system to achieve any desired balance between availability and consistency.
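One way timestamps can yield mutual consistency without serializability is a last-writer-wins rule: every replica keeps, per key, the update with the highest timestamp, so replicas that have seen the same set of updates agree regardless of delivery order. The Python sketch below illustrates that rule only. The abstract does not say which reconciliation rule the system uses, so last-writer-wins, the site-ID tie-break, and the names `Update` and `Replica` are assumptions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Update:
    key: str
    value: str
    timestamp: int  # assumed to come from a logical or synchronized clock
    site: str       # issuing site, used only to break timestamp ties


class Replica:
    """Keeps, for each key, the update with the largest (timestamp, site)."""

    def __init__(self) -> None:
        self._store: Dict[str, Update] = {}

    def apply(self, update: Update) -> None:
        current = self._store.get(update.key)
        if current is None or (
            (update.timestamp, update.site) > (current.timestamp, current.site)
        ):
            self._store[update.key] = update

    def snapshot(self) -> Dict[str, str]:
        return {key: upd.value for key, upd in self._store.items()}
```

Two replicas that receive the same updates in different orders, for example on either side of a healed partition, end with identical snapshots, although the resulting state need not correspond to any serial execution.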
The Byzantine Generals problem involves a system of N processes, t of which may be unreliable. The problem is for the reliable processes to agree on a binary value sent by a 'general', which may itself be one of the N processes. If the general sends the same value to each process, then all reliable processes must agree on that value; in any case, they must agree on the same value. An explicit solution is given for a binary value among N = 3t + 1 processes, using 2t + 4 rounds and O(t^3 log t) message bits, where t bounds the number of faulty processes. This solution is easily extended to the general case of N ≥ 3t + 1 to give a solution using 2t + 5 rounds and O(tN + t^3 log t) message bits.
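The round counts quoted in the abstract are easy to tabulate. The short Python sketch below only restates that arithmetic; the helper names `max_faults` and `rounds` are hypothetical and nothing here implements the agreement protocol itself.

```python
def max_faults(n: int) -> int:
    """Largest t that satisfies n >= 3t + 1."""
    if n < 1:
        raise ValueError("need at least one process")
    return (n - 1) // 3


def rounds(n: int, t: int) -> int:
    """Round count quoted in the abstract for n processes and t faults."""
    if t < 0 or n < 3 * t + 1:
        raise ValueError("the solution requires n >= 3t + 1")
    # 2t + 4 rounds when n is exactly 3t + 1, 2t + 5 in the general case.
    return 2 * t + 4 if n == 3 * t + 1 else 2 * t + 5
```

For instance, seven processes tolerate two faulty ones and, since 7 = 3·2 + 1, the quoted bound is 8 rounds; with ten processes and the same two faults the general-case bound of 9 rounds applies.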
ISBN: 0769526772 (print)
Middleware-based database replication approaches have emerged in the last few years as an alternative to traditional database replication implemented within the database kernel. A middleware approach enables third-party vendors to provide high availability solutions, a growing practice nowadays in the software industry. However, middleware solutions often lack scalability and exhibit a number of consistency and performance issues. The reason is that in most cases the middleware has to handle the database as a black box, and hence cannot take advantage of the many optimizations implemented in the database kernel. Thus, middleware solutions often reimplement key functionality but cannot achieve the same efficiency as a kernel implementation. Reflection has been proposed during the last decade as a fruitful paradigm to separate non-functional aspects from functional ones, simplifying software development and maintenance whilst fostering reuse. However, fully reflective databases are not feasible due to the high cost of reflection. Our claim is that by exposing some minimal database functionality through a lightweight reflective interface, efficient and scalable middleware database replication can be attained. In this paper we explore a wide variety of such lightweight reflective interfaces and discuss what kind of replication algorithms they enable. We also discuss implementation alternatives for some of these interfaces and evaluate their performance.