ISBN: 0818608757 (print)
The authors propose two protocols for transaction processing in quasi-partitioned databases. The protocols are pessimistic in that they permit the execution of update transactions in exactly one partition. The first protocol is defined for a fully partition-replicated database in which every partition contains a copy of every data object. The second protocol is defined for a partially partition-replicated database in which some objects have no copies in some partitions. For both protocols, the major performance measures improve linearly with the backup link speed but are not visibly affected by either the duration of the partitioning or the database size. This is a desirable property, since the backup link speed is the only controllable parameter.
ISBN: 0818608757 (print)
Highly reliable and effective failure detection and isolation (FDI) software is crucial in modern avionics systems that tolerate hardware failures in real time. The FDI function is an excellent opportunity for applying the principle of software design diversity to the fullest, i.e., algorithm diversity, in order to provide gains in functional performance and potentially enhance the reliability of the software. The authors examine algorithm diversity applied to the redundancy management software for a hardware fault-tolerant sensor array. Results of an experiment are presented that show the performance gains that can be provided by utilizing the consensus of three diverse algorithms for sensor FDI.
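The idea of taking the consensus of three diverse FDI algorithms can be sketched as a simple two-out-of-three vote. The abstract does not say which algorithms the authors used, so the three detectors below (`detect_by_median`, `detect_by_mean`, `detect_by_pairwise`), the tolerance `tol`, and the `quorum` parameter are all hypothetical stand-ins chosen only for illustration.

```python
from collections import Counter
from statistics import median


def detect_by_median(readings, tol):
    """Hypothetical detector 1: flag readings farther than tol from the median."""
    m = median(readings)
    return {i for i, r in enumerate(readings) if abs(r - m) > tol}


def detect_by_mean(readings, tol):
    """Hypothetical detector 2: flag readings farther than tol from the mean."""
    mean = sum(readings) / len(readings)
    return {i for i, r in enumerate(readings) if abs(r - mean) > tol}


def detect_by_pairwise(readings, tol):
    """Hypothetical detector 3: flag a sensor that disagrees with most peers."""
    n = len(readings)
    flagged = set()
    for i in range(n):
        disagreements = sum(
            1 for j in range(n) if j != i and abs(readings[i] - readings[j]) > tol
        )
        if disagreements > (n - 1) / 2:
            flagged.add(i)
    return flagged


# Assumed set of diverse algorithms; not the ones used in the paper.
DETECTORS = (detect_by_median, detect_by_mean, detect_by_pairwise)


def consensus_fdi(readings, tol, quorum=2):
    """Declare a sensor failed only if at least `quorum` detectors flag it."""
    votes = Counter()
    for detector in DETECTORS:
        votes.update(detector(readings, tol))
    return {i for i, count in votes.items() if count >= quorum}
```

With readings `[10.0, 10.2, 9.8, 14.0]` and `tol=1.1`, the mean-based detector flags sensors 2 and 3, because the outlier drags the mean upward, while the consensus flags only sensor 3. This is the kind of false alarm that a vote among diverse algorithms can suppress.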
ISBN: 0818608757 (print)
An approach to checkpointing and rollback recovery in a distributed computing system using a common time base is proposed. First, a common time base is established in the system using a hardware clock synchronization algorithm. This common time base is coupled with a pseudorecovery block approach to develop a checkpointing algorithm that has the following advantages: (i) maximum process autonomy, (ii) no waiting for commitment when establishing recovery lines, (iii) fewer messages to be exchanged, and (iv) lower memory requirements.
ISBN: 0818608757 (print)
The authors present an election protocol that does not assume an underlying ring structure and that tolerates failures, including lost messages and network partitioning, during the execution of the protocol itself. The major problem to be solved is that when nodes cannot communicate with one another or messages are lost, a conflict in resolving the election will often arise. In the authors' approach, the conflict is detected by the cohorts (noncandidate participants in the election). Related election protocols are discussed, and the system model is described together with assumptions about the communication subsystem. The protocol and the lost-message situations are then examined.
ISBN: 0818608730 (print)
The authors examine optimal task allocation for redundant, heterogeneous distributed computer systems. It is assumed that the systems under consideration are required for execution of long-term mission applications such as space flights. A formal description of the problem is given and formal, quantitative task allocation models are derived. Both an optimal allocation algorithm and an approximating optimal algorithm are derived, discussed, and compared by simulation results. For the latter case, a formula for computing the error associated with the approximation used is also presented.
ISBN: 0818608757 (print)
A description is given of the results of a study of methods of achieving fault tolerance in the Clouds system and, in particular, of achieving increased availability of objects. The problems explored in this work, the model of distributed computation in which the problems posed by the research were examined (the Clouds system), the tools that were used to address these problems (the Aeolus programming language), and some related research are briefly described. The authors present a methodology for achieving available services by conversion of resilient single-site implementations into replicated implementations. A mechanism with which they propose to support this methodology, called distributed locking (DL), is presented. A description is also given of a linguistic feature for the specification of the availability properties of an object replicated via DL. The language runtime support features (primitives) required for DL and the operating system support needed for these features are presented.
ISBN: 0818608757 (print)
A checkpoint algorithm is presented that benefits from the research in concurrency control, commit, and site recovery algorithms in transaction processing. In the authors' approach a number of checkpointing processes, a number of rollback processes, and computations on operational processes can proceed concurrently while tolerating the failure of an arbitrary number of processes. Each process takes checkpoints independently. During recovery after a failure, a process invokes a two-phase rollback algorithm. It collects information about relevant message exchanges in the system in the first phase and uses it in the second phase to determine both the set of processes that must roll back and the set of checkpoints up to which rollback must occur. Concurrent rollbacks are completed in the order of the priorities of the recovering processes. The proposed solution is optimistic in the sense that it does well if failures are infrequent by minimizing overhead during normal processing.
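The second phase described above, deciding which processes must roll back and to which checkpoints, can be illustrated with a standard orphan-message propagation. This is a minimal sketch and not the authors' algorithm. It assumes a single global timeline, an initial checkpoint at time 0 for every process, message times of at least 1, and that the first phase has already gathered the message list. The function `rollback_line` and its data layout are hypothetical.

```python
import math


def rollback_line(checkpoints, messages, failed):
    """Return {process: checkpoint time to restore} for every process that rolls back.

    checkpoints: {process: list of checkpoint times, including 0}
    messages:    list of (sender, sent_at, receiver, received_at)
    failed:      the process that crashed (assumed to crash after its last checkpoint)
    A checkpoint taken at time c is assumed to capture all events with time <= c.
    """
    # math.inf means "state is current, no rollback needed".
    restore = {p: math.inf for p in checkpoints}
    restore[failed] = max(checkpoints[failed])

    changed = True
    while changed:
        changed = False
        for sender, sent_at, receiver, received_at in messages:
            send_undone = sent_at > restore[sender]
            receipt_kept = received_at <= restore[receiver]
            if send_undone and receipt_kept:
                # Orphan message: the receiver must forget the receipt too.
                restore[receiver] = max(
                    c for c in checkpoints[receiver] if c < received_at
                )
                changed = True

    return {p: t for p, t in restore.items() if t != math.inf}
```

For example, with checkpoints `{"A": [0, 5], "B": [0, 4, 8], "C": [0, 6], "D": [0, 3]}`, messages `[("C", 2, "A", 3), ("A", 6, "B", 7), ("B", 5, "C", 9), ("D", 1, "B", 2)]`, and failed process `"A"`, the rollback cascades from A to B to C while D is left untouched.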
In real-time database systems, a transaction may not have enough time to complete. In such cases, partial, or imprecise, results can still be produced. The authors have proposed an imprecise result mechanism for producing partial results, which is used to implement timing error recovery in real-time database systems. They also present a model of real-time systems that distinguishes the external data consistency from the internal data consistency maintained by non-real-time systems. Providing a timely response may require sacrificing internal consistency. The authors discuss three examples that have different requirements of data consistency and present algorithms for implementing them.
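The general notion of returning an imprecise result when a deadline arrives can be sketched as follows. The abstract does not describe the mechanism itself, so this is only a generic illustration. The function `imprecise_sum`, its return shape, and the injectable `clock` parameter are assumptions introduced here.

```python
import time


def imprecise_sum(values, deadline, clock=time.monotonic):
    """Sum values until the deadline; return (total, items_processed, is_precise).

    clock is injectable so the behavior can be checked deterministically;
    by default it is the monotonic system clock.
    """
    total = 0
    processed = 0
    for v in values:
        if clock() >= deadline:
            break  # out of time: return what has been computed so far
        total += v
        processed += 1
    return total, processed, processed == len(values)
```

A caller would typically pass `deadline=time.monotonic() + budget_seconds` and inspect the `is_precise` flag to decide whether the partial total is acceptable for its consistency requirement.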
ISBN: 0818607378 (print)
The following topics are dealt with: reliability issues in distributed operating systems; communications and control; distributed systems; replicated data reliability; object-based systems; concurrency and synchronization; representing faulty distributed systems as nondeterministic sequential systems; algorithms for maintaining data availability and agreement. 20 papers were presented, all of which are published in full in the present proceedings.
ISBN: 0818607378 (print)
It is suggested that it is helpful to study reliable distributed systems from the point of view of nondeterministic sequential systems. The faulty and distributed nature of systems can be captured by nondeterminism, so there is a unity to the study of faulty and fault-free systems: faulty systems are at one end of the spectrum and fault-free systems are at the other. Similarly, there is a unity to the study of distributed and sequential systems. It is suggested that all systems, whether faulty or fault-free, whether distributed or sequential, can be handled in a unified way by treating the system as a nondeterministic sequential program.