ISBN: 0818606908 (print)
The authors address the problem of state inconsistencies (i.e., interacting processes having different and inconsistent views of one another) that arise at the kernel level of distributed systems based on local area networks. Such systems are particularly susceptible to state inconsistencies because entities are highly autonomous and thus may fail independently. The problem is compounded by the inherent delays and errors in communicating events between machines in the network. A description is given of three common classes of events that may cause state inconsistencies: (1) failures of processes, machines, and/or the network; (2) new machines joining or exiting from the system; and (3) processes or hosts migrating from one machine to another in the network. Systematic solutions to the problems, based mainly on the concept of kernel-supported process aliases, are presented. The solutions are structured and easy to understand.
ISBN: 0818606908 (print)
Focus is on recovery of transactions in a distributed DB/DC system. The objective is to use transaction-level structural information to eliminate costly lower-level handshaking protocols, eliminate the need for any centralized recovery management mechanism by making recovery actions local to interacting components, and eliminate propagation of recovery actions to more than one antecedent or precedent component. Progressive recovery is a way of tracking the progress of a transaction to meet the above objective. Transaction processing involves different execution stages (DC, DB, followed by the DC), perhaps on different processors. Some stages make database changes and others are purely transformations of messages. The latter permit re-executions without side effects. The former must be well protected from re-executions. In contrast with optimistic recovery schemes, progressive recovery does not track communication and state dependencies.
ISBN: 0818607491 (print)
The following topics are dealt with: real-time distributed programming systems; architecture and interconnection schemes; fault tolerance; reliability estimation and performance modeling; performance analysis; intercommunication protocols; operative systems; dynamic and distributed scheduling; task allocation and load balancing; real-time operating system for nuclear power plant computer; real-time juggling robot; and real-time direct kinematics on a VLSI chip. 30 papers were presented, all of which are published in full in the present proceedings. Abstracts of individual papers can be found under the classification codes in this or other issues.
ISBN: 0818607491 (print)
The goal of checkpointing in database management systems is to save database states on a separate secure device so that the database can be recovered when errors and failures occur. Recently, the possibility of having a checkpointing mechanism which does not interfere with the transaction processing has been studied. The property of noninterference is highly desirable in real-time applications, where restricting transaction activity during the checkpointing operation is in many cases not feasible. The practicality of a noninterfering checkpointing algorithm is studied here by analyzing the extra workload of the algorithm. Noninterfering checkpointing results in some overhead on the one hand, and increases the system availability on the other hand. For applications where the ability to process transactions continuously is so critical that blocking transaction processing for checkpointing is not feasible, it is believed that noninterfering checkpointing provides a practical solution to the problem of constructing globally consistent states in distributed database systems.
ISBN: 0818607491 (print)
An algorithm for dynamically scheduling groups of tasks with arbitrary precedence constraints in loosely coupled distributed systems is presented. Tasks in each group are required to be executed completely or not at all, and must be finished before a specified deadline. If a newly arriving group cannot be scheduled at a local node, an attempt is made to distribute tasks in the group throughout the network. The approach consists of (a) a preprocessing algorithm; (b) a distributed scheduling algorithm; and (c) a compression algorithm. A complexity analysis of the algorithms is presented.
ISBN: 0818607491 (print)
One-to-many (group) interprocess communication is useful in many real-time distributed applications. It may be conveniently and efficiently realized using the multicast feature available in contemporary local area networks. A kernel model which supports reliable group communication in a distributed computing environment is presented. New semantic tools which capture the nondeterminism of the underlying low-level events concisely are introduced and a process alias-based structuring technique for the kernel to handle the reliability problems that may arise during group communication is described. The scheme works by maintaining a close association between group messages and their corresponding reply messages. Sample programs illustrate how the semantic tools may be used.
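The abstract does not spell out the kernel model, so the following is only an illustrative sketch of the one mechanism it names: keeping each group message associated with its reply messages. The class `GroupCall`, its method names, and the use of a message id to match replies are all assumptions made for this example, not the authors' design.

```python
class GroupCall:
    """Tracks replies to one multicast message (hypothetical structure)."""

    def __init__(self, msg_id, members):
        self.msg_id = msg_id          # id carried by the group message
        self.pending = set(members)   # members that have not replied yet
        self.replies = {}             # member -> reply payload

    def on_reply(self, msg_id, member, payload):
        """Accept a reply only if it matches this message and is not a duplicate."""
        if msg_id != self.msg_id or member not in self.pending:
            return False
        self.pending.discard(member)
        self.replies[member] = payload
        return True

    def on_member_failed(self, member):
        """Stop waiting for a member that is known to have failed."""
        self.pending.discard(member)

    def complete(self):
        """True once every live member has replied."""
        return not self.pending
```

In this sketch, stale or duplicate replies are rejected because they do not match an outstanding (message id, member) pair, and a member failure simply shrinks the set of replies still awaited.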
ISBN: 0818607491 (print)
A simple real-time load-sharing algorithm is presented in which the decision to execute a job locally or remotely is made dynamically, on the basis of a simple threshold policy. The selection of the destination node at which the job is to be executed is made probabilistically and independently of the current system state. An approximate analytic performance model is developed and validated through simulation. The performance results suggest that, over a relatively wide range of system parameters, the performance of the algorithm is substantially better than that of extremely simple algorithms and often close to that of a theoretically optimum algorithm.
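A minimal sketch of such a threshold policy follows. It assumes that load is measured as local queue length, that a job stays local while the queue is below the threshold, and that the remote destination is drawn uniformly at random; the abstract specifies none of these details, and the function name `place_job` and its parameters are hypothetical.

```python
import random


def place_job(local_queue_len, threshold, node_ids, local_id, rng=random):
    """Return the id of the node that should run a newly arrived job."""
    # Threshold test (assumed form): keep the job while the local queue is short.
    if local_queue_len < threshold:
        return local_id
    # Otherwise pick a destination without consulting any remote state.
    remote = [n for n in node_ids if n != local_id]
    if not remote:
        return local_id  # no other node exists, so run locally
    return rng.choice(remote)
```

Because the destination is chosen without probing other nodes, the transfer decision costs no extra messages, which is the property the abstract emphasizes.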
ISBN: 0818607491 (print)
For systems containing a large number of processing elements (PEs), the capability to recover from a PE fault is important. The dynamic redundancy (DR) network can tolerate faults in the network and support a system that tolerates PE faults without degradation in performance by adding spare PEs, while retaining the full capability of a multistage cube. A variation of the DR network, the reduced DR (RDR) network, is presented which can be implemented more cost effectively while retaining most of the advantages of the DR. The reliability of systems containing the DR or RDR networks and spare PEs and the reliability of systems with no spare PEs are also estimated and compared. It is shown that using the DR or RDR network and spare PEs in a system can achieve better system reliability over a wide range of N, where N is the number of functioning PEs, than can using any kind of N × N fault-tolerant network and no spares.
ISBN: 0818607491 (print)
The authors present and analyze a receiver-initiated scheduling algorithm for distributed soft real-time systems. The algorithm is based on a 'poll when underloaded' approach. Using simulation, the deadline miss ratio and mean lateness of the algorithm are derived for different workloads and overheads. The performance profile of the algorithm shows stable behavior when load or management overheads are high. Its performance is compared with that of a sender-initiated scheduling algorithm that uses a 'first-fit' strategy. The sender-initiated algorithm outperforms the receiver-initiated algorithm when the load is light. The results show that although there is a clear need for dynamic deadline-oriented scheduling at medium to high loads, a simple algorithm would suffice at low loads.
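The 'poll when underloaded' idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the load measure, the single `low_mark` used both to detect underload and to qualify a donor, the poll limit, and the name `poll_when_underloaded` are all hypothetical, and a dictionary of peer loads stands in for actual poll replies.

```python
def poll_when_underloaded(local_load, low_mark, peer_loads, poll_limit):
    """Return the id of a peer to pull work from, or None.

    peer_loads maps peer id -> load, standing in for the answers
    that real polls would return, in polling order.
    """
    # Only an underloaded node (the receiver) initiates polling.
    if local_load >= low_mark:
        return None
    for polled, (peer, load) in enumerate(peer_loads.items()):
        if polled >= poll_limit:
            break  # bound the polling overhead
        if load > low_mark:
            return peer  # first polled peer with work to spare
    return None
```

Bounding the number of polls is one simple way to keep management overhead from growing with system size; whether the paper does this is not stated in the abstract.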
ISBN: 0818607491 (print)
To meet the growing demand for online transaction processing, several DB (database management) and DC (data communication management) subsystems can be coupled together to form a distributed DB/DC system. A recovery protocol is needed not only to provide for the recovery of transactions affected by the failure, but also to localize recovery operations. Two such protocols based on a progressive approach, namely, a synchronous progressive and an asynchronous progressive protocol, along with a pessimistic protocol, are analyzed. Their performance during normal transaction processing is contrasted with that of a transaction processing system without any recovery protocol. A queuing model is developed and simulated to predict the transaction response time. The progressive recovery approach is found to reduce normal processing overhead and lead to performance improvement over the pessimistic approach.