Multicast communication in a distributed system connected by a local area network can increase parallelism, and it can also provide greater functionality than one-to-one communication. In the authors' multicast protocol, the sender directs a message to a named group of receivers, which can be specified by function without requiring the sender to know the specific members of the group. Each host's kernel in the network can respond to every group message sent, providing various levels of reliability. The overhead of providing dependable multicast over a single local area network was found to be very small, mainly because the protocol operates at the kernel level rather than the user level. Several forms of this multicast communication, expressed as simple message-passing communication primitives, are described, and the effectiveness of the protocol is evaluated using an example of a distributed algorithm. Performance analyses and actual performance data for the protocol are presented.
The use of local area network (LAN) technology for distributed process control is addressed. This paper highlights some current problems, and focuses on some unique new aspects of distributed control which are necessary for providing a properly integrated approach to distributed supervisory control systems. A subscription service satisfies the need to move data efficiently between supervisory computers, taking into account the possibilities of multiple sources (redundancy) and multiple users of the data, making use of the broadcast capabilities of Ethernet to minimise overhead. A hot-standby pair configuration allows two supervisory computers on the network to operate in a redundant manner, providing high reliability. A global facility secure service allows legislating or locking of access to a shared resource (such as part of a database) on another supervisory computer. This permits multiple operations (which are divisible in time) to be safely done. A virtual display system allows a comprehensive set of operator displays on one supervisory computer to be used by an operator display station on another computer. The concept of servicing the display over the network, rather than transferring the data, results in a reduced load, both for the network and for the computers.
ISBN: 0818608757 (print)
In real-time database systems, a transaction may not have enough time to complete. In such cases, partial, or imprecise, results can still be produced. The authors have proposed an imprecise result mechanism for producing partial results, which is used here to implement timing error recovery in real-time database systems. They also present a model of real-time systems that distinguishes the external data consistency from the internal data consistency maintained by non-real-time systems. Providing a timely response may require sacrificing internal consistency. The authors discuss three examples that have different requirements of data consistency and present algorithms for implementing them.
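The idea of returning a partial result when a deadline expires can be illustrated with a minimal sketch. It is not the authors' mechanism: the query (a running average), the names `Result`, `average_with_deadline`, and `tick_clock`, and the use of an injected clock are all assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Result:
    value: float      # best estimate available when the query stopped
    rows_seen: int    # how much of the input contributed to the estimate
    precise: bool     # False if the deadline cut the computation short


def average_with_deadline(rows, deadline, clock):
    """Average `rows`, returning an imprecise result if the deadline passes.

    `clock` is any zero-argument callable returning the current time
    (hypothetical interface; a real system might pass time.monotonic).
    """
    total, seen = 0.0, 0
    for x in rows:
        if clock() >= deadline:
            # Timing error: recover by handing back what has been computed.
            return Result(total / seen if seen else 0.0, seen, False)
        total += x
        seen += 1
    return Result(total / seen if seen else 0.0, seen, True)


def tick_clock():
    """Deterministic stand-in for a real clock: returns 0, 1, 2, ... per call."""
    ticks = itertools.count()
    return lambda: next(ticks)


readings = [10.0, 20.0, 30.0, 40.0, 50.0]
late = average_with_deadline(readings, deadline=3, clock=tick_clock())
on_time = average_with_deadline(readings, deadline=100, clock=tick_clock())
```

Here `late` is computed from only the first three rows and is marked as imprecise, while `on_time` covers all rows. The caller decides whether an imprecise answer is acceptable.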
ISBN: 0818608757 (print)
The authors deal with the task allocation problem in distributed software design, with the goal of maximizing the system reliability. A quantitative problem model, algorithms for optimal and suboptimal solutions, and simulation results are provided and discussed. Because the authors use a new allocation goal, maximizing system reliability, this study complements the existing body of knowledge in task allocation.
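As a rough illustration of reliability-driven allocation, the sketch below searches all assignments exhaustively. The abstract does not give the authors' model, so the one used here is an assumption: each processor has a constant failure rate, each task has a fixed execution time, the system reliability is the exponential of the negated sum of rate times time, and each processor has a load capacity. All names are hypothetical.

```python
import itertools
import math


def reliability(assign, exec_time, fail_rate):
    """Probability that no processor fails while running its tasks
    (assumed model: independent exponential failures)."""
    exposure = sum(fail_rate[p] * exec_time[t] for t, p in enumerate(assign))
    return math.exp(-exposure)


def best_allocation(exec_time, fail_rate, capacity):
    """Try every task-to-processor assignment; keep the most reliable
    one that respects each processor's capacity. Exponential cost, so
    only suitable for tiny instances."""
    n_proc = len(fail_rate)
    best = None
    for assign in itertools.product(range(n_proc), repeat=len(exec_time)):
        load = [0.0] * n_proc
        for t, p in enumerate(assign):
            load[p] += exec_time[t]
        if any(l > c for l, c in zip(load, capacity)):
            continue  # infeasible: some processor is overloaded
        r = reliability(assign, exec_time, fail_rate)
        if best is None or r > best[1]:
            best = (assign, r)
    return best


# Three tasks, two processors; processor 0 is more reliable but smaller.
assignment, rel = best_allocation(
    exec_time=[2.0, 1.0, 3.0],
    fail_rate=[0.01, 0.05],
    capacity=[4.0, 10.0],
)
```

In this instance the search places tasks 1 and 2 on the more reliable processor, filling its capacity, and task 0 on the other. A suboptimal heuristic would trade that guarantee for speed on larger instances.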
ISBN: 0818608757 (print)
The proceedings contain 21 papers. The following topics are dealt with: recovery in distributed systems; managing replication and network partition; fault-tolerance techniques; fault-tolerant protocols; voting and fault diagnosis; experimental systems; and consistency maintenance.
ISBN: 0818608757 (print)
The design and implementation of an experimental fault-tolerant distributed database management system is described. The system provides a logically integrated view of data with distribution transparency and controlled data replication. A commitment protocol used to guarantee atomicity of update operations is discussed. Efficient algorithms used to recover a site from a failure and restore data consistency are described. Recovery can be interleaved with the processing of regular database transactions and does not seriously limit the availability of data. The proposed solutions to the problems of fault recovery are designed to take advantage of the properties of a high-bandwidth local area network.
ISBN: 0818608757 (print)
The authors propose two protocols for transaction processing in quasi-partitioned databases. The protocols are pessimistic in that they permit the execution of update transactions in exactly one partition. The first protocol is defined for a fully partition-replicated database in which every partition contains a copy of every data object. The second protocol is defined for a partially partition-replicated database in which some objects have no copies in some partitions. For both protocols, the major performance measures improve linearly with the backup link speed but are not visibly affected by either the duration of the partitioning or the database size. This is a desirable property, since the backup link speed is the only controllable parameter.
ISBN: 0818608757 (print)
Highly reliable and effective failure detection and isolation (FDI) software is crucial in modern avionics systems that tolerate hardware failures in real time. The FDI function is an excellent opportunity for applying the principle of software design diversity to the fullest, i.e., algorithm diversity, in order to provide gains in functional performance as well as potentially enhancing the reliability of the software. The authors examine algorithm diversity applied to the redundancy management software for a hardware fault-tolerant sensor array. Results of an experiment are presented that show the performance gains that can be provided by utilizing the consensus of three diverse algorithms for sensor FDI.
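A minimal sketch of consensus among three diverse detectors follows. The abstract does not name the three algorithms used in the experiment, so the detectors here (median deviation, leave-one-out mean deviation, and a fixed range check), the 2-of-3 quorum, and all function names are assumptions chosen only to show how voting can mask one detector's false alarms.

```python
from collections import Counter
from statistics import mean, median


def by_median(readings, tol):
    """Flag sensors that deviate from the median by more than tol."""
    m = median(readings)
    return {i for i, r in enumerate(readings) if abs(r - m) > tol}


def by_leave_one_out_mean(readings, tol):
    """Flag sensors that deviate from the mean of the other sensors.
    A single outlier skews that mean, so this detector is prone to
    false alarms on healthy sensors."""
    flagged = set()
    for i, r in enumerate(readings):
        others = readings[:i] + readings[i + 1:]
        if abs(r - mean(others)) > tol:
            flagged.add(i)
    return flagged


def by_range(readings, lo, hi):
    """Flag sensors whose reading is outside the plausible range."""
    return {i for i, r in enumerate(readings) if not lo <= r <= hi}


def consensus(verdicts, quorum=2):
    """Declare a sensor failed only if at least `quorum` detectors agree."""
    votes = Counter(i for verdict in verdicts for i in verdict)
    return {i for i, n in votes.items() if n >= quorum}


readings = [10.0, 10.2, 9.9, 14.0]  # sensor 3 has drifted
verdicts = [
    by_median(readings, tol=1.0),
    by_leave_one_out_mean(readings, tol=1.0),
    by_range(readings, lo=0.0, hi=12.0),
]
failed = consensus(verdicts)
```

With these readings the leave-one-out detector flags every sensor, yet the consensus isolates only sensor 3, because the other two detectors do not confirm the spurious alarms.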
ISBN: 0818608757 (print)
An approach to checkpointing and rollback recovery in a distributed computing system using a common time base is proposed. First, a common time base is established in the system using a hardware clock synchronization algorithm. This common time base is coupled with a pseudorecovery block approach to develop a checkpointing algorithm that has the following advantages: (i) maximum process autonomy, (ii) no wait for commitment for establishing recovery lines, (iii) fewer messages to be exchanged, and (iv) lower memory requirements.
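One way to see why a common time base removes the need for a commitment round is the simplified sketch below. It assumes every process checkpoints independently whenever its synchronized clock crosses a multiple of a fixed interval, so checkpoints with the same index form a candidate recovery line. The sketch ignores clock skew and in-transit messages and does not model pseudorecovery blocks; the names `checkpoint_index` and `recovery_line` are hypothetical.

```python
def checkpoint_index(clock_time, interval):
    """Index of the most recent checkpoint boundary at `clock_time`,
    assuming boundaries fall at 0, interval, 2*interval, ..."""
    return int(clock_time // interval)


def recovery_line(saved):
    """Latest checkpoint index that every process has stored.

    `saved` maps a process name to the set of checkpoint indices it
    holds. No messages are exchanged to agree on the line: the shared
    index follows from the shared clock.
    """
    common = set.intersection(*saved.values())
    return max(common) if common else None


saved = {
    "p1": {0, 1, 2, 3},
    "p2": {0, 1, 2},      # p2 failed before taking checkpoint 3
    "p3": {0, 1, 2, 3},
}
line = recovery_line(saved)
```

Here every process rolls back to checkpoint 2, the newest index present everywhere.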
ISBN: 0818608757 (print)
The authors present an election protocol that does not assume an underlying ring structure and that tolerates failures, including lost messages and network partitioning, during the execution of the protocol itself. The major problem to be solved is that when nodes cannot communicate with one another or messages are lost, a conflict in resolving the election will often arise. In the authors' approach, the conflict is detected by the cohorts (noncandidate participants in the election). Related election protocols are discussed, and the system model is described together with assumptions about the communication subsystem. The protocol and the lost-message situations are then examined.