This conference proceedings contains 19 papers. The following topics are dealt with: recovery; replication; network architecture; reliable communication; performance analysis; evaluation and modeling; and simulation and testing.
A major obstacle in implementing a rollback recovery scheme for fault tolerance in a concurrent distributed system is the domino effect. A low overhead checkpointing scheme is proposed to prevent this effect. Each process saves its state periodically. The state-save synchronization among processes is implemented by bounding clock drifts. A communication protocol that assures that all saved states are consistent is developed.
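One way to picture how saved states can be kept consistent is sketched below in Python. It is an illustration under stated assumptions, not the protocol from the paper: the `Process` class, the epoch tag carried by each message, and the `quiet_window` helper are all hypothetical. The sketch assumes every message carries the sender's checkpoint number, and that a receiver seeing a newer number saves its own state before delivering the message, so no saved state records a message that the sender's matching saved state has not yet sent.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical process that tags messages with its checkpoint epoch."""
    name: str
    epoch: int = 0                                     # checkpoints taken so far
    delivered: list = field(default_factory=list)      # messages delivered so far
    saved_states: dict = field(default_factory=dict)   # epoch -> saved state

    def checkpoint(self):
        """Save the current state (here, just the delivered-message log)."""
        self.epoch += 1
        self.saved_states[self.epoch] = list(self.delivered)

    def send(self, payload):
        """Attach the sender's current epoch to the outgoing message."""
        return (self.epoch, payload)

    def receive(self, message):
        """Catch up on checkpoints before delivering a later-epoch message."""
        sender_epoch, payload = message
        while self.epoch < sender_epoch:
            self.checkpoint()
        self.delivered.append(payload)


def quiet_window(max_skew, min_delay):
    """Time a sender could hold messages after a checkpoint so that none
    arrives before the receiver's matching checkpoint, assuming checkpoint
    times differ by at most max_skew and delivery takes at least min_delay."""
    return max(0, max_skew - min_delay)


p, q = Process("p"), Process("q")
p.checkpoint()                 # p's periodic timer fires first
q.receive(p.send("update"))    # q saves its state before delivering
```

The `quiet_window` helper shows where a bound on clock drift could enter such a scheme: the tighter the bound on the skew between checkpoint times, the shorter the interval during which sends would need to be held back.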
HGPSS, a simulation language and environment aimed specifically at distributed systems, is described. HGPSS is upwardly compatible with GPSS, adding a number of features for the modeling of distributed database systems. The incorporation of these primitives reduces the complexity of the task of the simulation programmer. In addition, HGPSS is a portable system, thus permitting the use of more-powerful processors. HGPSS presents a novel approach to simulation, namely, that of incorporating application-specific functionality into the basic tools. By enriching the simulation language with constructs designed explicitly for an application environment, the task of the modeler can be simplified substantially. Furthermore, for situations in which general algorithmic facilities are necessary, a direct C interface is provided. A software modeling environment for determining the performance of various distributed database systems is described, which provides the user with the tools needed to model and analyze such a system.
ISBN: 0818619465 (print)
This paper presents a software modeling environment for estimating the performance of distributed database systems. This tool supports a simulation language, HGPSS, which comprises various simulation primitives, contains a collection of network modules, and allows for the collection of statistics. The paper provides an overview of the HGPSS environment, emphasizing its applicability to the modeling of distributed databases.
Multicast communication in a distributed system connected by a local area network can increase parallelism, and it can also provide a greater functionality than one-to-one communication. In the authors' multicast protocol, the sender directs a message to a named group of receivers, which can be specified by function without requiring the sender to know the specific members of the group. Each host's kernel in the network can respond to every group message sent, providing various levels of reliability. It was found that the overhead of providing dependable multicast over a single local area network was very small, mainly because the protocol operates at the kernel level rather than the user level. Several forms of this multicast communication, expressed as simple message-passing communication primitives, are described, and the effectiveness of the protocol is evaluated using an example of a distributed algorithm. Performance analyses and actual performance data for the protocol are presented.
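The idea of addressing a named group rather than individual hosts can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the authors' kernel-level protocol: the `GroupMulticast` class, the `deliver` callback, and the `required_acks` parameter (standing in for a chosen level of reliability) are assumptions made for the example.

```python
class GroupMulticast:
    """Hypothetical in-memory stand-in for kernel-level group messaging."""

    def __init__(self):
        self._members = {}   # group name -> set of host names

    def join(self, group, host):
        """Add a host to a named group; senders never see this list."""
        self._members.setdefault(group, set()).add(host)

    def send(self, group, message, deliver, required_acks=0):
        """Deliver message to every member of group.

        deliver(host, message) returns True when the host acknowledges.
        Returns (acks, ok), where ok means that at least required_acks
        hosts acknowledged the message.
        """
        acks = 0
        for host in sorted(self._members.get(group, ())):
            if deliver(host, message):
                acks += 1
        return acks, acks >= required_acks


net = GroupMulticast()
for host in ("alpha", "beta", "gamma"):
    net.join("file-servers", host)

received = []


def deliver(host, message):
    if host == "beta":          # simulate one host that misses the message
        return False
    received.append((host, message))
    return True


acks, ok = net.send("file-servers", "flush", deliver, required_acks=2)
```

Here the sender names only the group `"file-servers"`; raising `required_acks` to 3 would make the same send report failure, which is one simple way to express a stricter reliability level.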
The use of local area network (LAN) technology for distributed process control is addressed. This paper highlights some current problems, and focuses on some unique new aspects of distributed control which are necessary for providing a properly integrated approach to distributed supervisory control systems. A subscription service satisfies the need to move data efficiently between supervisory computers, taking into account the possibilities of multiple sources (redundancy) and multiple users of the data, making use of the broadcast capabilities of Ethernet to minimise overhead. A hot-standby pair configuration allows two supervisory computers on the network to operate in a redundant manner, providing high reliability. A global facility secure service allows legislating or locking of access to a shared resource (such as part of a database) on another supervisory computer. This permits multiple operations (which are divisible in time) to be safely done. A virtual display system allows a comprehensive set of operator displays on one supervisory computer to be used by an operator display station on another computer. The concept of servicing the display over the network, rather than transferring the data, results in a reduced load, both for the network and for the computers.
ISBN: 0818608757 (print)
In real-time database systems, a transaction may not have enough time to complete. In such cases, partial, or imprecise, results can still be produced. The authors have proposed an imprecise result mechanism for producing partial results, which is used here to implement timing error recovery in real-time database systems. They also present a model of real-time systems that distinguishes the external data consistency from the internal data consistency maintained by non-real-time systems. Providing a timely response may require sacrificing internal consistency. The authors discuss three examples that have different requirements of data consistency and present algorithms for implementing them.
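The notion of an imprecise result can be illustrated with a small Python sketch. It is not the authors' mechanism: the function name `imprecise_mean` and the `budget` parameter, a count of records standing in for the time left before a deadline, are assumptions chosen to keep the example deterministic.

```python
def imprecise_mean(values, budget):
    """Average as many values as the budget allows.

    Returns (estimate, fraction), where fraction is the share of the
    input that was read; fraction == 1.0 means the result is precise.
    estimate is None if nothing could be read.
    """
    total = 0.0
    count = 0
    for value in values:
        if count >= budget:      # stand-in for "the deadline has arrived"
            break
        total += value
        count += 1
    if count == 0:
        return None, 0.0
    return total / count, count / len(values)


partial = imprecise_mean([10, 20, 30, 40], budget=2)
precise = imprecise_mean([10, 20, 30, 40], budget=10)
```

When the budget runs out, the caller still receives a usable estimate together with an indication of how much of the data it reflects, rather than no answer at all.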
ISBN: 0818608757 (print)
The authors deal with the task allocation problem in distributed software design, with the goal of maximizing the system reliability. A quantitative problem model, algorithms for optimal and suboptimal solutions, and simulation results are provided and discussed. Because the authors use a new allocation goal (to maximize system reliability), this study complements the existing body of knowledge in task allocation.
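A reliability-driven allocation can be sketched as an exhaustive search in Python. The model below is an assumption made for illustration and is not taken from the paper: processors and links fail at constant rates, so reliability is `exp(-exposure)`, where exposure sums failure rate times execution time per task plus a link term for each pair of tasks placed on different processors. All names and numbers are hypothetical.

```python
import itertools
import math


def reliability(assignment, exec_time, proc_rate, comm_time, link_rate):
    """Reliability of one allocation under the assumed exponential model."""
    exposure = 0.0
    for task, proc in enumerate(assignment):
        exposure += proc_rate[proc] * exec_time[task][proc]
    for i, j in itertools.combinations(range(len(assignment)), 2):
        if assignment[i] != assignment[j]:       # tasks talk over a link
            exposure += link_rate * comm_time[i][j]
    return math.exp(-exposure)


def best_allocation(exec_time, proc_rate, comm_time, link_rate):
    """Try every task-to-processor mapping and keep the most reliable one."""
    n_tasks, n_procs = len(exec_time), len(proc_rate)
    candidates = itertools.product(range(n_procs), repeat=n_tasks)
    return max(
        candidates,
        key=lambda a: reliability(a, exec_time, proc_rate, comm_time, link_rate),
    )


exec_time = [[2, 4], [3, 1]]     # exec_time[task][processor]
proc_rate = [0.01, 0.02]         # failure rate of each processor
comm_time = [[0, 10], [10, 0]]   # communication time between task pairs

with_links = best_allocation(exec_time, proc_rate, comm_time, link_rate=0.01)
no_links = best_allocation(exec_time, proc_rate, comm_time, link_rate=0.0)
```

Exhaustive search grows exponentially with the number of tasks, so it is only practical for small instances; the abstract's mention of suboptimal algorithms suggests that larger problems call for heuristics.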
ISBN: 0818608757 (print)
The proceedings contains 21 papers. The following topics are dealt with: recovery in distributed systems; managing replication and network partition; fault-tolerance techniques; fault-tolerant protocols; voting and fault diagnosis; experimental systems; and consistency maintenance.
ISBN: 0818608757 (print)
The design and implementation of an experimental fault-tolerant distributed database management system is described. The system provides a logically integrated view of data with distribution transparency and controlled data replication. A commitment protocol used to guarantee atomicity of update operations is discussed. Efficient algorithms used to recover a site from a failure and restore data consistency are described. Recovery can be interleaved with the processing of regular database transactions and does not seriously limit the availability of data. The proposed solutions to the problems of fault recovery are designed to take advantage of the properties of a high-bandwidth local area network.
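The abstract does not say which commitment protocol the system uses. As a point of reference, the sketch below shows a textbook two-phase commit in Python, which is an assumed stand-in rather than the paper's protocol; the `Site` class and `atomic_update` function are hypothetical, and logging, timeouts, and site recovery are deliberately left out.

```python
class Site:
    """Hypothetical database site holding one replica of the data."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit   # False simulates a site voting "no"
        self.data = {}
        self._pending = None

    def prepare(self, update):
        """Phase 1: stage the update and vote."""
        if not self.can_commit:
            return False
        self._pending = dict(update)
        return True

    def commit(self):
        """Phase 2 (all voted yes): make the staged update visible."""
        self.data.update(self._pending)
        self._pending = None

    def abort(self):
        """Phase 2 (some site voted no): discard the staged update."""
        self._pending = None


def atomic_update(sites, update):
    """Apply update at every site or at none of them."""
    votes = [site.prepare(update) for site in sites]
    if all(votes):
        for site in sites:
            site.commit()
        return True
    for site in sites:
        site.abort()
    return False


a, b, c = Site("a"), Site("b"), Site("c", can_commit=False)
first = atomic_update([a, b], {"x": 1})        # both sites vote yes
second = atomic_update([a, b, c], {"x": 2})    # site c votes no
```

After the second call, sites `a` and `b` still hold the value from the first update, which is the all-or-nothing behavior that atomicity requires.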