ISBN: 0818607378 (print)
It is suggested that it is helpful to study reliable distributed systems from the point of view of nondeterministic sequential systems. The faulty and distributed nature of systems can be captured by nondeterminism, so there is a unity to the study of faulty and fault-free systems: faulty systems are at one end of the spectrum and fault-free systems are at the other. Similarly, there is a unity to the study of distributed and sequential systems. It is suggested that all systems, whether faulty or fault-free, whether distributed or sequential, can be handled in a unified way by treating the system as a nondeterministic sequential program.
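The idea of capturing faults by nondeterminism can be illustrated with a small sketch. It is not taken from the paper: the lossy channel, the function names `run` and `behaviors`, and the `may_fail` flag are all assumptions made for illustration. Each message send is one nondeterministic choice (deliver or drop), and the fault-free system is simply the case where only "deliver" is allowed.

```python
from itertools import product

def run(messages, choices):
    """Sequential program with one nondeterministic choice per send (hypothetical model)."""
    delivered = []
    for msg, ok in zip(messages, choices):
        if ok:  # the channel "chose" to deliver this message
            delivered.append(msg)
    return tuple(delivered)

def behaviors(messages, may_fail):
    """Set of outcomes over every possible resolution of the nondeterminism."""
    options = (True, False) if may_fail else (True,)
    return {run(messages, c) for c in product(options, repeat=len(messages))}

fault_free = behaviors(["a", "b"], may_fail=False)
faulty = behaviors(["a", "b"], may_fail=True)
```

Under this model the fault-free behaviors form a subset of the faulty ones, which is one way to read the "spectrum" described above.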
Recovery provisions in a distributed system are considered and issues of the reliability of software design are examined. As there is no generally valid system for recovery provision design, the provisions are reviewed. The costs of the testing, documentation, operator training, and interface administration required for proper operation of the provisions are also considered.
ISBN: 0818605642 (print)
A technique is described for implementing k-resilient objects, i.e., distributed objects that remain available, and whose operations are guaranteed to progress to completion, despite up to k site failures. The implementation is derived from the object specification automatically, and does not require any information beyond what would be required for a nonresilient, nondistributed implementation. It is therefore unnecessary for an application programmer to have knowledge of the complex protocols normally used to implement fault-tolerant objects. The technique is used in ISIS, a system being developed at Cornell to support resilient objects.
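The k-resilience property itself can be sketched with a toy example. This is not the ISIS protocol or the automatic derivation described above; it only assumes that the object's state is kept at k + 1 sites, so that any k failures leave at least one live copy. The class `ResilientCounter` and its methods are hypothetical, and recovery, concurrency, and message loss are ignored.

```python
class ResilientCounter:
    """Toy k-resilient counter: state replicated at k + 1 sites (assumed scheme)."""

    def __init__(self, k):
        self.k = k
        self.replicas = {site: 0 for site in range(k + 1)}  # site -> state
        self.failed = set()

    def fail(self, site):
        """Mark a site as crashed; its copy is no longer used."""
        self.failed.add(site)

    def _live(self):
        live = [s for s in self.replicas if s not in self.failed]
        if not live:
            raise RuntimeError("more than k sites failed")
        return live

    def increment(self):
        """Apply the operation at every live replica."""
        for s in self._live():
            self.replicas[s] += 1

    def read(self):
        """Read from any live replica; all live copies hold the same value."""
        return self.replicas[self._live()[0]]

counter = ResilientCounter(k=2)
counter.increment()
counter.fail(0)
counter.increment()
counter.fail(1)
value_after_two_failures = counter.read()
```

With k = 2 the counter stays readable and up to date after two site failures; a third failure exceeds the bound and the sketch raises an error.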
The design of a distributed processing system must include methods to handle distributed data retrieval. A considerable amount of research has been devoted to the development of algorithms that provide this function. A survey of this research is presented and a taxonomy is introduced that highlights the significant differences among the algorithms.
Many of the special problems in distributed computing relate to the handling of exceptional conditions. In a distributed program, exceptions occur as a result of transmission errors and partial failures. Any exceptional condition that arises must be handled if distributed programs are to be robust. Various approaches to providing exception handling mechanisms for distributed applications are examined, as incorporated into several experimental distributed operating systems. These operating systems all support the notion that the primary software structuring tool for applications will be a collection of cooperating programs (processes) mapped onto a set of loosely coupled processors.
Author: Kim, K.H. (Univ of South Florida, Dep of Computer Science & Engineering, Tampa, FL, USA)
One of the frequently advocated advantages of distributed computing systems over centralized computing systems is the improved system reliability potential. Although the application of distributed computing is currently expanding at a rapid rate, the realization of its full reliability potential still requires fresh solutions and a better understanding of many design problems. The nature of some of those design issues is briefly discussed. To help prevent misinterpretations while keeping the presentation of research issues abstract, a model of recoverable distributed computing system structure is presented. Discussed are: error detection; hardware and software reconfiguration; the degree of coordination among distributed processes for error detection and recovery; real-time recovery; and software engineering tools.
ISBN: 0818607378 (print)
The authors examine the various kinds of distributed systems and discuss some of the reliability issues involved. They first concentrate on the causes of unreliability, illustrating these with some general solutions and examples. Among the issues treated are interprocess communication, machine crashes, server redundancy, and data integrity. Then they examine one distributed operating system, Amoeba, to see how reliability issues have been handled in at least one real system, and how the pieces fit together.
Author: Minoura, Toshimi (Oregon State Univ, Dep of Computer Science, Corvallis, OR, USA)
ISBN: 0818605642 (print)
A typical database system maintains target data, which contain information useful for users, and access path data, which facilitate faster access to target data. Further, most large database systems support concurrent processing of multiple transactions. For a static database system model, where units of concurrency control are not dynamically created or deleted, various concurrency control methods are known. Also, many methods that allow concurrent accesses to indexing structures without invalidating their integrity are known. However, a straightforward integration of these two kinds of concurrency control methods fails because of the phantom problem. The author introduces group locks in order to solve this problem and discusses their implementation. It is shown that if the lowest-level access path data as well as the target data are two-phase locked by transactions, consistency of the logical data will be preserved.
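The phantom problem and the role of a lock that covers a group of records can be illustrated with a minimal sketch. The abstract does not say how group locks are defined or implemented, so everything here is an assumption: a group lock is modeled as an exclusive lock on a key range, and `LockTable`, `lock_group`, and `lock_record` are hypothetical names.

```python
class LockTable:
    """Toy lock table with record locks and range-style group locks (assumed model)."""

    def __init__(self):
        self.record_locks = {}   # key -> owning transaction
        self.group_locks = []    # (low, high, owning transaction)

    def lock_group(self, txn, low, high):
        """Lock every key in [low, high], including keys that do not exist yet."""
        for key, owner in self.record_locks.items():
            if low <= key <= high and owner != txn:
                return False
        for lo, hi, owner in self.group_locks:
            if owner != txn and lo <= high and low <= hi:  # ranges overlap
                return False
        self.group_locks.append((low, high, txn))
        return True

    def lock_record(self, txn, key):
        """Lock one record; refused if another transaction's group lock covers it."""
        for lo, hi, owner in self.group_locks:
            if lo <= key <= hi and owner != txn:
                return False
        if self.record_locks.get(key, txn) != txn:
            return False
        self.record_locks[key] = txn
        return True

table = LockTable()
scan_locked = table.lock_group("T1", 10, 20)      # T1 scans keys 10..20
phantom_blocked = not table.lock_record("T2", 15)  # T2 cannot insert key 15
outside_allowed = table.lock_record("T2", 25)      # key 25 is outside the group
```

Locking only the records that exist at scan time would let T2 insert key 15 and change the result of a repeated scan by T1; the range lock refuses that insert until T1 releases it. Lock release is omitted from the sketch.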
In fault-tolerant distributed processing systems, some failures may still have to be considered because of late failure detection and/or transmission delays. The failures may be caused by either hardware or software. A method is introduced that buffers information before it is used and determines where the information may be used. The operation of a telephone system is used to illustrate the application of this method to duplicate data recovery.
ISBN: 0818607378 (print)
A method for testing, debugging, and measuring distributed systems is described. The test method accompanies the implementation as well as the operation of distributed systems. During implementation, the test tools allow users to monitor and control the tested system at different problem-oriented levels. The immense amount of information is graphically displayed in easy-to-read charts and graphs. During operation, the test system permanently monitors system behavior and measures system performance. The author views performance and analysis measurements during operation as an integral part of the system. The test method and tools promote an improved understanding of run-time behavior and possibly of functional requirements of distributed systems. They provide performance measurements from which qualitative and even quantitative assessments of distributed systems can be derived.