In fault-tolerant distributed processing systems, some failures must still be taken into account because of late failure detection and/or transmission delays. The failures may be caused by hardware or by software. A method is introduced that buffers information before using it and determines where the information may be used. The operation of a telephone system is used to illustrate this method as applied to duplicate data recovery.
Authors:
Kim, K.H.
Univ of South Florida, Dep of Computer Science & Engineering, Tampa, FL, USA
One of the frequently advocated advantages of distributed computing systems over centralized computing systems is their potential for improved system reliability. Although the application of distributed computing is currently expanding at a rapid rate, realizing its full reliability potential still requires fresh solutions and a better understanding of many design problems. The nature of some of those design issues is briefly discussed. To help prevent misinterpretation while keeping the presentation of research issues abstract, a model of a recoverable distributed computing system structure is presented. The topics discussed are error detection, hardware and software reconfiguration, the degree of coordination among distributed processes for error detection and recovery, real-time recovery, and software engineering tools.
The initial design of three modules of DDTS (Distributed Database Testbed System) is presented. The DDTS emphasizes modularity and independence of modules so that it may be used to experimentally study the effects of different algorithms at each module. The DDTS architecture and transactions are considered, with special attention to information architecture (IA) and system architecture (SA).
Authors:
Seifert, Manfred H.
IBM Germany, Heidelberg Scientific Cent, Heidelberg, West Ger
A characteristic feature of the dynamic structure of distributed software is that management functions as well as application functions are carried out by parallel, interacting processes. A set of such interacting processes that belong together is called a distributed process system. The application and system programs are defined, and the structural-redundant process is explained, including functional redundancy. User and manager process systems are also considered in the architecture of fault-tolerant software.
Authors:
Segall, Zary
Carnegie-Mellon Univ, Computer Science Dep, Pittsburgh, PA, USA
The ultimate test of the efficiency of mechanisms and policies employed to achieve increased performance and/or reliability in a distributed system is provided by the evaluation of measurements taken from the real system. Experimentation with multiprocessors is considered. The concept of an Integrated Instrumentation Environment (IIE) is introduced as a structured approach to facilitate the process of experimentation. The design presented emphasizes the integration of instrumentation tools, such as stimulus generation and monitoring, into a unified experiment management environment. An experiment schema is introduced as an appropriate structuring concept for experiment management purposes. Schema instances are introduced to capture the results of an experiment for later analysis.
The concept of atomic transactions has been used to provide reliable processing in both centralized and distributed systems. An extension of traditional atomic transactions is presented: nested transactions. Nested transactions are seen to permit safe concurrency within as well as among transactions, and to enable transactions to fail partially in a graceful and controlled manner. These properties of nested transactions suit them to a number of distributed applications. Examples of such applications are described.
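The partial-failure property described above can be illustrated with a minimal sketch. It is not taken from the abstract: the `Transaction` class, its method names, and the dictionary-backed `store` are hypothetical, and the sketch ignores the concurrency control (locking) that a real nested-transaction system needs. It only shows that a child's commit is relative to its parent and that a child's abort leaves the parent intact.

```python
class Transaction:
    """Hypothetical transaction that buffers writes until it commits."""

    def __init__(self, store, parent=None):
        self.store = store      # dict holding committed top-level state
        self.parent = parent    # enclosing transaction, or None at top level
        self.writes = {}        # tentative updates, private until commit

    def begin_nested(self):
        # Start a child transaction inside this one.
        return Transaction(self.store, parent=self)

    def read(self, key):
        # Search this transaction, then its ancestors, then the store.
        txn = self
        while txn is not None:
            if key in txn.writes:
                return txn.writes[key]
            txn = txn.parent
        return self.store.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # A nested commit hands its updates to the parent;
        # only a top-level commit reaches the store.
        target = self.parent.writes if self.parent else self.store
        target.update(self.writes)

    def abort(self):
        # Discard only this transaction's updates; the parent is unaffected.
        self.writes.clear()


store = {"balance": 100}
top = Transaction(store)

debit = top.begin_nested()
debit.write("balance", debit.read("balance") - 30)
debit.commit()                 # visible to `top`, not yet to `store`

audit = top.begin_nested()
audit.write("audit_log", "written")
audit.abort()                  # partial failure: only this child is undone

top.commit()
print(store)
```

After the top-level commit the store holds the debit but no trace of the aborted child, which is the graceful partial failure the abstract refers to.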
Task and file allocation are examined in two classes of fault-tolerant distributed systems. The task allocation problem arises in software-implemented fault tolerance (SIFT)-like systems, while the file allocation problem arises in Ethernet-like systems. Both problems may be formulated as a constrained sum-of-squares minimization problem. The computational complexity of these problems prompts us to consider an efficient approximation algorithm that does not always yield optimal answers. It is shown that the ratio of the approximate to the optimal solution is bounded by 9m/(8(m - r + 1)), where m is the number of processors (file servers) to be allocated and r is the number of times each task (file) is to be replicated. Experience with the algorithm suggests that even better performance ratios can be expected.
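To make the formulation concrete, here is a small sketch of a sum-of-squares allocation with replication. The abstract does not say which approximation algorithm the authors use, so the greedy rule below (largest task first, replicas placed on the r least-loaded distinct processors) is an assumption chosen for illustration, and the stated bound is not claimed to hold for it. The names `allocate`, `cost`, and `ratio_bound` are hypothetical.

```python
def allocate(sizes, m, r):
    """Greedy sketch: place r replicas of each task on distinct processors."""
    if not 1 <= r <= m:
        raise ValueError("need 1 <= r <= m so replicas land on distinct processors")
    loads = [0] * m
    placement = {}
    # Consider tasks in order of decreasing size.
    for task, size in sorted(enumerate(sizes), key=lambda p: -p[1]):
        # Pick the r currently least-loaded processors (all distinct).
        chosen = sorted(range(m), key=lambda p: loads[p])[:r]
        for p in chosen:
            loads[p] += size
        placement[task] = chosen
    return loads, placement


def cost(loads):
    # Objective from the abstract: the sum of squared processor loads.
    return sum(x * x for x in loads)


def ratio_bound(m, r):
    # Bound quoted in the abstract for the authors' algorithm: 9m / (8(m - r + 1)).
    return 9 * m / (8 * (m - r + 1))


loads, placement = allocate([4, 3, 3, 2], m=3, r=2)
print(loads, cost(loads), ratio_bound(3, 2))
```

With m = 3 and r = 2 the quoted bound evaluates to 27/16 = 1.6875, which shows how the guarantee loosens as the replication factor r approaches m.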
An implementation of a reliable remote procedure call (RPC) mechanism for obtaining remote services is described. The reliability issues are discussed together with how they have been dealt with. The performance of the remote call mechanism is compared with that of local calls. The remote call mechanism is shown to be an efficient tool for distributed programming.
The quantification of Distributed Data Base (DDB) reliability is needed for the DDB design phase and for the comparative evaluation of the effectiveness of various data distribution strategies. A method based on a Markov model is outlined to evaluate the transition probabilities from a functioning system to a faulty one and vice versa. The method consists of two steps: the first is the computation of the transition rate matrix of the DDB with Kronecker algebra, which is then used to calculate the probabilities of the different possible states of the DDB; the second is an algorithm to calculate the structure vector related to a given transaction of the DDB.
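As a rough illustration of the first step, the sketch below builds a joint transition rate matrix from per-site matrices using a Kronecker sum. The abstract only says that Kronecker algebra is used; the two-state (up/down) site model, the assumption that sites fail and are repaired independently, the numeric rates, and the function names are all assumptions made for this example.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kron_sum(a, b):
    """Kronecker sum: (a kron I) + (I kron b), the joint generator of two independent chains."""
    left = kron(a, identity(len(b)))
    right = kron(identity(len(a)), b)
    return [[l + r for l, r in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]


def site_generator(failure_rate, repair_rate):
    # State 0 = site up, state 1 = site down.
    return [[-failure_rate, failure_rate],
            [repair_rate, -repair_rate]]


# Two sites with illustrative (made-up) failure and repair rates.
q1 = site_generator(0.01, 0.5)
q2 = site_generator(0.02, 0.25)
q = kron_sum(q1, q2)  # joint states: (up,up), (up,down), (down,up), (down,down)

for row in q:
    print(row)
```

Each row of the resulting 4x4 matrix sums to zero, as a transition rate matrix must, and no entry links (up,up) directly to (down,down) because under the independence assumption only one site changes state at a time.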
Authors:
Lin, James J.; Liu, Ming T.
Ohio State Univ, Dep of Computer & Information Science, Columbus, OH, USA
The system design and performance evaluation of a local data network for very large distributed databases are presented. The growing database problem creates a need for hardware support for data management in distributed systems. A novel hardware configuration, the Distributed Double-Loop Data Network (DDLDN), is presented as an example. Concurrency control mechanisms and query processing techniques used in the DDLDN are described. An optimal strategy for disk allocation is selected. A performance comparison is made for different types of systems under various conditions, showing superior performance of the DDLDN. Finally, a way to cope with potential growth of the system is demonstrated.