ISBN: 0818606908 (print)
A controversial point in designing a distributed system is whether the user or the system should be responsible for taking actions as a consequence of system failures. The author proposes a dynamic configuration scheme for runtime reconfiguration of application software that is more flexible than existing proposals. A description is given of a reconfiguration scheme implemented by the operating system which only requires that software components be virtually connected. The authors demonstrate that the scheme increases the reliability and availability of distributed systems, and compare and contrast this scheme with other similar proposals.
ISBN: 9780769549286; 9781467350488 (print)
This research focuses on testing enterprise systems, more concretely on how to automatically generate the initial test data to be entered into the relational database to support each test case. Existing approaches cannot generate initial database entries to suit complicated business logic states such as reading the database more than once, searching the database by partial string matching, or setting primary and foreign key constraints on the database schema. To address these limitations, we propose a method for initial database generation. This method adopts a design model that can handle the complicated business logic states given above, and from this design model, our method generates appropriate initial database entries; it employs a step-by-step approach using the constraints extracted from the design model. The proposed method enables us to automatically generate initial database entries for a wide range of test cases and thus supports the testing of industrial-level enterprise systems. Using three industrial-level enterprise systems as case studies, we confirm that our method properly generated initial databases for 72% to 100% of the test cases in which an initial database was needed.
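The step-by-step idea can be illustrated with a minimal sketch. It is not the authors' method: the `CONSTRAINTS` dictionary, the `customer` and `orders` tables, and both helper functions are hypothetical stand-ins for what a design model might yield. The sketch builds parent rows first so that a partial-string search matches, then builds child rows that satisfy the foreign key, and loads everything into an in-memory SQLite database.

```python
import sqlite3

# Hypothetical constraints extracted from a design model for one test case:
# the test expects a customer whose name contains "smith" and who has
# at least one order. All names here are illustrative assumptions.
CONSTRAINTS = {
    "customer": {"name_contains": "smith"},
    "orders": {"min_rows_per_customer": 1},
}


def generate_rows(constraints):
    """Build rows step by step: parent rows first, then children that reference them."""
    fragment = constraints["customer"]["name_contains"]
    # Wrapping the fragment guarantees that LIKE '%<fragment>%' matches.
    customers = [(1, f"x{fragment}y")]
    count = constraints["orders"]["min_rows_per_customer"]
    # Each order row points at the existing customer id (foreign key).
    orders = [(i + 1, customers[0][0]) for i in range(count)]
    return customers, orders


def load(customers, orders):
    """Create the schema with key constraints and insert the generated rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customer(id))"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    conn.commit()
    return conn
```

Inserting parents before children keeps every intermediate state valid under the foreign key constraint, which is one simple way to read "step-by-step" here.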
ISBN: 0818607378 (print)
Software-implemented fault masking in distributed systems requires the generation of at least three copies of all processes and the insertion of majority voters at each interprocess communication between process triples. If the semantics of the receive operator used by the receiver triple indicates waiting for the receipt of messages coming from different sender triples in a nondeterministic order, different sequences of message processing by the processes of the receiver triple have to be avoided by execution of an agreement protocol. A protocol is presented to solve the problem of both fault masking and sequence agreement simultaneously, in order to reduce communication overhead with respect to message number as well as message length. The concept of fault masking is a slight modification of an m-protocol that supports sequence agreement by the generation of at least two sender-specific encoded signatures and by execution of acknowledgement and sequence agreement jointly using the same messages. A general classification of voting problems shows that sequence agreement does not require usual protocols for interactive consistency, even in the case of Byzantine faults. This permits a simple fault-detecting centralized solution for sequence agreement.
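A rough sketch of what a fault-detecting centralized approach to sequence agreement could look like follows. It is an assumption-laden illustration, not the protocol from the paper: there are no signatures or acknowledgements, and the function name `agree_on_sequence` and the message identifiers are hypothetical. One replica of the receiver triple proposes its local arrival order, and every replica checks that the proposal contains exactly the messages it received itself.

```python
def agree_on_sequence(arrivals, coordinator=0):
    """Return a common processing order for all receiver replicas.

    arrivals: one list of message ids per replica, in local arrival order.
    coordinator: index of the replica whose order is proposed (an assumption
    of this sketch; the source does not say how the center is chosen).
    Raises ValueError if any replica detects that the proposal does not
    contain exactly the messages that replica received.
    """
    proposal = list(arrivals[coordinator])
    for index, local in enumerate(arrivals):
        # Detection only: a mismatch is reported, not masked.
        if sorted(local) != sorted(proposal):
            raise ValueError(f"replica {index} detected a faulty proposal")
    return proposal
```

For example, `agree_on_sequence([["a1", "b1"], ["b1", "a1"], ["a1", "b1"]])` gives every replica the order proposed by replica 0, even though replica 1 saw the messages in the opposite order. The sketch detects a bad proposal but does not recover from it.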
ISBN: 9781728198705 (print)
Anomaly detection in distributed systems has been a fertile research area, and a range of anomaly detectors have been proposed for distributed systems. Unfortunately, there is no systematic quantitative study of the efficacy of different anomaly detectors, which is of great importance to reveal the deficiencies of existing anomaly detectors and shed light on future research directions. In this paper, we investigate how various anomaly detectors behave on anomalies of different types, and why, by extensively injecting software faults into three widely used distributed systems. We use a statement-level fault injection method to observe the anomalies, characterize these anomalies, and analyze the detection results from anomaly detectors of three categories. We find that: (1) the distributed systems' own error reporting mechanisms are able to report most of the anomalies (from 82.1% to 92.8%), but they incur a high false alarm rate of 26.6%; (2) state-of-the-art anomaly detectors are able to detect the existence of anomalies with 99.08% precision and 90.60% recall, but there is still a long way to go to pinpoint the accurate location of the detected anomalies; and (3) log-based anomaly detection techniques outperform other anomaly detection techniques, but not for all anomaly types.
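For readers less familiar with the metrics quoted above, the following sketch computes precision and recall using their standard definitions. The counts in the usage note are made-up numbers for illustration and are not taken from the study.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    reported = true_pos + false_pos   # everything the detector flagged
    actual = true_pos + false_neg     # every real anomaly
    precision = true_pos / reported if reported else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall
```

With hypothetical counts of 90 true positives, 10 false positives, and 30 missed anomalies, `precision_recall(90, 10, 30)` returns `(0.9, 0.75)`: nine in ten alarms are real, but a quarter of the anomalies go undetected.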
ISBN: 0780366158 (print)
This paper presents a systematic problem-solving approach to system software reliability, based on Failure Modes and Effects Analysis (FMEA). In practice, this approach will: (a) ensure that all conceivable failure modes and their effects on the operational success of the software system have been considered; (b) list potential failures and identify the magnitude of their effects; (c) develop criteria for test planning, design of the tests, and checkout systems (e.g., logging mechanism); (d) provide a basis for quantitative reliability and availability analysis; and (e) provide a basis for establishing corrective action priorities. This approach was created for software reliability analysis and testing in the Multimedia Digital Distribution System (MDDS) at Thomson-CSF Sextant In-Flight Systems. It was first used to improve the software reliability of the Communication Control Unit (CCU) subsystem of the MDDS, and then applied globally to software reliability analysis and improvement for the whole MDDS. It has proven to be an effective and efficient approach to system software reliability.
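One common way to turn an FMEA worksheet into corrective action priorities is the conventional risk priority number (RPN), the product of severity, occurrence, and detection scores. The abstract does not say whether the authors used RPN, so the sketch below is an assumption; the `FailureMode` class and the example failure modes in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One worksheet row; each score is typically on a 1 to 10 scale."""
    name: str
    severity: int    # how bad the effect is
    occurrence: int  # how likely the failure is
    detection: int   # how hard it is to detect (higher = harder)

    @property
    def rpn(self):
        # Conventional risk priority number: S x O x D.
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Order failure modes from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)
```

For instance, a hypothetical "message queue overflow" scored (8, 3, 5) has an RPN of 120 and would be addressed before a "stale log handle" scored (4, 4, 2) with an RPN of 32.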
ISBN: 0818606908 (print)
N-modular redundancy (NMR) protects against arbitrary types of hardware or software failures in a minority of system components, thereby yielding the highest degree of reliability. A study is made of the application of NMR, specifically triple modular redundancy (TMR), to general-purpose database processing. The authors discuss the structure and implementation tradeoffs of a TMR system that is 'synchronized' at the transaction level, i.e., in which complete transactions are distributed to all nodes, where they are processed independently, and only the majority output is accepted. The inherent cost of such a TMR database system is examined on the basis of preliminary performance results from a version implemented on three SUN-2/120 workstations.
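The transaction-level voting described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: nodes are modeled as plain Python callables, outputs must be hashable, and the names `majority_output` and `run_tmr` are hypothetical rather than taken from the system studied.

```python
from collections import Counter


def majority_output(outputs):
    """Return the value produced by a strict majority of nodes, else raise."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 > len(outputs):
        return value
    raise RuntimeError("no majority among node outputs")


def run_tmr(transaction, nodes):
    """Send the complete transaction to every node and vote on the results."""
    # Each node processes the transaction independently of the others.
    outputs = [node(transaction) for node in nodes]
    return majority_output(outputs)
```

With three nodes, one arbitrary wrong answer is outvoted by the two correct ones; if all three disagree, the sketch raises instead of guessing.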
ISBN: 0780377176 (print)
Software FMEA is a means to determine whether any single failure in computer software can cause catastrophic system effects, and additionally identifies other possible consequences of unexpected software behavior. The procedure described here was developed and used to analyze mission- and safety-critical software systems. The procedure includes using a structured approach to understanding the subject software, developing rules and tools for doing the analysis as a group effort with minimal data entry and human error, and generating a final report. Software FMEA is a kind of implementation analysis that is an intrinsically tedious process, but database tools make the process reasonably painless, highly accurate, and very thorough. The main focus here is on development and use of these database tools.
ISBN: 0818608757 (print)
The authors deal with the task allocation problem in distributed software design, with the goal of maximizing the system reliability. A quantitative problem model, algorithms for optimal and suboptimal solutions, and simulation results are provided and discussed. Because the authors use a new allocation goal, maximizing system reliability, this study complements the existing body of knowledge in task allocation.
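To make the optimization goal concrete, here is a small exhaustive-search sketch. The reliability model is an assumption, not the authors' model: each processor is taken to fail at a constant rate while it executes tasks, so an allocation's reliability is exp(-sum(rate x load)), and each processor has a load capacity. Communication links are ignored, and all function and parameter names are hypothetical.

```python
import itertools
import math


def processor_loads(assignment, exec_time, num_procs):
    """Total execution time placed on each processor."""
    loads = [0.0] * num_procs
    for task, proc in enumerate(assignment):
        loads[proc] += exec_time[task]
    return loads


def reliability(assignment, exec_time, failure_rate):
    """Assumed model: probability that no processor fails while busy."""
    loads = processor_loads(assignment, exec_time, len(failure_rate))
    return math.exp(-sum(rate * load for rate, load in zip(failure_rate, loads)))


def best_allocation(exec_time, failure_rate, capacity):
    """Try every task-to-processor mapping and keep the most reliable feasible one."""
    num_procs = len(failure_rate)
    feasible = [
        assignment
        for assignment in itertools.product(range(num_procs), repeat=len(exec_time))
        if all(
            load <= cap
            for load, cap in zip(
                processor_loads(assignment, exec_time, num_procs), capacity
            )
        )
    ]
    if not feasible:
        raise ValueError("no allocation satisfies the capacity limits")
    return max(feasible, key=lambda a: reliability(a, exec_time, failure_rate))
```

Exhaustive search is exponential in the number of tasks, so it only serves as an optimal baseline for tiny instances; larger instances are where suboptimal heuristics become relevant.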
ISBN: 0818608757 (print)
The proceedings contain 21 papers. The following topics are dealt with: recovery in distributed systems; managing replication and network partition; fault-tolerance techniques; fault-tolerant protocols; voting and fault diagnosis; experimental systems; and consistency maintenance.
ISBN: 0818606908 (print)
Fault-tolerant distributed algorithms that are designed to reach agreement have been the subject of a great deal of recent study, primarily focussed on the Byzantine agreement paradigm. The author explores new paradigms and problems that arise in the context of maintaining agreement, rather than reaching agreement in an isolated instance. The emphasis is on open problem areas rather than on specific solutions.