ISBN: 0818619465 (print)
This paper presents a software modeling environment for estimating the performance of distributed database systems. The tool supports a simulation language, HGPSS, which comprises various simulation primitives, contains a collection of network modules, and allows for the collection of statistics. The paper provides an overview of the HGPSS environment, emphasizing its applicability to the modeling of distributed databases.
ISBN: 0818608757 (print)
The authors present an election protocol that does not assume an underlying ring structure and that tolerates failures, including lost messages and network partitioning, during the execution of the protocol itself. The major problem to be solved is that when nodes cannot communicate with one another or messages are lost, a conflict in resolving the election will often arise. In the authors' approach, the conflict is detected by the cohorts (noncandidate participants in the election). Related election protocols are discussed, and the system model is described together with assumptions about the communication subsystem. The protocol and the lost-message situations are then examined.
ISBN: 0818622601 (print)
The symposium materials contain 21 papers. The following topics are dealt with: checkpointing and logging algorithms; backward recovery schemes; replication and parallelism; dependability modeling and assessment; agreement; and garbage collection.
Multicast communication in a distributed system connected by a local area network can increase parallelism, and it can also provide a greater functionality than one-to-one communication. In the authors' multicast protocol, the sender directs a message to a named group of receivers, which can be specified by function without requiring the sender to know the specific members of the group. Each host's kernel in the network can respond to every group message sent, providing various levels of reliability. It was found that the overhead of providing dependable multicast over a single local area network was very small, mainly because the protocol operates at the kernel level rather than the user level. Several forms of this multicast communication, expressed as simple message-passing communication primitives, are described, and the effectiveness of the protocol is evaluated using an example of a distributed algorithm. Performance analyses and actual performance data for the protocol are presented.
ISBN: 0818608730 (print)
The authors examine optimal task allocation for redundant, heterogeneous distributed computer systems. It is assumed that the systems under consideration are required for execution of long-term mission applications such as space flights. A formal description of the problem is given, and formal, quantitative task allocation models are derived. Both an optimal allocation algorithm and an approximating optimal algorithm are derived, discussed, and compared by simulation results. For the latter case, a formula for computing the error associated with the approximation used is also presented.
ISBN: 3540410554 (print)
In cost-conscious industries, such as automotive, it is imperative for designers to adhere to policies that reduce system resources to the extent feasible, even for safety-critical sub-systems. However, the overall reliability requirement, typically in the order of 10^-9 faults/hour, must be both analysable and met. Faults can be hardware, software or timing faults. The latter are handled by hard real-time schedulability analysis, which is used to prove that no timing violations will occur. However, from a reliability and cost perspective there is a tradeoff between timing guarantees, the level of hardware and software faults, and the per-unit cost for meeting the overall reliability requirement. This paper outlines a reliability analysis method that considers the effect of faults on schedulability analysis and its impact on the reliability estimation of the system. The ideas have general applicability, but the method has been developed with modeling of external interferences of automotive CAN buses in mind. We illustrate the method using the example of a distributed braking system.
ISBN: 0818607491 (print)
The following topics are dealt with: real-time distributed programming systems; architecture and interconnection schemes; fault tolerance; reliability estimation and performance modeling; performance analysis; intercommunication protocols; operative systems; dynamic and distributed scheduling; task allocation and load balancing; real-time operating system for nuclear power plant computer; real-time juggling robot; and real-time direct kinematics on a VLSI chip. 30 papers were presented, all of which are published in full in the present proceedings. Abstracts of individual papers can be found under the classification codes in this or other issues.
Advances in field reconfigurable technology have made possible the design and implementation of highly flexible parallel multi-processor-memory systems. System reliability is often an important measure of these systems...
ISBN: 9780769544502 (print)
Characterizing latent software faults is crucial to addressing dependability issues of current three-tier systems. A client should not have the misconception that a transaction succeeded when, in reality, it failed due to a silent error. We present a fault injection-based evaluation to characterize silent and non-silent software failures in a representative three-tier web service, one that mimics a day trading application widely used for benchmarking application servers. For failure characterization, we quantify the distribution of silent and non-silent failures, and recommend low-cost application-generic and application-specific consistency checks, which improve the reliability of the application. We inject three variants of null-call, where a callee returns null to the caller without executing business logic. Additionally, we inject three types of unchecked exceptions and analyze the reaction of our application. Our results show that 49% of error injections from null-calls result in silent failures, while 34% of unchecked exceptions result in silent failures. Our generic consistency check can detect silent failures in null-calls with an accuracy as high as 100%. Non-silent failures with unchecked exceptions can be detected with an accuracy of 42% with our application-specific checks.
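The null-call fault described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' injection tool or benchmark: `inject_null_call`, `buy_shares`, and `checked_call` are hypothetical names introduced here, and the check shown (rejecting a `None` result) is only one plausible form an application-generic consistency check might take.

```python
import functools


def inject_null_call(func):
    """Hypothetical injector: skip the callee's business logic and return None."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return None  # the wrapped business logic is never executed
    return wrapper


def buy_shares(account, symbol, quantity):
    """Hypothetical business method: records a purchase and returns the order."""
    order = {"symbol": symbol, "quantity": quantity}
    account["orders"].append(order)
    return order


def checked_call(func, *args, **kwargs):
    """Assumed generic consistency check: a None result is reported as a failure."""
    result = func(*args, **kwargs)
    if result is None:
        raise RuntimeError(f"{func.__name__} returned None")
    return result


account = {"orders": []}
faulty_buy = inject_null_call(buy_shares)

# Without the check the failure is silent: no error is raised, yet no order exists.
silent_result = faulty_buy(account, "IBM", 10)

# With the check the same fault surfaces as a detectable, non-silent failure.
try:
    checked_call(faulty_buy, account, "IBM", 10)
    detected = False
except RuntimeError:
    detected = True
```

The sketch only shows why a null-call can go unnoticed by the caller; it makes no claim about the detection rates reported in the abstract.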
MEADEP is a user-friendly dependability evaluation tool for measurement-based analysis of computing systems, including both hardware and software. Features of MEADEP are: a data processor for converting data in various formats (records with a number of fields stored in a commercial database format) to the MEADEP format, a statistical analysis module for graphical data presentation and parameter estimation, a graphical modeling interface for constructing reliability block and Markov diagrams, and a model solution module for availability/reliability calculation with graphical parametric analysis. Use of the tool on failure data from measurements can provide quantitative assessments of dependability for critical systems while greatly reducing requirements for specialized skills in data processing, analysis, and modeling from the user. MEADEP has been applied to evaluate dependability for several air traffic control (ATC) systems, and results produced by MEADEP have provided valuable feedback to the program management of these critical systems.
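As a rough illustration of the kind of availability calculation a reliability block diagram supports, the sketch below applies the standard textbook formulas for steady-state availability and for series and parallel blocks. It does not use or reproduce MEADEP's interface. The function names and the MTBF/MTTR figures are assumptions made for the example, and the blocks are assumed to fail independently.

```python
from math import prod


def steady_state_availability(mtbf_hours, mttr_hours):
    """Availability of one repairable component: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series(availabilities):
    """Series blocks: the system is up only if every block is up."""
    return prod(availabilities)


def parallel(availabilities):
    """Parallel (redundant) blocks: the system is down only if every block is down."""
    return 1.0 - prod(1.0 - a for a in availabilities)


# Hypothetical figures, not measurements from any real system.
processor = steady_state_availability(999.0, 1.0)  # one processor
display = steady_state_availability(99.0, 1.0)     # one display unit

# Two redundant processors feeding a single display.
system = series([parallel([processor, processor]), display])
```

In this made-up configuration the non-redundant display dominates system unavailability, which is the sort of observation a parametric analysis over block parameters is meant to expose.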