ISBN: (print) 0818605669
An analytic model for estimating the task response in loosely coupled distributed systems is introduced. The model considers such factors as the precedence relationships among software modules, interprocessor communication, interconnection network delay, module scheduling policy, and assignment of modules to computers. Simulation experiments are used to validate the assumptions of the analytic model. Applications of the model to the study of design issues for distributed systems, such as module assignment, precedence relationships, module scheduling policies, and database management algorithms, are discussed.
A new reliability model is introduced for selecting the best software fault-tolerant (FT) design. This model uses a task graph technique that allows different candidate FT configurations to be analyzed based on the st...
ROSE, a modular distributed operating system that provides support for building reliable applications, is designed and implemented. Failure detection capabilities are provided by a failure detection server. Configuration objects can be used to capture the relationship among multiple processes that cooperate to replicate certain resources. Replicated address space (RAS) objects, whose content is accessible with a high probability despite hardware failures, can be used to increase data availability. Finally, a resistant process (RP) abstraction allows user processes to survive hardware failures with minimal interruption. Two different implementations of RP are provided: one periodically checkpoints its state information in an RAS object; the other uses replicated execution, running the same code on different nodes at the same time.
With the growing dependence on distributed systems technology throughout the business world, it became clear that issues regarding scalability, reliability, and maintainability would increase the need for systems dependability and ease of operation. This paper introduces the implementation and use of the industry-standard Simple Network Management Protocol (SNMP) and its related Management Information Base (MIB) in the NFAC Parametric Realtime Information Management Enterprise (NPRIME) software architecture, and shows how it is used to manage a deployed system for a wind-tunnel test. The paper describes the necessary running SNMP agents and the view objects used to update near-real-time graphical displays developed with Microsoft's Visual C++ under Windows 95 and NT. In addition, the paper presents how application-level programmers can repartition their software under NPRIME by redistributing their executables within a heterogeneous distributed environment after editing an ASCII system configuration specification file. This allows users to redistribute software load among multiple systems without the need for expensive code changes. The design has the inherent advantage of handling peak loads gracefully and reliably, and it introduces fault tolerance and recovery as a result of the multiplicity of resources.
Three algorithms designed to enforce different quality of service criteria are presented, as well as empirical assessments of the algorithms for three large industrial telecommunications systems. These assessments are made in terms of the simulated performance of each system on average loads selected from operational distributions collected during beta release and field use. In addition, synthetic heavy loads designed to cause the overall CPU utilization rates to exceed 90% of capacity were run. The algorithms build on previously defined load testing algorithms, and use parameters and operational distributions computed for that purpose. This makes the quality of service enforcement algorithms particularly efficient. The primary bases for the assessment of the algorithms were the overall deviation of the response time from the average, and the fraction of service requests that were throttled from clients under varying conditions.
ISBN: (print) 0818692219
This paper outlines a human-centered virtual machine of problem-solving agents, intelligent agents, software agents and objects. It deals with issues related to high assurance (e.g. reliability, availability, real-time and others) through the design of a human-centered system architecture in which technology is a primitive. The human-centered virtual machine is based on a number of human-centered perspectives, including the distributed cognition approach. It has been applied to complex, data-intensive, time-critical problems such as real-time alarm processing and fault diagnosis, air combat simulation, and business (decision support).
The authors describe the overall system design for ImageNet and present a system prototype developed on an Ethernet network in the Computer Engineering Research Laboratory at the University of Arizona. ImageNet is a g...
ISBN: (print) 9781509035137
Recently, the wireless networking community has become increasingly interested in novel protocol designs for safety-critical applications. These new applications come with unprecedented latency and reliability constraints, which pose many open challenges. A particularly important one relates to the question of how to develop such systems. Traditionally, development of wireless systems has mainly relied on simulations to identify viable architectures. However, in this case the drawbacks of simulations, in particular increasing run-times, rule out their application. Instead, in this paper we propose to use probabilistic model checking, a formal model-based verification technique, to evaluate different system variants during the design phase. Apart from allowing evaluations and therefore design iterations with much smaller periods, probabilistic model checking provides bounds on the reliability of the considered design choices. We demonstrate these salient features with respect to the novel EchoRing protocol, which is a token-based system designed for safety-critical industrial applications. Several mechanisms for dealing with a token loss are modeled and evaluated through probabilistic model checking, showing its potential as a suitable evaluation tool for such novel wireless protocols. In particular, we show by probabilistic model checking that wireless token-passing systems can benefit tremendously from the considered fault-tolerant methods. The obtained performance guarantees for the different mechanisms even provide reasonable bounds for experimental results obtained from a real-world implementation.
ISBN: (print) 9781509019441
The proceedings contain 39 papers. The topics discussed include: detection of unexpected situations by applying software reliability growth models to test phases; resource/schedule/content model: improving testing effectiveness; static analysis of physical properties in Simulink models; test suites for benchmarks of static analysis tools; optimizing resiliency of distributed video surveillance system for safer city; software-defined networking (SDN) control message classification, verification, and optimization system; integrating formal methods with testing for reliability estimation of component based systems; C-SEC (Cyber SCADA evaluation capability): securing critical infrastructures; operational softwarized networks reliability management; knowledge transition: discovering workflow models from functional tests; and analyzing failure mechanism for complex software-intensive systems.
Software Transactional Memories (STMs) are emerging as a highly attractive programming model, thanks to their ability to mask concurrency management issues from the overlying applications. In this paper we are intereste...