Creating robust software requires not only careful specification and implementation, but also quantitative measurement. This paper describes Ballista exception handling testing of the High Level Architecture RunTime Infrastructure (HLA RTI). The RTI is a standard distributed simulation system intended to provide completely robust exception handling, yet implementations have normalized robustness failure rates as high as 10%. Non-robust testing responses include exception handler crashes, segmentation violations, "unknown" exceptions, and task hangs. Other issues include different robustness failure modes across ports to two operating systems, and mandatory client machine rebooting after a particular RTI failure. Testing the RTI led to scalable extensions of the Ballista architecture for handling exception-based error reporting models, testing object-oriented software structures (including call-backs, pass by reference, and constructors), and operating in a state-rich, distributed system environment. These results demonstrate that robustness testing can provide useful feedback to high-quality software development processes, and can be applied to domains well beyond the previous work on testing operating systems.
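To make the phrase "normalized robustness failure rate" concrete, here is a minimal sketch. It assumes the normalization is a uniform average of per-function failure ratios, so that a heavily tested function does not dominate the figure; the abstract does not spell out the formula. The function names and counts below are hypothetical, not data from the paper.

```python
def normalized_failure_rate(results):
    """Average the per-function failure ratio with equal weight per function.

    `results` maps a function name to (failed_tests, total_tests).
    Assumption: each function under test counts equally, however many
    test cases were run against it.
    """
    ratios = [failed / total for failed, total in results.values() if total > 0]
    if not ratios:
        raise ValueError("no test results to normalize")
    return sum(ratios) / len(ratios)


# Hypothetical counts for three functions under test (illustrative only).
results = {
    "func_a": (5, 100),   # 5% of test cases were non-robust
    "func_b": (0, 50),    # no failures observed
    "func_c": (30, 200),  # 15% of test cases were non-robust
}
rate = normalized_failure_rate(results)
print(f"normalized failure rate: {rate:.2%}")
```

A plain pooled ratio (35 failures out of 350 tests) would weight `func_c` more heavily simply because it received more test cases; the per-function average avoids that.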
This paper proposes to use a logical hypercube structure for detecting message stability in distributed systems. In particular, a stability detection protocol that uses such a superimposed logical structure is presented, and its scalability is compared with other known stability detection protocols. The main benefits of the logical hypercube approach are scalability, fault-tolerance, and refraining from overloading a single node or link in the system. These benefits become evident both by an analytical comparison and by simulations. Another important feature of the logical hypercube approach is that the performance of the protocol is in general not sensitive to the topology of the underlying physical network.
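The idea of superimposing a logical hypercube can be illustrated with a generic dimension-exchange aggregation. This is a sketch under stated assumptions, not the protocol from the paper: it assumes the number of nodes is a power of two, that no node fails during the exchange, and that each node tracks, per sender, the highest sequence number it has received. The names `hypercube_neighbors` and `aggregate_stability` are hypothetical.

```python
def hypercube_neighbors(node, dim):
    """Logical neighbors of `node`: IDs that differ from it in exactly one bit."""
    return [node ^ (1 << k) for k in range(dim)]


def aggregate_stability(received, dim):
    """Simulate `dim` rounds of pairwise exchange over a logical hypercube.

    received[i][s] is the highest sequence number from sender s that node i
    has received. In round k every node exchanges its vector with the
    neighbor across dimension k and keeps the element-wise minimum. After
    `dim` rounds every node holds the global minimum per sender, i.e. the
    highest sequence number known to have reached all nodes.
    """
    n = 1 << dim
    if len(received) != n:
        raise ValueError("expected exactly 2**dim nodes")
    views = [list(v) for v in received]
    for k in range(dim):
        # Build the next round from a snapshot so exchanges are simultaneous.
        views = [
            [min(a, b) for a, b in zip(views[node], views[node ^ (1 << k)])]
            for node in range(n)
        ]
    return views


# Eight nodes (a 3-dimensional hypercube), two senders, hypothetical data.
dim = 3
received = [[(3 * i + s) % 7 + 1 for s in range(2)] for i in range(1 << dim)]
views = aggregate_stability(received, dim)
print(hypercube_neighbors(5, dim))
print(views[0])
```

Each node talks to only `dim = log2(n)` logical neighbors and no single node collects all vectors, which is one way to see why such a structure avoids overloading any one node or link.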
The scalability and reliability of secondary storage systems are their most significant aspects for advanced database applications. Research on high-function disks has recently attracted a great deal of attention because technological progress now allows disk-resident data processing. This capability is not only useful for executing application programs on the disk, but is also suited for controlling distributed disks so they are scalable and reliable. We propose autonomous disks in the network environment by using the disk-resident data processing facility. A set of autonomous disks is configured as a cluster in a network, and data is distributed within the cluster, to be accessed uniformly by using a distributed directory. The disks accept simultaneous accesses from multiple hosts via a network, and handle data distribution and load skews. They are also able to tolerate disk failures and some software errors of disk controllers, and can reconfigure the cluster after the damaged disks are repaired. The data distribution, skew handling, and fault tolerance are completely transparent to hosts. Because communication is local, the size of the cluster is scalable. Autonomous disks are applicable to many advanced applications, such as a large Web server having many HTML files. We also propose to use rules to implement these functions, and we demonstrate their flexibility by examples of rules.
Mobile agents, programs that move within a system performing a set of tasks, are an active field of research. The focus of current research, however, is on the development of execution platforms and applications for mobile agents and not on methodologies for building agents. Creating mobile agents can be tedious and susceptible to errors. We propose a framework where the agent is composed using a well-defined set of categories of software components. Building systems from software components has already proven useful in the context of large software systems, increasing the productivity of the development process and the reliability of the resulting system by reusing proven components. We claim that the same holds true for the construction of mobile agents for network and systems management as well as for other domains. We have designed and implemented an agent construction toolkit (the AgentBean Development Kit, ADK) to demonstrate the usability and flexibility of this approach.
The role of multimedia applications in our day-to-day life has dramatically increased in the last few years. In this paper, we present an architectural framework for distributed multimedia systems. The architecture consists of three layers: the application layer, the configuration and synchronization layer, and the network layer. The three layers are backed up by two backbone layers, namely, the database backbone and the computational backbone. We present a precise description of each layer together with formal specifications using finite state automata.
Existing IEEE software reliability standards do not address the characteristics of distributed systems, including client-server systems. Furthermore, these standards were issued before the widespread application of COTS and safety-critical systems. In addition, these standards do not take into account the influence on reliability of such process improvement measures as inspections, reuse, and object-oriented design paradigms. Lastly, these standards do not consider both hardware and software reliability, nor do they include availability and maintainability. To be of value, the next generation of dependability standards must address these deficiencies. With the active participation of the audience, the panel will identify and debate the future direction of dependability standards.
The various software systems developed for the DIII-D tokamak have played a highly visible and important role in tokamak operations and fusion research. Because of the heavy reliance on in-house developed software encompassing all aspects of operating the tokamak, much attention has been given to the careful design, development and maintenance of these software systems. Software systems responsible for tokamak control and monitoring, neutral beam injection, and data acquisition demand the highest level of reliability during plasma operations. These systems, made up of hundreds of programs totaling thousands of lines of code, have presented a wide variety of software design and development issues ranging from low-level hardware communications, database management, and distributed process control, to man-machine interfaces. The focus of this paper will be to describe how software is developed and managed for the DIII-D control and data acquisition computers. It will include an overview and status of software systems implemented for tokamak control, neutral beam control, and data acquisition. The issues and challenges faced in developing and managing the large amounts of software in support of the dynamic and ever-changing needs of the DIII-D experimental program will be addressed.
MEADEP is a user-friendly dependability evaluation tool for measurement-based analysis of computing systems, including both hardware and software. Features of MEADEP are: a data processor for converting data in various formats (records with a number of fields stored in a commercial database format) to the MEADEP format, a statistical analysis module for graphical data presentation and parameter estimation, a graphical modeling interface for constructing reliability block and Markov diagrams, and a model solution module for availability/reliability calculation with graphical parametric analysis. Use of the tool on failure data from measurements can provide quantitative assessments of dependability for critical systems, while greatly reducing requirements for specialized skills in data processing, analysis, and modeling from the user. MEADEP has been applied to evaluate dependability for several air traffic control (ATC) systems, and results produced by MEADEP have provided valuable feedback to the program management of these critical systems.
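The kind of availability calculation that reliability block and Markov diagrams support can be illustrated with textbook formulas. This sketch is not MEADEP's interface or solution method; it assumes constant failure and repair rates (a two-state Markov model per component) and independent components, and all names and numbers are hypothetical.

```python
from math import prod


def steady_state_availability(failure_rate, repair_rate):
    """Two-state Markov model (up/down) with constant rates.

    Steady-state availability is mu / (lambda + mu), which equals
    MTBF / (MTBF + MTTR) when MTBF = 1/lambda and MTTR = 1/mu.
    """
    return repair_rate / (failure_rate + repair_rate)


def series(availabilities):
    """Series block: the system is up only if every component is up."""
    return prod(availabilities)


def parallel(availabilities):
    """Parallel (redundant) block: the system is down only if all are down."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical component: one failure per 1000 h, ten-hour mean repair time.
a_unit = steady_state_availability(failure_rate=1 / 1000, repair_rate=1 / 10)
# Two redundant units feeding a single non-redundant unit.
a_system = series([parallel([a_unit, a_unit]), a_unit])
print(f"unit availability:   {a_unit:.6f}")
print(f"system availability: {a_system:.6f}")
```

In a measurement-based setting the two rates would be estimated from recorded failure and repair data rather than assumed, which is the role the abstract assigns to the statistical analysis module.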
With the growing dependence on distributed systems technology throughout the business world, it became clear that issues regarding scalability, reliability, and maintainability would increase the need for systems dependability and ease of operation. This paper introduces the implementation and use of the industry standard Simple Network Management Protocol (SNMP) and its related Management Information Base (MIB) in the NFAC Parametric Realtime Information Management Enterprise (NPRIME) software architecture, and how it is used to manage a deployed system for a wind-tunnel test. The paper will describe the necessary SNMP running agents and view objects used to update near real-time graphical displays developed with Microsoft's Visual C++ under Windows 95 and NT. In addition, the paper presents how application-level programmers can repartition their software under NPRIME by redistributing their executables within a heterogeneous distributed environment after editing an ASCII system configuration specification file. This would allow users to redistribute software load among multiple systems without the need for expensive code changes. This design has the inherent advantage of ensuring reliability by handling peak loads gracefully and with ease, and of introducing fault tolerance and recovery as a result of the multiplicity of resources.
ISBN: 0818692219 (print)
This paper outlines a human-centered virtual machine of problem solving agents, intelligent agents, software agents and objects. It deals with issues related to high assurance (e.g. reliability, availability, real-time and others) through design of a human-centered system architecture in which technology is a primitive. The human-centered virtual machine is based on a number of human-centered perspectives including the distributed cognition approach. The human-centered virtual machine has been applied in complex, data-intensive, time-critical problems like real-time alarm processing and fault diagnosis, air combat simulation and business (decision support).