A closed queueing network model is constructed to address workload effects on computer performance for a highly reliable unibus multiprocessor used in real-time control. The model consists of multiserver nodes and a nonpreemptive priority queue. Use of this model requires partitioning the workload into task classes. The time-average steady-state solution of the queueing model directly produces u
A simple algorithm for broadcasting in a hypercube multicomputer containing faulty nodes/links is proposed. The algorithm delivers multiple copies of the broadcast message through disjoint paths to all the nodes in the system. Its salient feature is that the delivery of the multiple copies is transparent to the processes receiving the message and does not require the processes to know the ident
The problem of allocation and release of subcubes from a hypercube with node failures is addressed. Two algorithms are presented, both based on the Buddy allocation scheme for memory management which is also used by t...
ISBN: (print) 0897912780
This paper describes an embedding of Triple Modular Redundancy (TMR) into a binary hypercube. The goal is to improve fault tolerance by masking any single-point fault. Each module of an application task is triplicated and executed in parallel on three nodes of a 2-dimensional subcube (Q2) of the hypercube. Each of these nodes also executes a voter process. The remaining node is used for message passing only. All outputs from the triplicated modules are voted on, and the voting results are transmitted to the appropriate destination. Thus, all interunit messages are also triplicated. We propose an embedding of TMR into a hypercube which can be implemented in a manner transparent to the application program. Subcubes are allocated so that the address space for the TMR units is also a hypercube. Hence, the subcube allocation and intermodule communication schemes are defined to be analogous to the schemes used in the nonredundant system. The embedded system is proven to mask all single-point faults.
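The voting step and the Q2 grouping described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `majority_vote` and `q2_nodes` are hypothetical, and the abstract does not say which of the four subcube nodes is reserved for message passing, so the sketch only enumerates the four addresses.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


def q2_nodes(base: int, dim_a: int, dim_b: int) -> List[int]:
    """List the four node addresses of the 2-dimensional subcube (Q2) that
    contains `base` and spans hypercube dimensions `dim_a` and `dim_b`."""
    mask = (1 << dim_a) | (1 << dim_b)
    origin = base & ~mask  # clear the two spanning bits
    return [origin, origin | (1 << dim_a), origin | (1 << dim_b), origin | mask]


if __name__ == "__main__":
    # One replica returns a corrupted value; the vote masks the single fault.
    print(majority_vote([7, 7, 9]))
    print(q2_nodes(0b1010, 0, 1))
```

With three replicas, a single faulty output is always outvoted by the two correct ones, which is the masking property the abstract refers to.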
ISBN: (print) 0897912780
A connected hypercube containing faulty components (nodes or links) is called an injured hypercube. To enable non-faulty nodes to communicate with each other in an injured hypercube, the information of component failures must be made available to those non-faulty nodes for them to route messages around the faulty components. We develop a fault-tolerant routing scheme which requires each node to know only the information on the failure of its own links. Performance of this scheme is rigorously analyzed. This scheme is not only shown to be capable of routing messages successfully in injured hypercubes when the number of component failures is less than n, but also proved to be able to choose a shortest path with a very high probability.
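To make the idea of routing with purely local failure knowledge concrete, the following is a simplified sketch and not the scheme analyzed in the paper. It assumes that links are modeled as unordered node pairs, that a faulty node is represented by marking all of its links faulty, and that the message carries a set of dimensions already used for detours. The names `link`, `route`, and `detour_dims`, and the hop limit of 4n, are all hypothetical.

```python
from typing import FrozenSet, List, Optional, Set

Link = FrozenSet[int]


def link(u: int, v: int) -> Link:
    """Unordered link between two adjacent hypercube nodes."""
    return frozenset((u, v))


def route(src: int, dst: int, n: int, faulty: Set[Link]) -> Optional[List[int]]:
    """Greedy routing in an n-dimensional hypercube. At each hop the current
    node inspects only its own links. Returns the node path, or None if the
    sketch gives up (it is not guaranteed to find a path that exists)."""
    path = [src]
    current = src
    detour_dims: Set[int] = set()  # spare dimensions taken so far (travels with the message)
    max_hops = 4 * n               # arbitrary safety bound, an assumption
    while current != dst and len(path) - 1 < max_hops:
        diff = current ^ dst
        preferred = [d for d in range(n) if (diff >> d) & 1]
        # Try dimensions that shorten the path first, undoing earlier detours last.
        fresh = [d for d in preferred if d not in detour_dims]
        undo = [d for d in preferred if d in detour_dims]
        spare = [d for d in range(n)
                 if not ((diff >> d) & 1) and d not in detour_dims]
        chosen = next(
            (d for d in fresh + undo + spare
             if link(current, current ^ (1 << d)) not in faulty),
            None,
        )
        if chosen is None:
            return None
        if chosen in spare:
            detour_dims.add(chosen)
        current ^= 1 << chosen
        path.append(current)
    return path if current == dst else None


if __name__ == "__main__":
    # The direct link 000-100 is faulty, so the message detours through 001 and 101.
    print(route(0, 4, 3, {link(0, 4)}))
```

In this example the detour costs two extra hops relative to the fault-free shortest path. When no preferred link at the current node is faulty, the sketch follows a shortest path.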
It is shown how to determine closed-form expressions for task scheduling delay and active task time distributions for any real-time system application, given a scheduling policy and task execution time distributions. The active task time denotes the total time a task is executing or waiting to be executed, including scheduling delays and resource contention delays. The distributions are used to determine the probability of dynamic failure and processor utilization, where the probability of dynamic failure is the probability that any task will not complete before its deadline. The opposing effects of decreasing the probability of dynamic failure and increasing utilization are also addressed. The analysis first addresses workloads where all tasks are periodic, i.e., they are repetitively triggered at constant frequencies. It is then extended to include the arrival of asynchronously triggered tasks. The effects of asynchronous tasks on the probability of dynamic failure and utilization are addressed.
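As a small numerical illustration of the two quantities defined above, the sketch below starts from active task time distributions that are already known, given here as discrete probability mass functions. It does not derive them as the paper does. It also assumes that tasks miss their deadlines independently of one another, which the abstract does not state. All function names are hypothetical.

```python
from math import prod
from typing import Dict, Iterable, Tuple

Pmf = Dict[float, float]  # value -> probability


def miss_probability(active_time_pmf: Pmf, deadline: float) -> float:
    """P(active task time > deadline) for one task."""
    return sum(p for t, p in active_time_pmf.items() if t > deadline)


def dynamic_failure_probability(tasks: Iterable[Tuple[Pmf, float]]) -> float:
    """P(at least one task misses its deadline), assuming independent misses.
    Each item is (active_time_pmf, deadline)."""
    return 1.0 - prod(1.0 - miss_probability(pmf, d) for pmf, d in tasks)


def utilization(tasks: Iterable[Tuple[Pmf, float]]) -> float:
    """Processor utilization of periodic tasks: sum of mean execution time
    divided by period. Each item is (execution_time_pmf, period)."""
    return sum(sum(t * p for t, p in pmf.items()) / period for pmf, period in tasks)


if __name__ == "__main__":
    active = {1.0: 0.5, 2.0: 0.4, 5.0: 0.1}
    print(miss_probability(active, deadline=4.0))
    print(dynamic_failure_probability([(active, 4.0), (active, 4.0)]))
    print(utilization([({1.0: 0.5, 3.0: 0.5}, 10.0)]))
```

Loading the processor more heavily raises utilization but shifts probability mass toward longer active task times, which is the trade-off the abstract describes.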
An approach to checkpointing and rollback recovery in a distributed computing system using a common time base is proposed. First, a common time base is established in the system using a hardware clock synchronization algorithm. This common time base is coupled with a pseudorecovery block approach to develop a checkpointing algorithm that has the following advantages: (i) maximum process autonomy, (ii) no wait for commitment for establishing recovery lines, (iii) fewer messages to be exchanged, and (iv) less memory requirement.
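The role of the common time base can be sketched as follows. This is a loose illustration under strong assumptions, not the proposed algorithm: clocks are taken to be perfectly synchronized, in-transit messages and pseudorecovery blocks are ignored, and the `Process` class, its integer `state`, and `recovery_line` are hypothetical. Each process checkpoints on its own when the shared clock enters a new interval, so a recovery line can be identified by interval index without a commitment round.

```python
from typing import Dict, Iterable


class Process:
    """Toy process that checkpoints autonomously at common-time boundaries."""

    def __init__(self, name: str, interval: float) -> None:
        self.name = name
        self.interval = interval
        self.state = 0
        self.checkpoints: Dict[int, int] = {}  # interval index -> saved state

    def tick(self, common_time: float, work: int) -> None:
        """Save a checkpoint on first entry to a new interval, then do work."""
        index = int(common_time // self.interval)
        if index not in self.checkpoints:
            self.checkpoints[index] = self.state
        self.state += work

    def rollback(self, index: int) -> None:
        """Restore the state saved for the given interval index."""
        self.state = self.checkpoints[index]


def recovery_line(processes: Iterable[Process]) -> int:
    """Latest interval index for which every process has reached a checkpoint."""
    return min(max(p.checkpoints) for p in processes)


if __name__ == "__main__":
    p, q = Process("p", 10.0), Process("q", 10.0)
    for t in (0.0, 5.0, 12.0, 25.0):
        p.tick(t, work=1)
    for t in (0.0, 11.0):
        q.tick(t, work=2)
    line = recovery_line([p, q])
    p.rollback(line)
    q.rollback(line)
    print(line, p.state, q.state)
```

Because the interval index is derived from the shared clock, the processes never exchange messages to agree on which checkpoints belong together.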
The reliability of a real-time digital control system depends not only on the reliability of the hardware and software used, but also on the speed of executing control algorithms, because computing time delay degrades control system performance. For a given sampling interval, the effects of computing time delay are classified into the delay problem and the loss problem. Analysis of these two problems is presented as a means of evaluating real-time control systems. As an example, both self-tuning predicted (STP) control and proportional-integral-derivative (PID) control are applied to the problem of tracking robot trajectories, and the effects of computing time delay on their control performance are compared. For this example, the STP controller is shown to outperform the PID controller in coping with the delay problem, while the PID controller outperforms the STP controller in coping with the loss problem.
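The two effects can be explored with a small simulation. This sketch makes several assumptions that are not in the abstract: the delay problem is modeled as the control command arriving a whole number of samples late, the loss problem as a sampling period in which no new command is produced and the previous one is held, and the plant is a hypothetical first-order system rather than a robot. Only a discrete PID controller is shown; STP control is not implemented.

```python
from collections import deque
from typing import Callable, Deque, List


def simulate_pid(
    steps: int,
    delay_samples: int = 0,
    lost: Callable[[int], bool] = lambda k: False,
    kp: float = 1.0,
    ki: float = 0.5,
    kd: float = 0.0,
    setpoint: float = 1.0,
    a: float = 0.9,
    b: float = 0.1,
) -> List[float]:
    """Simulate y[k+1] = a*y[k] + b*u[k] under discrete PID control.
    `delay_samples` delays every command; `lost(k)` marks periods in which
    the controller misses its update and the old command stays applied."""
    y = 0.0
    integral = 0.0
    prev_error = 0.0
    applied = 0.0
    pending: Deque[float] = deque([0.0] * delay_samples)  # commands not yet applied
    outputs: List[float] = []
    for k in range(steps):
        error = setpoint - y
        if not lost(k):
            integral += error
            u = kp * error + ki * integral + kd * (error - prev_error)
            prev_error = error
            pending.append(u)
            applied = pending.popleft()
        y = a * y + b * applied
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    ideal = simulate_pid(200)
    delayed = simulate_pid(200, delay_samples=2)
    lossy = simulate_pid(200, lost=lambda k: k % 3 == 0)
    print(ideal[-1], delayed[-1], lossy[-1])
```

Comparing the three trajectories step by step shows how each kind of timing fault distorts the response relative to the ideal case.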