A distributed computing system consists of processing elements, communication links, memory units, data files, and programs. These resources are interconnected via a communication network and controlled by a distributed operating system. The distributed program reliability in a distributed computing system is the probability that a program which runs on multiple processing elements and needs to retrieve data files from other processing elements will be executed successfully. This reliability varies according to (1) the topology of the distributed computing system, (2) the reliability of the communication edges, (3) the distribution of data files and programs among processing elements, and (4) the data files required to execute a program. In this paper, we show that computing the distributed program reliability on star distributed computing systems is NP-hard. We also develop an efficiently solvable case for computing distributed program reliability when an additional restriction is placed on the file distribution in the star topology.
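The definition above can be illustrated with a brute-force sketch. It is not the method of any paper listed here: it enumerates every up/down state of the links, which takes exponential time and is only practical for toy networks. The function names `dpr` and `reachable_files`, the assumption that nodes never fail, and the assumption that links fail independently are all illustrative choices.

```python
from itertools import product


def reachable_files(start, adj, files_at):
    """Collect the files stored on every node connected to `start`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    found = set()
    for node in seen:
        found |= files_at.get(node, set())
    return found


def dpr(edges, program_nodes, files_at, required):
    """Brute-force distributed program reliability (illustrative only).

    edges:         {(u, v): p} with p the probability that link u-v works
                   (assumed independent; nodes are assumed perfect)
    program_nodes: set of nodes holding a copy of the program
    files_at:      {node: set of file names stored on that node}
    required:      set of file names the program needs
    """
    links = list(edges)
    total = 0.0
    # Enumerate all 2^|links| combinations of working/failed links.
    for state in product([True, False], repeat=len(links)):
        prob = 1.0
        adj = {}
        for (u, v), up in zip(links, state):
            prob *= edges[(u, v)] if up else 1.0 - edges[(u, v)]
            if up:
                adj.setdefault(u, set()).add(v)
                adj.setdefault(v, set()).add(u)
        # The program succeeds if some copy can reach every required file.
        if any(required <= reachable_files(s, adj, files_at)
               for s in program_nodes):
            total += prob
    return total


# Hypothetical star: hub 0, leaves 1..3; program on leaf 1.
star = {(0, 1): 0.9, (0, 2): 0.8, (0, 3): 0.7}
print(dpr(star, {1}, {2: {"f1"}, 3: {"f2"}}, {"f1", "f2"}))
```

In this hypothetical star the program needs all three links, so the result is approximately 0.9 × 0.8 × 0.7 = 0.504. Moving `f1` to the hub removes the dependence on link (0, 2) and raises the value to about 0.63, which shows how file placement changes the reliability.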
Distributed computing systems (DCS) have become very popular for their high fault tolerance, potential for parallel processing, and better reliability performance. One of the important issues in the design of a DCS is its reliability performance. Distributed program reliability (DPR) is used to obtain this reliability measure. In this paper, we propose a polynomial-time algorithm for computing the DPR of a ring topology and show that solving the DPR problem on a ring-of-trees topology is NP-hard. (C) 2001 Elsevier Science Ltd. All rights reserved.
In this paper, we propose an approach to the reliability analysis of distributed programs that addresses real-time constraints. Our approach is based on a model for evaluating transmission time, which allows us to find the time needed to complete execution of the program, task, or mission under evaluation. With information on time constraints, the corresponding Markov state space can then be defined for reliability computation. To speed up the evaluation process and reduce the size of the Markov state space, several dynamic reliability-preserving reductions are developed. A simple distributed real-time system is used as an example to illustrate the feasibility and uniqueness of the proposed approach.
A distributed computing system consists of processing elements, communication links, memory units, data files, and programs. These resources are interconnected via a communication network and controlled by a distributed operating system. The distributed program reliability (DPR) in a distributed computing system is the probability that a program which runs on multiple processing elements and needs to retrieve data files from other processing elements will be executed successfully. This reliability varies according to 1) the topology of the distributed computing system, 2) the reliability of the communication edges, 3) the distribution of data files and programs among processing elements, and 4) the data files required to execute a program. In this paper, we show that computing the distributed program reliability on a star distributed computing system is #P-complete. A polynomially solvable case is developed for computing the distributed program reliability when an additional restriction is placed on the file distribution in the star topology. We also propose a polynomial-time algorithm that computes approximate solutions for the distributed program reliability when the star topology has no such additional restriction on the file distribution.
Algorithm GEAR (Generalized Evaluation Algorithm for Reliability) computes the reliability of a distributed computing system (DCS), which usually consists of processing elements, memory units, input/output devices, data files, and programs as its shared resources. The probability that a task or an application can be computed successfully by sharing the required resources on the DCS is termed the system reliability. Some of the important reliabilities defined using the above concept are terminal-pair, computer-network, distributed-program, and distributed-system. GEAR is general enough to compute all 4 of these parameters. GEAR is a 1-step algorithm and does not require any prior knowledge about multiterminal connections for computing the reliability expression. Many examples are included to illustrate the usefulness of GEAR for computing reliability measures of a DCS.
A distributed system is a collection of processor-memory pairs connected by communication links. The reliability of a distributed system can be expressed using distributed program reliability and distributed system reliability analysis. Computing the reliability of a distributed system is an NP-hard problem. The distribution of programs and data files can affect the system reliability. The reliability-oriented task assignment problem, which is NP-hard, is to find a task distribution such that the program reliability or system reliability is maximized. For example, efficient allocation of channels to the different cells can greatly improve the overall network throughput, in terms of the number of calls successfully supported. This paper presents a genetic algorithm-based reliability-oriented task assignment methodology (GAROTA) for solving the k̃-DTA reliability problem. The proposed algorithm uses a genetic algorithm to select a program and file assignment set that is maximal, or nearly maximal, with respect to system reliability. Our numerical results show that the proposed algorithm may obtain the exact solution in most cases, and the computation time seems to be significantly shorter than that needed for the exhaustive method. When the proposed method fails to give an exact solution, the deviation from the exact solution is very small. The technique presented in this paper should help readers understand the correlation between task assignment reliability and distributed system topology.
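The general idea of searching file assignments with a genetic algorithm, and comparing the result against exhaustive search, can be sketched on a toy problem. This is not GAROTA itself: the star network, the link reliabilities in `EDGE_REL`, the one-file-per-node capacity limit, and the operators (tournament selection, one-point crossover, single-gene mutation, elitism) are all hypothetical choices made for illustration.

```python
import random
from itertools import product

# Hypothetical star: hub 0 and leaves 1..3; EDGE_REL[j] is the
# reliability of the link between the hub and leaf j.
EDGE_REL = {1: 0.95, 2: 0.90, 3: 0.80}
NODES = [0, 1, 2, 3]
PROGRAM_NODE = 1   # the program runs on leaf 1
N_FILES = 3        # each file is stored on exactly one node
CAPACITY = 1       # assumed limit: at most one file per node


def fitness(assign):
    """Program reliability of an assignment; 0.0 if it breaks CAPACITY.

    assign[i] is the node that stores file i.
    """
    if any(assign.count(n) > CAPACITY for n in NODES):
        return 0.0
    remote = {n for n in assign if n != PROGRAM_NODE}
    if not remote:
        return 1.0
    # Any remote file needs the program's own link to the hub, plus the
    # link of every other leaf that stores a file.
    needed = {PROGRAM_NODE} | {n for n in remote if n != 0}
    rel = 1.0
    for leaf in needed:
        rel *= EDGE_REL[leaf]
    return rel


def evolve(rng, pop_size=20, generations=30, p_mut=0.2):
    """Small elitist genetic algorithm over file assignments."""
    pop = [tuple(rng.choice(NODES) for _ in range(N_FILES))
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best]  # elitism: carry the best individual forward
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, N_FILES)           # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < p_mut:                  # single-gene mutation
                child[rng.randrange(N_FILES)] = rng.choice(NODES)
            nxt.append(tuple(child))
        pop = nxt
        best = max(pop, key=fitness)
    return best, fitness(best)


def exhaustive():
    """Reference optimum by trying every assignment."""
    best = max(product(NODES, repeat=N_FILES), key=fitness)
    return best, fitness(best)


print("GA:        ", evolve(random.Random(0)))
print("exhaustive:", exhaustive())
```

On a problem this small, exhaustive search is trivial (4^3 = 64 assignments). The sketch only shows the shape of the comparison, since the number of assignments grows exponentially with the number of files.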
A distributed computing system is modeled as a collection of resources (e.g. processing elements, data files and programs) interconnected via an arbitrary communication network and controlled by a distributed operating system. The distributed program reliability in a distributed computing system is the probability of successful execution of a program that runs on multiple processing elements and needs to retrieve data files from other processing elements. This reliability varies according to (1) the topology of the distributed computing system, (2) the reliability of the communication edges, (3) the distribution of data files and programs among processing elements and (4) the data files required to execute a program. In addition, computing the reliability of distributed computing systems is #P-complete even when the distributed computing system is restricted to a series-parallel, a 2-tree, a tree, or a star structure. This paper presents efficient algorithms for computing the reliability of a distributed program running on other restricted classes of networks. (C) 1999 Elsevier Science Inc. All rights reserved.
The reliability of a distributed program in a distributed computing system is the probability that a program which runs on multiple processing elements and needs to communicate with other processing elements for remote data files will be executed successfully. This reliability varies according to (1) the topology of the distributed computing system, (2) the reliability of the communication links, (3) the distribution of data files and programs among processing elements, and (4) the data files required to execute a program. This paper shows that solving this reliability problem is NP-hard even when the distributed computing system is restricted to a series-parallel, a 2-tree, a tree, or a star structure. (C) 1997 Elsevier Science B.V.
Distributed computing systems (DCS) have become very popular for their high fault tolerance, potential for parallel processing, and better reliability performance. One of the important issues in the design of a DCS is its reliability performance. Distributed program reliability (DPR) has to be addressed to obtain this reliability measure. An efficient network topology is quite important for a distributed computing system. For example, the ring network has been widely used in current distributed system design. In this paper, we focus on DCS with ring topologies. We propose polynomial-time algorithms to analyze the DPR of the dual-ring topology and show that solving the DPR problem on a ring-of-trees topology is NP-hard.
The reliability-oriented task assignment problem, which is NP-hard, is to find a task distribution such that the program reliability or system reliability is maximized. In this paper, we have developed a reliability-oriented task allocation scheme, based on a genetic algorithm, for distributed systems to find an approximate solution. The simulation shows that, in most test cases, the algorithm finds sub-optimal solutions efficiently; therefore, it is a desirable approach to solving these problems.