ISBN: 9780769529172 (print)
This paper presents a checkpoint and recovery (C&R) protocol to support fault tolerance for PVM (Parallel Virtual Machine). The protocol helps to mask fail-stop failures from an application. The C&R activities are transparent and require no changes to the PVM library or the operating system. In PVM, an application can change the number of processes during execution. This paper focuses on solving the problems raised by the dynamic spawning and asynchronous exit of tasks in PVM. The proposed protocol is non-blocking, so it reduces the side effects of checkpointing on the original programs.
This is an overview of the robust resource allocation research efforts that have been and continue to be conducted by the CSU Robustness in Computer Systems Group. Parallel and distributed computing systems, consistin...
ISBN: 9780769529172 (print)
We discuss the efficiency of a novel parallel/distributed application control method based on global state monitoring. Processes report their local states to monitors. The monitors construct global states, analyze them, and send control signals to processes when necessary. The addition of a special fast control network, responsible for transferring control information, is proposed in this paper. The efficiency is tested in simulations of sample Branch and Bound parallel computations. We show that the multicast capability of a network plays an important role in the resulting system efficiency. Other network parameters, such as latency or bandwidth, are significant only under proper conditions. We identify these conditions, demonstrating that a 5-9 times speedup can be obtained by adding a fast control network.
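The report/construct/signal cycle described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Monitor` class, the `report` method, and the choice of a Branch and Bound "best bound" as the local state are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Hypothetical monitor: collects local states, emits control signals."""
    expected: set                                  # ids of processes that must report
    local_states: dict = field(default_factory=dict)

    def report(self, pid, best_bound):
        """Record one local state; return control signals once a global state exists."""
        self.local_states[pid] = best_bound
        if set(self.local_states) != self.expected:
            return {}                              # global state not yet complete
        # Assumed global state: the lowest bound known anywhere in the system.
        global_best = min(self.local_states.values())
        # Signal only the processes whose local bound is stale.
        return {p: global_best
                for p, b in self.local_states.items() if b > global_best}


monitor = Monitor(expected={"p0", "p1", "p2"})
signals = {}
for pid, bound in [("p0", 42), ("p1", 37), ("p2", 40)]:
    signals = monitor.report(pid, bound)
print(signals)
```

In this sketch each stale process receives its own signal. On a network with multicast support, a single message carrying the global bound could plausibly replace those per-process signals, which is one way to read the abstract's remark about multicast capability.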
Overlay networks are a fascinating field in the area of distributed systems. They combine challenges from self-organisation to extreme scalability and provide an interesting middleware layer for server-free Internet a...
In this paper, we present a new fault tolerance system called DejaVu for transparent and automatic checkpointing, migration, and recovery of parallel and distributed applications. DejaVu provides a transparent paralle...
ISBN: 9780769529172 (print)
This paper addresses the problems of admission control and server selection in a system consisting of several geographically replicated web servers and several access points. We propose a fully distributed solution in which every access point continuously monitors the availability of all server-side resources, using a mixture of active and passive measurements. Based on those measurements, each access point autonomously decides how to handle the requests it receives. Admission control prioritizes requests belonging to already admitted sessions, in order to maximize the chance that ongoing sessions terminate successfully. Furthermore, session information is taken into account when performing probabilistic request redirection and server selection, in order to improve load balancing and mitigate flash crowd effects. Extensive simulations, performed in compliance with industry standards, show that our method behaves stably during overloads and improves service quality in terms of both reduced response time and a higher rate of successful session termination.
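The two decisions an access point makes, whether to admit a request and which server to send it to, might be sketched as follows. This is a simplified illustration, not the paper's algorithm: the `admit` function, the `threshold` parameter, and the spare-capacity weighting are assumptions, and the load estimates are taken as given rather than measured.

```python
import random


def admit(request_in_session, server_loads, threshold=0.8, rng=random):
    """Return the index of the chosen server, or None if the request is rejected.

    server_loads: estimated utilisation of each server in [0, 1] (assumed input).
    threshold: hypothetical load above which new sessions are refused.
    """
    least_loaded = min(server_loads)
    # New sessions are refused under overload; ongoing sessions are still served.
    if not request_in_session and least_loaded >= threshold:
        return None
    # Probabilistic selection: weight each server by its spare capacity.
    weights = [max(1.0 - load, 0.0) for load in server_loads]
    if sum(weights) == 0:
        return None  # every server is saturated
    return rng.choices(range(len(server_loads)), weights=weights, k=1)[0]
```

For example, `admit(False, [0.9, 0.95])` rejects a request that would open a new session, while `admit(True, [0.9, 0.95])` still routes a request from an ongoing session to one of the two servers. Choosing servers at random in proportion to spare capacity, rather than always picking the least loaded one, is one common way to keep independent access points from all redirecting to the same server at once.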
In distributed hybrid computing systems, traditional sequential processors are loosely coupled with reconfigurable hardware for optimal performance. This loose coupling proves to be a communication challenge; the proce...
As the scale and proliferation of distributed applications continue to increase, a need often arises to track the availability of entities that comprise the distributed system. An entity that is part of such a distrib...
ISBN: 9780978569914 (print)
The development of Knowledge Discovery in Databases (KDD) projects in collaborative and distributed environments requires facilities to search for, choose, set up, compose, and execute suitable data manipulation tools. This implies the need to explicitly represent and annotate different kinds of information about tools, data, and their characteristics. In this framework, we are developing a service-oriented support platform called Knowledge Discovery in Databases Virtual Mart. In this paper we discuss the design and implementation of the UDDI service broker, a core element of the platform. We analyze the information needed to describe a tool in our platform, showing the limitations of the present UDDI standard. Then we present our solution to overcome these limitations and to extend the UDDI broker's capabilities.
ISBN: 9781424411436 (print)
In this paper, we introduce a 2-level distributed database architecture combined with the Group Registration (GR) location tracking strategy to be used in 3G wireless networks. With this strategy, the total location management cost is reduced by updating the location of MTs (Mobile Terminals) in an RA with a single route response message to the HSS (Home Subscriber Server). The Location Registration and Call Delivery procedures are presented in detail. An extensive analysis of the numerical results shows that the GR strategy integrated with a 2-level distributed database architecture in 3G networks can achieve a significant reduction of the total cost per call arrival, as well as a reduction of the call delivery latency, compared to the corresponding costs of distributed databases without the GR strategy and of the GR strategy without distributed databases.