The symposium materials contain 24 papers. The topics covered include local area networks as distributed systems;operating systems research;scalable parallel computing;atomic snapshots;radio networks;fast deflection r...
详细信息
ISBN:
(纸本)0897916131
The symposium materials contain 24 papers. The topics covered include local area networks as distributed systems;operating systems research;scalable parallel computing;atomic snapshots;radio networks;fast deflection routing for packets and worms;parallel computing models;wait-free clock synchronization;optimal clock synchronization;partially synchronized clocks;resource bounds and combinations of consensus objects;scalable synchronization with minimal hardware support;adaptive solutions to the mutual exclusion problem;distributed fingerprints and secure information dispersal;replicated files;space complexity of randomized synchronization;lower bound on wait-free counting;and knowledge-oriented programming.
AN2 is a switch-based local area network under development at the Digital Equipment Corporation's Systems Research Center. A network is typically viewed as part of the infrastructure that enables distributed compu...
详细信息
ISBN:
(纸本)0897916131
AN2 is a switch-based local area network under development at the Digital Equipment Corporation's Systems Research Center. A network is typically viewed as part of the infrastructure that enables distributedcomputing. But AN2 and its predecessor AN1 are actually distributed systems in their own right. Hardware and software at the switches operate on local data and cooperate with parallel activities at other switches to manage the network and transmit data efficiently. Network components may fail, so fault tolerant operation is essential. Thus many of the issues and techniques of more traditional distributed systems are relevant for AN2. This paper describes aspects of the AN2 design where the distributed system model is especially appropriate and identifies areas where further work is needed.
The atomic snapshot object is an important primitive used for the design and verification of wait-free algorithms in shared memory distributed systems. A snapshot object is a shared data structure partitioned into seg...
详细信息
ISBN:
(纸本)0897916131
The atomic snapshot object is an important primitive used for the design and verification of wait-free algorithms in shared memory distributed systems. A snapshot object is a shared data structure partitioned into segments. Processors can either update an individual segment, or instantaneously scan all segments of the object. This paper presents an efficient implementation of an atomic snapshot object, in which each high level operation (scan or update) requires O(n log n) low level operations on atomic read/write registers.
In this paper we combine two previously disparate aspects of reliable distributedcomputing - self-stabilization, i.e., tolerance of systemic failures, and fault-tolerance, i.e., tolerance of process failures. We defi...
详细信息
ISBN:
(纸本)0897916131
In this paper we combine two previously disparate aspects of reliable distributedcomputing - self-stabilization, i.e., tolerance of systemic failures, and fault-tolerance, i.e., tolerance of process failures. We define what it means for a protocol to solve a problem while tolerating both types of failures and demonstrate a `compiler' that transforms a process failure-tolerant protocol for a synchronous system into a process and systemic failure-tolerant protocol. For asynchronous systems, we present a protocol that solves a crucial problem (Consensus) while tolerating both process and systemic failures.
Files are often replicated on several sites in a distributed system for efficient and fast accessibility purposes. Due to failures, some pages may et corrupted. Various researches have proposed solutions for identifyi...
详细信息
ISBN:
(纸本)0897916131
Files are often replicated on several sites in a distributed system for efficient and fast accessibility purposes. Due to failures, some pages may et corrupted. Various researches have proposed solutions for identifying the corrupted pages between two copies of a file. In this paper, we explore the more general case of identifying all corrupted pages among a set of copies larger than two. We develop an efficient and optimal solution for a coordinator-based model, where a single site acts as the coordinator and identifies all corrupted pages in all the copies.
The design and use of scalable parallel computers is discussed. Much of the knowledge gained in the study of distributd systems is relevant, although it needs to be used in new manners. The author argues that the chal...
详细信息
ISBN:
(纸本)0897916131
The design and use of scalable parallel computers is discussed. Much of the knowledge gained in the study of distributd systems is relevant, although it needs to be used in new manners. The author argues that the challenge is to achieve maximum reuse of existing technologies, microprocessor architectures, I/O systems, programming languages, operating systems, application codes, without compromising too much parallel performance and scalability. Such skillful compromises are essential to the success of the parallel processing industry, and pose even more intellectual challenges than the 'start from scratch' approach.
This paper presents a systematic, modular technique for transforming a large class of unbounded shared-memory algorithms into bounded algorithms. We show that any unbounded algorithm based on a certain asynchronous ro...
详细信息
ISBN:
(纸本)0897916131
This paper presents a systematic, modular technique for transforming a large class of unbounded shared-memory algorithms into bounded algorithms. We show that any unbounded algorithm based on a certain asynchronous rounds structure can be `compiled' into a bounded algorithm in a way that preserves correctness and running time. As evidence that the asynchronous rounds structure is natural for wait-free algorithms, we identify a number of unbounded algorithms in the literature to which our transformation can either be applied directly, or applied after minor modifications. The running times of the resulting algorithms match these of their unbounded counterparts and hence in most cases the resulting algorithms are faster than any other known bounded solutions to the corresponding problems. In particular, we get a bounded consensus algorithm whose running time is O(n log2 n) and a bounded snapshot scan algorithm whose running time is O(n log n).
Much of modern systems programming involves designing algorithms for distributed systems in which the nodes have access to information about time. Time information can be used to estimate the time at which system or e...
详细信息
ISBN:
(纸本)0897916131
Much of modern systems programming involves designing algorithms for distributed systems in which the nodes have access to information about time. Time information can be used to estimate the time at which system or environment events occur, to detect process failures, to schedule the use of resources, and to synchronize activities of different system components. In this paper we propose a simple programming model that is based on the timed automaton model of Lynch and Vaandrager, which gives algorithms direct access to perfectly accurate time information. Unfortunately, this programming model is not realistic. In a realistic distributed system, clocks have skew and a finite granularity. Furthermore, other details neglected by the timed automaton model such as processor step times must also be considered. We provide two simulations that show how to transform an algorithm designed in the simple programming model to run in a more realistic distributed system. One of our simulations is an extension of previous results on the use of inaccurate clocks by Lamport, Neiger and Toueg, and Welch. Our extensions suggest several powerful design techniques for algorithms that are to be run in distributed systems with clocks whose divergence from real time is bounded. We demonstrate these techniques by providing a new algorithm for distributed linearizable read-write objects. This algorithm significantly improves over previous results in terms of time complexity and algorithmic simplicity.
The symposium materials contain 21 papers. The topics covered include file systems, distributed systems and networking, paradigms for structuring systems, inter-process communication, performance issues, mobile comput...
详细信息
ISBN:
(纸本)0897916328
The symposium materials contain 21 papers. The topics covered include file systems, distributed systems and networking, paradigms for structuring systems, inter-process communication, performance issues, mobile computing, memory management, and computer system security.
暂无评论