ISBN: 9781450351997 (print)
The proceedings contain 6 papers. The topics discussed include: sharing a connectivity via informed proxy selection; message-oriented middleware for edge computing applications; continuous experimentation for software developers; end-to-end regression testing for distributed systems; towards a framework for orchestrated distributed database evaluation in the cloud; toward software updates in IoT environments: why existing P2P systems are not enough; and towards accelerating synchrophasor based linear state estimation of power grid systems.
ISBN: 0769518532 (print)
Cycles and knots in directed graphs are problems that can be associated with deadlocks in database and communication systems. Many algorithms to detect cycles and knots in directed graphs have been proposed. Boukerche and Tropper have proposed a distributed algorithm that solves the problem efficiently. Their algorithm has a message complexity of 2m vs. (at least) 4m for the Chandy and Misra algorithm, where m is the number of links in the graph, and requires O(n log n) bits of memory, where n is the number of nodes. We have implemented Boukerche and Tropper's algorithm according to the construction of processes of the CSP model. Our implementation was done using JCSP, an implementation of CSP for Java, and the results are presented.
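To make the terms concrete, here is a minimal centralized sketch in Python of what it means for a node to lie on a cycle or in a knot. It is not the distributed Boukerche and Tropper algorithm and does not use JCSP. It assumes the usual definition that a node v is in a knot when v has at least one successor and every node reachable from v can reach v back. The function names and the sample graph are illustrative only.

```python
from collections import deque


def reachable(graph, start):
    """Return the set of nodes reachable from start via one or more edges."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    return seen


def on_cycle(graph, v):
    """A node is on a cycle if it can reach itself."""
    return v in reachable(graph, v)


def in_knot(graph, v):
    """A node is in a knot if it can reach something and everything it reaches can reach it back."""
    reach_v = reachable(graph, v)
    return bool(reach_v) and all(v in reachable(graph, u) for u in reach_v)


# Hypothetical example graph as adjacency lists.
# Nodes 2 and 3 form a knot; nodes 4 and 5 form a cycle that can still escape via 4 -> 1.
example = {1: [2], 2: [3], 3: [2], 4: [1, 5], 5: [4]}
```

In the sample graph, node 4 is on a cycle but not in a knot, because it can reach node 1, which cannot reach it back. This distinction is why knots, rather than cycles alone, are used to characterize some kinds of deadlock.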
ISBN: 0818607491 (print)
The goal of checkpointing in database management systems is to save database states on a separate secure device so that the database can be recovered when errors and failures occur. Recently, the possibility of having a checkpointing mechanism which does not interfere with the transaction processing has been studied. The property of noninterference is highly desirable in real-time applications, where restricting transaction activity during the checkpointing operation is in many cases not feasible. The practicality of a noninterfering checkpointing algorithm is studied here by analyzing the extra workload of the algorithm. Noninterfering checkpointing incurs some overhead but increases system availability. For applications where continuous transaction processing is so critical that blocking it for checkpointing is not feasible, it is believed that noninterfering checkpointing provides a practical solution to the problem of constructing globally consistent states in distributed database systems.
ISBN: 0818605669 (print)
An analytic model for estimating the task response in loosely coupled distributed systems is introduced. The model considers such factors as the precedence relationships among software modules, interprocessor communication, interconnection network delay, module scheduling policy, and assignment of modules to computers. Simulation experiments are used to validate the assumptions of the analytic model. Applications of the model to the study of design issues for distributed systems, such as module assignment, precedence relationships, module scheduling policies, and database management algorithms, are discussed.
ISBN: 9781538683019 (print)
The reliability issue in deduplication-based storage systems has not received adequate attention. Existing approaches introduce data redundancy after files have been deduplicated, either by replicating critical data chunks, i.e., chunks with high reference counts, or by applying RAID schemes to unique data chunks, which means that these schemes are based on individual unique data chunks rather than individual files. This can leave individual files vulnerable to losses, particularly in the presence of transient and unrecoverable data chunk errors such as latent sector errors. To address this file reliability issue, this paper proposes a Per-File Parity (PFP) scheme to improve the reliability of deduplication-based storage systems. PFP computes the XOR parity within parity groups of data chunks of each file after the chunking process but before the data chunks are deduplicated. Therefore, PFP can provide parity redundancy protection for all files by intra-file recovery and a higher-level protection for data chunks with high reference counts by inter-file recovery. Our reliability analysis and extensive data-driven, failure-injection based experiments conducted on a prototype implementation of PFP show that PFP significantly outperforms the existing redundancy solutions, DTR and RCR, in system reliability, tolerating multiple data chunk failures and guaranteeing file availability upon multiple data chunk failures. Moreover, a performance evaluation shows that PFP only incurs an average of 5.7% performance degradation to the deduplication-based storage system.
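The XOR parity idea described in this abstract can be illustrated with a short Python sketch. This is not the PFP implementation: it assumes fixed-size, equal-length chunks and a hypothetical `group_size` parameter, and it ignores deduplication and inter-file recovery entirely. It only shows how one parity block per group lets a single lost chunk in that group be rebuilt.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def group_parity(chunks):
    """XOR all chunks of one parity group together (chunks assumed equal length)."""
    parity = bytes(len(chunks[0]))  # all-zero block
    for chunk in chunks:
        parity = xor_bytes(parity, chunk)
    return parity


def file_parities(chunks, group_size):
    """Compute one parity block per consecutive group of a file's chunks."""
    return [
        group_parity(chunks[i:i + group_size])
        for i in range(0, len(chunks), group_size)
    ]


def recover_chunk(group, parity, lost_index):
    """Rebuild the chunk at lost_index from the parity and the surviving chunks."""
    recovered = parity
    for i, chunk in enumerate(group):
        if i != lost_index:
            recovered = xor_bytes(recovered, chunk)
    return recovered


# Hypothetical file split into four 4-byte chunks, two chunks per parity group.
chunks = [b"abcd", b"efgh", b"ijkl", b"mnop"]
parities = file_parities(chunks, group_size=2)

# Simulate losing the second chunk of the first group and rebuilding it.
damaged_group = [chunks[0], None]
restored = recover_chunk(damaged_group, parities[0], lost_index=1)
```

A single XOR parity block tolerates one lost chunk per group, so smaller groups trade more parity storage for tolerance of more simultaneous chunk failures per file.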
A new reliability model is introduced for selecting the best software fault-tolerant (FT) design. This model uses a task graph technique that allows different candidate FT configurations to be analyzed based on the st...
ROSE, a modular distributed operating system that provides support for building reliable applications, is designed and implemented. Failure detection capabilities are provided by a failure detection server. Configuration objects can be used to capture the relationship among multiple processes that cooperate to replicate certain resources. Replicated address space (RAS) objects, whose content is accessible with a high probability despite hardware failures, can be used to increase data availability. Finally, a resistant process (RP) abstraction allows user processes to survive hardware failures with minimal interruption. Two different implementations of RP are provided: one checkpoints the information about its state in an RAS object periodically; the other uses replicated execution by executing the same code in different nodes at the same time.
Software Transactional Memories (STMs) are emerging as a highly attractive programming model, thanks to their ability to mask concurrency management issues from the overlying applications. In this paper we are intereste...
ISBN: 0818622601 (print)
The authors present a novel formal approach to proving the correctness of distributed systems of replicated processes that communicate by message passing. The notion of correctness introduced is based on the consistency of the replicated system with its nonreplicated counterpart. The formal framework of CSP (communicating sequential processes) allows the proof of partial correctness and deadlock-freedom properties of the systems of replicated processes. The authors also discuss how a replicated process may be implemented by N-base copies, a majority of which are non-faulty, and point out the necessity of coordinating the copies and the requirements they should satisfy.
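The role of the non-faulty majority can be illustrated with a small Python sketch of a majority voter over the outputs of N copies. This is an assumption-laden illustration, not the authors' CSP construction: the function name `majority_vote` is hypothetical, and the sketch assumes outputs are hashable values that can be compared for equality.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value reported by a strict majority of copies, or None if there is none."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # A strict majority needs more than half of all copies to agree.
    return value if count > len(outputs) // 2 else None
```

With three copies, one faulty output is outvoted by the two correct ones; with an even split or no agreement, the voter reports no result rather than guessing.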
ISBN: 9781665453462 (digital), 9781665453462 (print)
The problem of replica coordination is fundamental to building Byzantine fault-tolerant (BFT) distributed systems. Seminal BFT architectures for safety-critical real-time systems from the eighties and nineties relied on custom processors and networks, and are hence not readily usable today. Modern-day deployments on cloud platforms do not "scale down" to embedded platforms and are not designed around timeliness. Recent work on real-time BFT protocols focuses on simulations and reliability analyses. In short, there exist no easily programmable BFT libraries that can be conveniently retrofitted onto real-time applications with deadlines and that perform well on embedded platforms. We propose In-ConcReTeS, a BFT key-value store designed for building highly reliable control applications on commodity embedded platforms. At its core, In-ConcReTeS is a real-time friendly redesign and an efficient implementation of a BFT protocol used by seminal fault-tolerant architectures. We evaluated In-ConcReTeS using an inverted pendulum simulation and an automotive benchmark on a cluster of four Raspberry Pis connected over Ethernet. Our results show that, unlike Redis and etcd, In-ConcReTeS can repeatedly synchronize hundreds of key-value pairs, while tolerating faults, every tens of milliseconds.