We presents a novel abstract individual-process crash-recovery model for non-volatile memory, which enables modularity, so that complex recoverable objects can be constructed in a modular manner from simpler recoverab...
详细信息
ISBN:
(纸本)9781450357951
We presents a novel abstract individual-process crash-recovery model for non-volatile memory, which enables modularity, so that complex recoverable objects can be constructed in a modular manner from simpler recoverable base objects. Within the framework of this model, we define nesting-safe recoverable linearizability (NRL) - a novel correctness condition that captures the requirements for nesting recoverable objects. Informally, NRL allows the recovery code to extend the interval of the failed operation until the recovery code succeeds to complete (possibly after multiple failures and recovery attempts). Unlike previous correctness definitions, the NRL condition implies that, following recovery, an implemented (higher-level) recoverable operation is able to complete its invocation of a base-object operation and obtain its response. We present algorithms for nesting-safe recoverable primitives, namely, recoverable versions of widely-used primitive shared-memory operations such as read, write, test-and-set and compare-and-swap, which can be used to implement higher-level recoverable objects. We then exemplify how these recoverable base objects can be used for constructing a recoverable counter object. Finally, we prove an impossibility result on wait-free implementations of recoverable test-and-set (TAS) objects from read, write and TAS operations, thus demonstrating that our model also facilitates rigorous analysis of the limitations of recoverable concurrent objects.
Recoverable mutual exclusion (RME) is a variation on the classic mutual exclusion (ME) problem that allows processes to crash and recover. The time complexity of RME algorithms is quantified in the same way as for ME,...
详细信息
ISBN:
(纸本)9781450357951
Recoverable mutual exclusion (RME) is a variation on the classic mutual exclusion (ME) problem that allows processes to crash and recover. The time complexity of RME algorithms is quantified in the same way as for ME, namely by counting remote memory references-expensive memory operations that traverse the processor-to-memory interconnect. Prior work has established that the RMR complexity of the RME problem for n processes is T(logn) for the class of algorithms that use read/write registers and single-word comparison primitives such as Compare-And-Swap (Golab and Ramaraju 2016), O(logn/log logn) for the class of algorithms that use read/write registers and additional single-word read-modify-primitives such as Fetch-And-Store (Golab and Hendler 2017), and T(1) for the class of algorithms that use read/write registers and specialized double-word read-modify-write primitives (Golab and Hendler 2017). These complexity bounds hold in a model of computation where processes may fail independently, and where a process that fails while accessing the mutex is required to recover eventually. This body of work leaves open two important questions: (i) what is the tight bound on the RMR complexity of RME for the class of algorithms that use read/write registers and commonly supported single-word read-modify-primitives;and (ii) how is the RMR complexity of RME affected by variations in the failure model? This paper answers both questions partially by showing that RME can be solved using O(1) RMRs per passage in the worst case in a model where failures are system-wide (i.e., all processes crash simultaneously), and processes receive additional information from the environment regarding the occurrence of the failure. The upper bound algorithm we present relies crucially on a novel RMR-efficient barrier that processes use to synchronize recovery actions after each failure. The barrier uses read/write registers and single-word Compare-And-Swap only. Additionally, we present a transfo
A long-standing open question has been whether lock-freedom and wait-freedom are fundamentally different progress conditions, namely, can the former be provided in situations where the latter cannot? This paper answer...
详细信息
ISBN:
(纸本)9781450357951
A long-standing open question has been whether lock-freedom and wait-freedom are fundamentally different progress conditions, namely, can the former be provided in situations where the latter cannot? This paper answers the question in the affirmative, by proving that there are objects with lock-free implementations, but without wait-free implementations-using objects of any finite power. We precisely define an object called n-process long-lived approximate agreement (n-LLAA), in which two sets of processes associated with two sides, 0 or 1, need to decide on a sequence of increasingly closer outputs. We prove that 2-LLAA has a lock-free implementation using reads and writes only, while n-LLAA has a lock-free implementation using reads, writes and (n - 1)-process consensus objects. In contrast, we prove that there is no wait-free implementation of the n-LLAA object using reads, writes and specific (n - 1)-process consensus objects, called (n - 1)-window registers.
In light of recent advances in non-volatile main memory technology, there has been a flurry of research in designing algorithms that are resilient to process crashes. As a result of main memory non-volatility, a proce...
详细信息
ISBN:
(纸本)9783030312770;9783030312763
In light of recent advances in non-volatile main memory technology, there has been a flurry of research in designing algorithms that are resilient to process crashes. As a result of main memory non-volatility, a process is allowed to crash any time during the execution, without affecting the state of the data stored in the main memory. With the assumption that a process eventually restarts after a crash, prior works have focused on designing mutual exclusion algorithms that use the non-volatile main memory to recover from such crashes. Such mutual exclusion algorithms that provide multiple processes with a mutually exclusive access to a shared resource in the presence of process crashes are called Recoverable Mutual Exclusion (RME) algorithms. We present the first RME algorithm where a process has the ability to abort executing the algorithm, if it decides to give up its request for a shared resource before being granted access to that resource. With n being the maximum number of processes for which the algorithm is designed, in the absence of a crash our algorithm guarantees a worst-case remote memory references (RMR) complexity of O(log n) per passage on the Distributed Shared Memory (DSM) machines, and a complexity of O(log n) or O(n) on Cache Coherent (CC) machines, depending on how caches are managed.
暂无评论