The proceedings contain 14 papers from the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). Topics discussed include: reference idempotency analysis: a framework for optimizing speculative execution; pointer and escape analysis for multithreaded programs; language support for Morton-order matrices; efficient load balancing for wide-area divide-and-conquer applications; scalable queue-based spin locks with timeout; contention elimination by replication of sequential sections in distributed shared memory programs; and accurate data redistribution cost estimation in software distributed shared memory systems.
ISBN: 0769511538 (print)
This paper presents an efficient recovery scheme based on checkpointing and message logging for mobile computing systems. For the efficient management of checkpoints and message logs, a movement-based scheme is proposed. Mobile hosts that carry their recovery information to the nearby mobile support station can recover instantly after a failure; however, the cost of transferring the recovery information is high. On the other hand, recovery information that remains dispersed over the support stations visited by a mobile host incurs a very high recovery cost. To balance the failure-free operation cost and the recovery cost, the proposed scheme leaves the recovery information of a mobile host at the visited support stations while the host moves within a certain range. Only when the host moves out of that range is the recovery information transferred to a nearby mobile support station. As a result, the proposed scheme can control the information transfer cost as well as the recovery cost.
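The movement-based policy described above can be illustrated with a small sketch. It is not the paper's algorithm: the class name `MobileHost`, the `max_range` threshold, and the assumption that support stations sit on a line (so distance is the difference of station ids) are all hypothetical choices made for illustration.

```python
class MobileHost:
    """Toy model of where a mobile host's recovery information lives."""

    def __init__(self, max_range, home_station):
        self.max_range = max_range        # hypothetical range threshold
        self.home_station = home_station  # station where info was last gathered
        self.current_station = home_station
        self.scattered = {}               # station id -> log entries left there
        self.transfers = 0                # number of consolidations so far

    def log(self, entries=1):
        # Checkpoints and message logs stay at the station serving the host.
        here = self.current_station
        self.scattered[here] = self.scattered.get(here, 0) + entries

    def move_to(self, station):
        self.current_station = station
        # Assumption: distance between stations is |id difference|.
        if abs(station - self.home_station) > self.max_range:
            # Out of range: gather all recovery information at the new station.
            total = sum(self.scattered.values())
            self.scattered = {station: total} if total else {}
            self.home_station = station
            self.transfers += 1

    def stations_to_contact_on_recovery(self):
        # Recovery cost grows with the number of stations holding information.
        return sorted(self.scattered)
```

A larger `max_range` means fewer transfers during failure-free operation but more stations to contact on recovery, which is the trade-off the abstract describes.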
ISBN: 0769514146 (print)
With the great progress of distributed object computing, more and more large systems are built using this technology. Fault tolerance for distributed object computing is therefore a significant research domain. The Object Management Group (OMG) recently published the "Fault Tolerant CORBA Specification V1.0". This specification defines how to achieve fault tolerance for distributed object computing using object groups, and failure detection is one of the key elements of fault management. However, the specification says little about failure detection and leaves many specific details to vendors. In this paper, we propose a simple mechanism for failure detection in distributed object computing. This mechanism is designed to be general rather than application-specific, efficient, and free of any single point of failure. Because the failure detectors themselves may crash during operation, we propose a method to handle this condition and to ensure the "no single point of failure" property. The proposed mechanism has been implemented using CORBA to demonstrate that it works well.
ISBN: 9781581133462 (print)
In shared memory programs, contention often occurs at the transition between a sequential and a parallel section of the code. As all threads start executing the parallel section, they often access data just modified by the thread that executed the sequential section, causing a flurry of data requests to converge on that processor. We address this problem in a software distributed shared memory system by replicating the execution of the sequential sections on all processors. Communication during this replicated sequential execution is reduced by using multicast. We have implemented replicated sequential execution with multicast support in OpenMP/NOW, a version of OpenMP that runs on networks of workstations. We do not rely on compile-time data analysis, and therefore we can handle irregular and pointer-based applications. We show significant improvement for two pointer-based applications that suffer from severe contention without replicated sequential execution.
Electro-magnetic field analysis applications based on the Method of Moments can be used to simulate the emissions for electrical devices such as a printed circuit board, a combination of circuit boards and wire connec...
ISBN: 9781581133462 (print)
Distributing data is one of the key problems in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in programs where data redistribution between computational phases is considered. The global data distribution problem is to find the optimal distribution in multi-phase parallel programs. Solving this problem requires accurate knowledge of data redistribution cost. We are investigating this problem in the context of a software distributed shared memory (SDSM) system, in which obtaining accurate redistribution cost estimates is difficult. This is because SDSM communication is implicit: it depends on access patterns, page locations, and the SDSM consistency protocol. We have developed integrated compile- and run-time analysis for SDSM systems to determine accurate redistribution cost estimates with low overhead. Our resulting system, SUIF-Adapt, can efficiently and accurately estimate execution time, including redistribution, to within 5% of the actual time in all of our test cases, and is often much closer. These precise costs enable SUIF-Adapt to find efficient global data distributions in multiple-phase programs.
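To see why accurate redistribution costs matter, the global data distribution problem can be framed as a shortest-path search over phases. The sketch below is a generic dynamic program, not SUIF-Adapt's method; the function name `best_distribution_plan` and the two cost tables are hypothetical inputs that a cost estimator would have to supply.

```python
def best_distribution_plan(exec_cost, redist_cost):
    """Pick one distribution per phase to minimize total estimated time.

    exec_cost[p][d]:   estimated time of phase p under distribution d.
    redist_cost[a][b]: estimated time to redistribute from a to b
                       (expected to be 0 when a == b).
    Returns (total_cost, [distribution index chosen for each phase]).
    """
    n_dists = len(exec_cost[0])
    best = list(exec_cost[0])  # best[d]: cheapest cost ending phase 0 in d
    back = []                  # back[p-1][d]: predecessor chosen for phase p
    for phase_costs in exec_cost[1:]:
        new_best, choices = [], []
        for d in range(n_dists):
            # Cheapest way to arrive in distribution d for this phase.
            prev = min(range(n_dists), key=lambda a: best[a] + redist_cost[a][d])
            new_best.append(best[prev] + redist_cost[prev][d] + phase_costs[d])
            choices.append(prev)
        best = new_best
        back.append(choices)
    # Walk the predecessor table backwards to recover the plan.
    last = min(range(n_dists), key=lambda d: best[d])
    plan = [last]
    for choices in reversed(back):
        plan.append(choices[plan[-1]])
    plan.reverse()
    return best[last], plan
```

With execution costs `[[10, 4], [3, 9], [8, 2]]`, a redistribution cost of 5 makes it cheapest to stay in distribution 1 throughout, while a cost of 1 makes switching for the middle phase worthwhile. A misestimated redistribution cost can therefore change the chosen plan.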
ISBN: 0769514146 (print)
The detection of process failures is a crucial problem that system designers have to cope with in order to build fault-tolerant distributed platforms. Unfortunately, it is impossible to distinguish with certainty a crashed process from a very slow process in a purely asynchronous distributed system. This prevents some problems from being solved in such systems, which is why failure detector oracles have been introduced to circumvent these impossibility results. This paper presents a relatively simple protocol that allows a process to "monitor" another process, and consequently to detect its crash. The protocol has the nice property of relying as much as possible on application messages to do this monitoring. Unlike previous process crash detection protocols, it uses control messages only when no application message is sent by the monitoring process to the observed process. The protocol has noteworthy features. When the underlying system satisfies the partial synchrony assumption, it actually implements an eventually perfect failure detector (i.e., a failure detector of the class usually denoted ◇P). Moreover, if the average observed transmission delay is finite and the upper-layer application terminates within a bounded number of steps for any failure detector in ◇P after the failure detector becomes "perfect", then, when run with the proposed protocol, it also terminates correctly. These properties make the protocol attractive: it is inexpensive, implementable, and powerful. The paper also describes performance measurements of an implementation of the protocol.
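The idea of letting application traffic stand in for control messages can be sketched as follows. This is a simplified illustration and not the paper's protocol: the class `Monitor`, its parameters, and the timeout-doubling rule (a common way to approximate ◇P under partial synchrony) are assumptions, and time is passed in explicitly so the example stays deterministic.

```python
class Monitor:
    """Monitors one observed process from the monitoring process's side."""

    def __init__(self, ping_interval=1.0, timeout=3.0):
        self.ping_interval = ping_interval  # max silence before a control ping
        self.timeout = timeout              # silence tolerated before suspicion
        self.last_sent = 0.0                # last message sent to the observed process
        self.last_heard = 0.0               # last message received from it
        self.suspected = False
        self.control_messages = 0           # pings that application traffic did not cover

    def on_application_send(self, now):
        # An application message doubles as a liveness probe.
        self.last_sent = now

    def on_receive(self, now):
        # Any message from the observed process proves it was alive.
        self.last_heard = now
        if self.suspected:
            # False suspicion: withdraw it and be more patient from now on.
            self.suspected = False
            self.timeout *= 2

    def tick(self, now):
        # Send a control message only if no recent message covered the interval.
        if now - self.last_sent >= self.ping_interval:
            self.control_messages += 1
            self.last_sent = now
        if now - self.last_heard > self.timeout:
            self.suspected = True
        return self.suspected
```

While the application keeps sending to the observed process, `control_messages` stays at zero; pings appear only during quiet periods.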
We have designed and implemented a lightweight process (thread) library called "Lesser Bear" for SMP computers. Lesser Bear has high portability and thread-level parallelism. Lesser Bear executes threads in parallel by creating UNIX processes as virtual processors and a memory-mapped file as a huge shared-memory space. To schedule threads in parallel, the shared-memory space has been divided into working spaces for each virtual processor, and the ready queue has been distributed. However, the previous version of Lesser Bear sometimes requires a lock operation for dequeueing. We therefore propose a scheduling mechanism that does not require a lock operation. To achieve this, the divided spaces are linked into a ring topology through the queues, and we use a lock-free algorithm for the queue operations. This mechanism is applied to Lesser Bear and evaluated experimentally.
In this paper we propose a solution to two problems induced by the growing complexity of machine vision systems: (i) the need for a robust, open and flexible framework to control various descriptive and operational knowledge; (ii) the need for an architecture that offers parallel processing and can be easily scaled to evolving underlying hardware. We propose an agent society, implemented in the Java language, that is organized as an irregular pyramid for two reasons: (i) an agent provides an abstraction to encapsulate reactive or cognitive processing; (ii) the pyramid offers a formal graph-based approach to ensure global and distributed goal satisfaction. The evaluation of the architecture, performed on an X scanner breast image, shows good-quality results and parallel processing abilities.
ISBN: 0769514146 (print)
Certain gateways (e.g., some cable or DSL modems) are known to have low reliability and low availability. Most failures of these devices can, however, be "fixed" by rejuvenating the device after a failure has been detected. Such a detection-based rejuvenation strategy increases the availability of these gateways. In the considered scenario, rejuvenation is non-trivial since a failure of such a gateway leaves it partitioned away from the network. In particular, network operators who want to rejuvenate these gateways are in a different network partition and therefore cannot initiate a remote rejuvenation. In this paper we propose a failure-detection-based rejuvenation service and a remote detection service. The rejuvenation service detects and fixes "soft" failures automatically (in one partition), and the detection service detects (in another partition) all rejuvenations exactly once within a bounded amount of time, even when the gateway is rejuvenated consecutively. The detection service also allows the detection of "hard" failures and the filtering of notifications of soft failures.
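One way to obtain the "exactly once" property, even across consecutive rejuvenations, is to have the gateway report a monotonically increasing rejuvenation count once it is reachable again. The abstract does not say how the paper achieves this, so the sketch below is purely an assumption: the class `RemoteDetector` and the persistent counter it relies on are hypothetical.

```python
class RemoteDetector:
    """Runs in the operator's partition; turns counter reports into events."""

    def __init__(self):
        self.last_seen = 0  # highest rejuvenation number already reported

    def on_report(self, counter):
        """Handle a report carrying the gateway's total rejuvenation count.

        Returns the rejuvenation numbers that are new, so each one is
        surfaced exactly once even if reports are duplicated, arrive late,
        or cover several back-to-back rejuvenations.
        """
        new_events = list(range(self.last_seen + 1, counter + 1))
        self.last_seen = max(self.last_seen, counter)
        return new_events
```

For example, reports carrying counts 1, 1, 4, 3 yield `[1]`, `[]`, `[2, 3, 4]`, `[]`: the duplicate and the stale report produce no notifications, and the three consecutive rejuvenations are each reported once.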