A high-speed trainset consists of one or more power cars and trailer cars. Each car has its own processor, and the processors communicate over a network in the train. For dependability, most of the data (objects) in memory are replicated on each processor. A simple and portable way to design applications for such a distributed system is the distributed shared memory (DSM) paradigm. An evaluation model has been implemented on a group of Sun workstations as a set of hierarchical software layers that offer a DSM with two consistency criteria: sequential consistency and causal consistency. On top of this DSM, the application layer implements a typical piece of train application software, namely door management. In this application, the focus was on portability, dependability, and safety.
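The difference between the two consistency criteria is easiest to see in code. The following is a minimal sketch of causal consistency only, assuming replicated objects are updated by broadcasting writes tagged with vector clocks; the abstract does not say how the evaluation model implements it, and the `Replica` class and its methods are hypothetical names introduced for illustration.

```python
class Replica:
    """One processor's copy of the replicated objects (hypothetical sketch)."""

    def __init__(self, pid, n_replicas):
        self.pid = pid
        self.clock = [0] * n_replicas   # writes applied so far, per replica
        self.store = {}                 # local copy of the shared objects
        self.pending = []               # remote writes not yet deliverable

    def write(self, key, value):
        """Apply a write locally and return the message to broadcast."""
        self.clock[self.pid] += 1
        self.store[key] = value
        return (self.pid, list(self.clock), key, value)

    def _deliverable(self, sender, stamp):
        # The write must be the next one from its sender, and every write
        # the sender had seen from other replicas must already be applied.
        if stamp[sender] != self.clock[sender] + 1:
            return False
        return all(stamp[k] <= self.clock[k]
                   for k in range(len(stamp)) if k != sender)

    def receive(self, message):
        """Buffer a remote write and apply every write that is now ready."""
        self.pending.append(message)
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                sender, stamp, key, value = msg
                if self._deliverable(sender, stamp):
                    self.store[key] = value
                    self.clock[sender] = stamp[sender]
                    self.pending.remove(msg)
                    progress = True


# Usage: car 1 writes "lock" only after seeing car 0's write to "door".
car0, car1, car2 = Replica(0, 3), Replica(1, 3), Replica(2, 3)
m1 = car0.write("door", "closed")
car1.receive(m1)
m2 = car1.write("lock", "engaged")
car2.receive(m2)                      # arrives early: held back
held_back = "lock" not in car2.store
car2.receive(m1)                      # now both writes can be applied
```

Causal consistency only orders writes that are causally related; concurrent writes may be seen in different orders by different replicas. Sequential consistency would instead require all replicas to agree on a single total order of writes, which typically costs more communication.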
This paper introduces an optically interconnected distributed shared memory system. The distributed shared memory approach integrates ideas from both shared-memory and distributed-memory systems to extract the strengths of each while balancing their respective weaknesses. The system is based on a photonic network in order to meet its high communication requirement. It employs wavelength division multiple access on the photonic network, enabling multiple channels to be formed on a single optical fiber. One result of the high communication capacity is a simplification of the global address mapping problem, and a simplified uniform address allocation scheme is introduced. The advantages of the proposed approach are examined through a performance analysis based on a closed queueing network, which has been validated through extensive simulation. The performance of the system is evaluated in terms of the transaction time of a memory request and the system throughput. The impact of variations in the number of channels and processors on these metrics is studied. The effect of variations in memory and channel service times is also evaluated.
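As an illustration of how a closed queueing network yields transaction time and throughput, here is a minimal sketch of exact Mean Value Analysis (MVA) for a single-class closed network of queueing stations. This is a standard textbook technique, not necessarily the model used in the paper; the function name `mva` and the example service demands are assumptions made for illustration.

```python
def mva(service_demands, n_customers, think_time=0.0):
    """Exact MVA for a single-class closed network (illustrative sketch).

    service_demands: mean service demand per visit cycle at each station
                     (for example one channel and one memory module).
    n_customers:     number of circulating requests (for example processors).
    Returns (throughput, response_time) at the full population.
    """
    queue = [0.0] * len(service_demands)   # mean queue lengths at population 0
    throughput = 0.0
    response = 0.0
    for n in range(1, n_customers + 1):
        # An arriving request sees the queue of the network with n-1 requests.
        residence = [d * (1.0 + q) for d, q in zip(service_demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        # Little's law gives the new mean queue length at each station.
        queue = [throughput * r for r in residence]
    return throughput, response


# Usage with assumed demands: 0.2 time units on a channel, 0.1 on memory.
for processors in (1, 4, 16):
    x, r = mva([0.2, 0.1], processors)
    print(processors, round(x, 3), round(r, 3))
```

Throughput saturates at the reciprocal of the largest service demand, which is why adding channels (reducing the demand per channel) or speeding up memory shifts the bottleneck.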
The Adapteva Epiphany many-core architecture comprises a scalable 2D mesh Network-on-Chip (NoC) of low-power RISC cores with minimal uncore functionality. Although such a processor offers high computational energy efficiency and parallel scalability, developing effective programming models that address its unique architectural features has presented many challenges. We present a distributed shared memory (DSM) model that is supported transparently in software using C++ template metaprogramming techniques. The approach offers an extremely simple parallel programming model that is well suited to the architecture. Initial results demonstrate the approach and provide insight into the efficiency of the programming model and into the ability of the NoC to support a DSM without explicit control over data movement and localization.
We have developed a transaction-based approach to distributed shared memory (DSM) that supports object caching and generates path expression prefetches. A path expression specifies a path through the heap that traverses the objects to be prefetched. To our knowledge, this is the first prefetching approach that can prefetch objects whose addresses have not been computed or predicted. Our DSM uses both prefetching and caching of remote objects to hide network latency, while relying on a two-phase transaction commit mechanism to preserve the simple transactional consistency model that we present to the developer. We have evaluated this approach on a matrix multiply benchmark and found that it makes effective use of multiple machines in a cluster and benefits from prefetching and caching of objects.
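The idea of a path expression prefetch can be sketched as follows. This is a simplified illustration, assuming the heap is a dictionary from object ids to field dictionaries and that a path expression is a root id plus a list of field names; the function `evaluate_path` and the example heap layout are hypothetical and are not taken from the paper.

```python
def evaluate_path(heap, root_id, fields):
    """Collect every object reached by following `fields` from `root_id`.

    Runs on the machine that holds the objects, so the requester never
    needs to know the addresses of the intermediate objects in advance.
    A field may hold a single object id or a list of ids (an array).
    """
    batch = {root_id: heap[root_id]}
    frontier = [root_id]
    for field in fields:
        next_frontier = []
        for oid in frontier:
            refs = heap[oid].get(field)
            if refs is None:
                continue
            if not isinstance(refs, list):
                refs = [refs]
            for ref in refs:
                if ref in heap:
                    batch[ref] = heap[ref]
                    if ref not in next_frontier:
                        next_frontier.append(ref)
        frontier = next_frontier
    return batch


# Usage: a matrix object (id 1) holds two rows, each pointing to its data.
remote_heap = {
    1: {"rows": [2, 3]},
    2: {"data": 4},
    3: {"data": 5},
    4: {"values": [1.0, 2.0]},
    5: {"values": [3.0, 4.0]},
    6: {"values": [9.0]},          # unrelated object, not on the path
}
cache = evaluate_path(remote_heap, 1, ["rows", "data"])
```

One request carrying the expression `rows.data` returns the matrix, its rows, and their data in a single round trip, instead of one round trip per pointer dereference.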
The wide diffusion of multi-core machines has pushed research in the field of Parallel Discrete Event Simulation (PDES) toward new programming paradigms based on the exploitation of shared memory. At the same time, the advent of Cloud computing, and the possibility of grouping together many (low-cost) virtual machines to form a distributed-memory cluster capable of hosting simulation applications, has raised the need to bridge shared-memory programming and seamless distributed execution. In this article, we present the design of a distributed middleware that transparently allows a PDES application coded for shared-memory systems to run on clusters of (Cloud) resources. Our middleware is based on a synchronization protocol called Event and Cross State Synchronization. It allows event handlers to access the state of other simulation objects, and is thus a powerful tool for developing various types of PDES applications. We also provide data for an experimental assessment of our middleware architecture, which has been integrated into the open source ROOT-Sim speculative PDES platform.
This paper proposes a novel View-based Consistency model for distributed shared memory. A view is a set of ordinary data objects that a processor has the right to access in a data-race-free program. The View-based Consistency model requires only that the data objects of a view are brought up to date before a processor accesses them. Compared with other memory consistency models, the View-based Consistency model achieves data selection without user annotation and further reduces the effect of false sharing.
Building on uniform addressing and direct localization of the network address space, the approach adopts multithreading, multi-copy storage, block storage in pages, and parallel invalidation of page-tree sub-nodes. Together these methods can improve the efficiency and parallel processing of the DSM system, which is significant for improving the performance of distributed systems.
This paper proposes a parallel file system model for network of workstations (NOW) environments. In line with the features of NOWs, the system incorporates the mechanism of distributed shared memory, particularly that of COMA (cache-only memory architecture). It links the memory of all nodes into one large cache; each node aggressively uses not only its local memory but also the remote memory of other nodes, which speeds up data accesses dramatically. It also accesses disks in parallel to improve I/O performance. Furthermore, in our model, data are shared naturally and conveniently among nodes, in contrast to traditional parallel file systems. The paper presents the architecture of the parallel file system and its detailed implementation, including file read and write, data replacement, file data dissipation, and parallel file read and write.
Distributed shared memory (DSM) is an abstraction of shared memory on a distributed-memory machine. Hardware DSM systems support this abstraction at the architecture level; software DSM systems support it within the runtime system. One of the key problems in building an efficient software DSM system is reducing the amount of communication needed to keep the distributed memories consistent. In this article we present four techniques for doing so: software release consistency, multiple consistency protocols, write-shared protocols, and an update-with-timeout mechanism. These techniques have been implemented in the Munin DSM system. We compare the performance of seven Munin application programs, first to their performance when implemented using message passing, and then to their performance when running on a conventional software DSM system that does not embody the preceding techniques. On a 16-processor cluster of workstations, Munin's performance is within 5% of message passing for four of the seven applications. For the other three, performance is within 29 to 33%. Detailed analysis of two of these three applications indicates that adding a function-shipping capability would bring their performance to within 7% of the message-passing performance. Compared to a conventional DSM system, Munin achieves performance improvements ranging from a few percent to several hundred percent, depending on the application.
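A write-shared protocol lets several processors modify different parts of the same page concurrently and merges their changes later. The sketch below shows the twin-and-diff idea commonly used for this in software DSM systems; the abstract gives no implementation details, so the byte-level granularity and the helper names `make_twin`, `compute_diff`, and `apply_diff` are assumptions made for illustration.

```python
def make_twin(page):
    """Save a pristine copy of the page before the first local write."""
    return bytearray(page)


def compute_diff(twin, page):
    """Record only the bytes that changed since the twin was made."""
    return {i: page[i] for i in range(len(page)) if page[i] != twin[i]}


def apply_diff(page, diff):
    """Merge another writer's changes into a copy of the page."""
    for offset, value in diff.items():
        page[offset] = value


# Usage: two writers update disjoint bytes of the same 8-byte page.
home_copy = bytearray(8)

page_a = bytearray(home_copy)
twin_a = make_twin(page_a)
page_a[0] = 1                      # writer A touches byte 0

page_b = bytearray(home_copy)
twin_b = make_twin(page_b)
page_b[7] = 9                      # writer B touches byte 7

# At a release point each writer sends only its diff, not the whole page.
diff_a = compute_diff(twin_a, page_a)
diff_b = compute_diff(twin_b, page_b)
apply_diff(home_copy, diff_a)
apply_diff(home_copy, diff_b)
```

Because each diff contains only the modified bytes, the two writers do not overwrite each other's updates and the page never has to ping-pong between them, which is how this approach reduces the cost of false sharing.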
This paper compares homeless and home-based Lazy Release Consistency (LRC) protocols, which are used to implement distributed shared memory (DSM) in cluster computing. We present a performance evaluation of parallel applications running on homeless and home-based LRC protocols, comparing TreadMarks, which uses a homeless LRC protocol, with our home-based DSM system. We found that the home-based DSM system scaled better than TreadMarks in the parallel applications we tested. The poor scalability of the homeless protocol is caused by a hot spot and by garbage collection, and we show that these factors do not affect the scalability of the home-based protocol.