ISBN: 9781450384469 (print)
A distributed shared memory abstraction can coordinate a cluster of machine nodes to give performance-critical queries scalable memory space and abundant parallelism. To deploy a query under such an abstraction, however, the general execution model simply expresses operators as multiple subtasks and schedules them one after another for parallel execution, neglecting the vital dependencies between subtasks and data. In this paper, we study in depth the issues caused by ignoring these dependencies (i.e., low CPU utilization and poor data locality), and then propose a dependency-aware query execution model called Jasmine, which can (i) help users explicitly declare the dependencies and (ii) take the declared dependencies into account during execution to address the issues. We invite our audience to use the rich graphical interfaces to interact with Jasmine and explore dependency-aware query execution on distributed shared memory.
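The idea of declaring dependencies between subtasks and letting the scheduler respect them can be illustrated with a minimal sketch. This is not Jasmine's interface, which the abstract does not describe; the subtask names and the `schedule_in_waves` helper are hypothetical, and the standard-library `graphlib` module stands in for a real scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical subtasks of a query, each mapped to the subtasks it depends on.
dependencies = {
    "scan_orders": set(),
    "scan_customers": set(),
    "join": {"scan_orders", "scan_customers"},
    "aggregate": {"join"},
}


def schedule_in_waves(deps):
    """Group subtasks into waves; subtasks in the same wave may run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # get_ready() returns only subtasks whose dependencies are all done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = schedule_in_waves(dependencies)
for number, wave in enumerate(waves):
    print(number, wave)
```

With the declarations above, the two scans land in the first wave, and the join and the aggregation follow only once their inputs exist.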
ISBN: 9780769545769 (print)
We present core elements of Samhita, a new user-level software distributed shared memory (DSM) system. Our work is motivated by two observations. First, the rise of many-core architectures is producing a growing emphasis on threaded codes to achieve performance. Second, architectural trends, especially in high-performance interconnects, suggest a new look at overcoming the bottlenecks that have hindered DSM performance. Samhita leverages the capabilities of remote direct memory access (RDMA) interconnects, and views the problem of providing a shared global address space as a cache management problem. Performance results on two 256-processor clusters demonstrate scalability on microbenchmarks and two real applications. The results are the largest-scale tests and achieve the highest performance of any DSM system reported to date.
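Treating a shared global address space as a cache management problem can be pictured as a local cache of remote pages. The sketch below is only an illustration under stated assumptions: the `PageCache` class, the least-recently-used eviction policy, and the dictionary standing in for remote memory are all hypothetical, since the abstract does not say which policy Samhita uses.

```python
from collections import OrderedDict


class PageCache:
    """Toy local cache of remote pages with least-recently-used eviction."""

    def __init__(self, capacity, fetch_remote):
        self.capacity = capacity
        self.fetch_remote = fetch_remote  # stand-in for a remote (e.g. RDMA) read
        self.pages = OrderedDict()
        self.misses = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        self.misses += 1
        data = self.fetch_remote(page_id)
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return data


# Hypothetical remote memory holding four pages.
remote_memory = {n: f"page-{n}" for n in range(4)}
cache = PageCache(capacity=2, fetch_remote=remote_memory.__getitem__)
for page_id in [0, 1, 0, 2, 1]:
    cache.read(page_id)
print(cache.misses, list(cache.pages))
```

Every miss corresponds to a remote fetch, so the miss count is a rough proxy for interconnect traffic in this toy model.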
ISBN: 3540653880 (print)
The two most commonly used classifications of reference locality are temporal locality and spatial locality. This paper introduces a new class of reference locality, called Regional Locality: the program behavior in which a set of addresses accessed in one critical or non-critical region is very likely to be accessed as a whole in the same critical region or in other non-critical regions. We propose three update propagation protocols based on Regional Locality for distributed shared memory systems: the Selective Lazy/Eager Updates Propagation protocol, the First Hit Updates Propagation protocol, and the Second Hit Updates Propagation protocol. Our experimental results indicate that Regional Locality exists in the executions of many concurrent distributed shared memory programs, and that the proposed protocols outperform existing update propagation protocols based on temporal locality. Exploring Regional Locality in other shared memory systems would be an interesting direction for future research.
The paper presents distributed shared memory (DSM) for embedded control systems with CAN (Controller Area Network), which is widely used in various control *** provides a location-transparent environment, in which distributed tasks exchange data through the *** are a few DSM for distributed embedded control systems, which, however, do not provide efficient inter-node mutual *** paper presents DSM with prioritized mutual exclusion, which is efficiently implemented using the CAN arbitration *** DSM is based on a multiple reader/multiple writer model and supports entry *** mutual exclusion also supports the multiprocessor priority ceiling protocol to avoid deadlocks and reduce the blocking *** have built the DSM in an RTOS, which is an extension to an OSEK *** have also evaluated the performance of the DSM and we think that the performance is acceptable for practical embedded control systems.
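CAN resolves simultaneous transmissions by bitwise arbitration: identifiers are sent most significant bit first, a dominant bit (0) overrides a recessive bit (1), and so the lowest identifier wins. The sketch below shows how such arbitration could pick a lock owner by priority. The mapping from lock requests to identifiers and the task names are hypothetical; the abstract does not give the paper's actual encoding.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that wins CAN bitwise arbitration.

    Bits are compared most significant first; a dominant bit (0) overrides
    a recessive bit (1), so the numerically lowest identifier wins.
    """
    contenders = set(identifiers)
    for bit in reversed(range(width)):
        bus_level = min((ident >> bit) & 1 for ident in contenders)
        # Nodes that sent recessive (1) but observe dominant (0) back off.
        contenders = {i for i in contenders if (i >> bit) & 1 == bus_level}
    (winner,) = contenders
    return winner


# Hypothetical lock requests from three tasks, encoded so that a higher
# task priority corresponds to a lower 11-bit identifier.
requests = {0x120: "task_low", 0x0A5: "task_high", 0x0F0: "task_mid"}
lock_owner = requests[arbitrate(requests)]
print(lock_owner)
```

Because the bus itself selects the winner, no extra message exchange is needed to order competing requests in this simplified model.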
ISBN: 9783030612177; 9783030612184 (print)
Edge computing offers access to largely unused computational resources without the added cost of the latency between the user and the cloud. To take advantage of it, we designed and implemented an abstraction layer, compatible with standard JavaScript, that builds a distributed shared memory on top of any existing web browser, such as those found on smartphones or tablets, and a cloud server. This lets developers reuse existing application code and enhance it by enabling collaboration between those devices. The synchronization mechanism supports mixed consistency, preferring eventual consistency but providing stronger serializability when required, so that developers can tune it to their specific needs.
Distributed shared memory (DSM) systems built on networks of workstations are especially vulnerable to false sharing because of their higher memory transaction overheads and thus higher false sharing penalties. In this paper we develop a dynamic-granularity shared-memory management scheme that eliminates false sharing without sacrificing transparency to conventional shared-memory applications. Our approach uses a special threaded splay tree (TST) for shared-memory information management, and a dynamic token-based path-compression synchronization algorithm for data transfer. The combination of the TST and path compression is quite efficient; asymptotically, in an n-processor system with m shared-memory segments, synchronizing at most s segments takes O(s log m log n) amortized computation steps and generates O(s log n) communication messages. Based on the proposed scheme we constructed an experimental DSM prototype consisting of several Ethernet-connected Pentium-based computers running Linux. Preliminary benchmark results on our prototype indicate that our scheme is quite efficient, significantly outperforming traditional schemes and scaling well.
This paper investigates the problem of network partitioning in distributed shared memory (DSM) systems. We propose an optimistic partition-processing approach, which keeps shared pages available when network partitioning occurs. However, this approach does not guarantee that the same page in different partitions maintains a consistent value. To eliminate this problem, a memory-based coordinated checkpointing scheme is presented to save consistent states at low cost. If there are inconsistencies between two partitions, one saved consistent state is chosen to perform backward error recovery. Extensive trace-driven simulations have been performed to evaluate the effects of the proposed approach on system performance.
The efficiency of software distributed shared memory (DSM) is often limited by the excessive amount of network communication needed to maintain the memory consistency of the system. Two of the most popular software solutions for reducing redundant data traffic are relaxed memory consistency models and traffic-thrifty coherence protocols. In this paper, we propose the migrating-home protocol for a relaxed memory consistency model, the scope consistency model. The protocol allows the processor storing the most up-to-date copy of a page to change from one processor to another, so as to better adapt to the memory access patterns of DSM applications. The new protocol has been implemented in a DSM system running on a 16-node Pentium III 450 MHz PC cluster. We analyzed not only the execution time of the benchmark programs, but also the communication and page fault patterns via a new analysis approach. It is shown that our DSM system reduces the amount of network communication and handles page faults more efficiently. The benchmark results provide concrete evidence for the substantial performance improvement obtained by our system.
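Why letting a page's home move can save traffic is easiest to see in a toy message count. The sketch below is a heavy simplification and not the paper's protocol: the `Page` class, the access pattern, and the policy of migrating the home on the first remote write are all assumptions, and scope consistency itself is not modeled.

```python
class Page:
    """Toy page whose home node holds the most up-to-date copy."""

    def __init__(self, home):
        self.home = home
        self.messages = 0  # messages sent to propagate updates

    def write_fixed_home(self, writer):
        # Fixed home: every remote writer ships its update to the home.
        if writer != self.home:
            self.messages += 1

    def write_migrating_home(self, writer):
        # Assumed policy: the first remote write moves the home to the
        # writer (one message); later writes by that node stay local.
        if writer != self.home:
            self.messages += 1
            self.home = writer


# Hypothetical access pattern: ids of the nodes writing the page in turn.
writers = [1, 1, 1, 1, 2, 2]

fixed = Page(home=0)
migrating = Page(home=0)
for node in writers:
    fixed.write_fixed_home(node)
    migrating.write_migrating_home(node)
print(fixed.messages, migrating.messages, migrating.home)
```

In this pattern the fixed-home page pays for every write, while the migrating-home page pays only when the writing node changes, which is the adaptation to access patterns that the abstract describes.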
Software distributed shared memory (DSM) provides a convenient and effective solution for programming parallel applications on distributed systems. However, the performance of current implementations suffers from the large overhead of enforcing memory coherence, and coherence faults are a source of massive network traffic. Various memory consistency models have been proposed to reduce the effects of network traffic and memory latency. In this paper, we present a novel approach that combines relaxed memory consistency models with a compiler strategy to solve memory coherence problems for DSM. This approach produces fewer coherence faults. Experimental results also show that this hybrid approach is effective in reducing the memory coherence overhead of DSM.