Transactional memory (TM) is a parallel programming concept. Existing consistency protocols in distributed transactional memory systems consume too much bandwidth and introduce high latency. In this paper, we propose our Transaction memory Consistency Protocol (TMCP) and point out its new features compared to current protocols. After formulating our model and analyzing its performance, we found that both excessively long and excessively short execution times cause more conflicts, given that the execution time of the transaction population follows a Gamma distribution. We show that it is important to adjust the execution time to a reasonable value to improve performance.
ISBN: 9781450300292 (print)
In this paper we present a case study for the design of a reliable body area network (BAN) for monitoring firefighter rescue teams according to the requirements defined by the Berlin fire brigades. This case study considers all layers of the system, starting from the hardware, going through the operating system and data handling middleware, and ending at the application layer. The main parts of the proposed solution are the tinyDSM middleware and a new prototyping hardware platform for wireless sensor nodes, IHPNode. The resulting BAN shall be part of a larger system, where the BANs are connected via an additional multi-hop network to the control centre. Even if this connection fails, the BAN is able to make autonomous decisions. This system is developed within the FeuerWhere project.
The Cray X1 supercomputer's distributed shared memory presents a 64-bit global address space that is directly addressable from every MSP with an interconnect bandwidth per computation rate of 1 byte/flop. Our results show that this high bandwidth and low latency for remote memory accesses translate into improved application performance on important applications.
We study the performance benefits of speculation in a release consistent software distributed shared memory system. We propose a new protocol, Speculative Home-based Release Consistency (SHRC), that speculatively updates data at remote nodes to reduce the latency of remote memory accesses. Our protocol employs a predictor that uses patterns in past accesses to shared memory to predict future accesses. We have implemented our protocol in a release consistent software distributed shared memory system that runs on commodity hardware. We evaluate our protocol implementation using eight software distributed shared memory benchmarks and show that it can result in significant performance improvements.
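The idea of predicting future accesses from patterns in past accesses can be illustrated with a minimal sketch. The abstract does not describe the SHRC predictor's design, so the first-order successor table below, along with the names `SuccessorPredictor`, `record`, and `predict`, is a hypothetical stand-in and not the authors' method.

```python
from collections import Counter, defaultdict


class SuccessorPredictor:
    """Predicts the next shared page from the page accessed last.

    Hypothetical illustration only: a first-order successor table,
    not the predictor used by SHRC.
    """

    def __init__(self):
        # For each page, count which pages were accessed right after it.
        self._successors = defaultdict(Counter)
        self._last_page = None

    def record(self, page):
        """Record an access and update the successor counts."""
        if self._last_page is not None:
            self._successors[self._last_page][page] += 1
        self._last_page = page

    def predict(self):
        """Return the most frequent successor of the last page, or None."""
        if self._last_page is None:
            return None
        counts = self._successors.get(self._last_page)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


predictor = SuccessorPredictor()
for page in [1, 2, 3, 1, 2, 3, 1]:  # made-up access trace
    predictor.record(page)
prediction = predictor.predict()  # page that most often followed page 1
```

In a speculative protocol, a prediction like this would decide which data to push to a remote node before that node asks for it; a wrong prediction costs bandwidth but not correctness.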
Dynamic optimizers modify the binary code of programs at runtime by profiling and optimizing certain aspects of the execution. We present a completely software-based framework that dynamically optimizes programs for object-based distributed shared memory (DSM) systems on clusters. In DSM systems, reducing the number of messages between cluster nodes is crucial. Prefetching transfers data in advance from the storage node to the local node so that communication is minimized. Our framework uses a profiler and a dynamic binary rewriter that monitor the access behavior of the application and place prefetches where they are beneficial to speed up the application. In addition, we use two distinct predictors to handle different types of access patterns. A meta-predictor analyzes the memory access behavior and dynamically enables one of the predictors. Our system also adapts the number of prefetches per request to best fit the application's behavior. The evaluation shows that our system performs better than manual prefetching. The number of messages sent decreases by up to 90%. Performance gains of up to 80% can be observed on benchmarks. Copyright (C) 2009 John Wiley & Sons, Ltd.
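The arrangement of two predictors with a meta-predictor that enables one of them might look roughly like the sketch below. The abstract does not name the two predictors or the selection rule, so the stride predictor, the repeat predictor, and the running hit count used to choose between them are all assumptions made for illustration.

```python
class StridePredictor:
    """Assumed predictor 1: extrapolates a constant stride between object IDs."""

    def __init__(self):
        self.prev = None
        self.last = None

    def predict(self):
        if self.prev is None or self.last is None:
            return None
        return self.last + (self.last - self.prev)

    def record(self, obj_id):
        self.prev, self.last = self.last, obj_id


class RepeatPredictor:
    """Assumed predictor 2: replays whatever followed this ID last time."""

    def __init__(self):
        self.next_after = {}
        self.last = None

    def predict(self):
        return self.next_after.get(self.last)

    def record(self, obj_id):
        if self.last is not None:
            self.next_after[self.last] = obj_id
        self.last = obj_id


class MetaPredictor:
    """Enables the predictor with the most correct predictions so far."""

    def __init__(self, predictors):
        self.predictors = predictors
        self.hits = [0] * len(predictors)

    def predict(self):
        best = max(range(len(self.predictors)), key=lambda i: self.hits[i])
        return self.predictors[best].predict()

    def record(self, obj_id):
        # Score each predictor against the actual access, then let it learn.
        for i, predictor in enumerate(self.predictors):
            if predictor.predict() == obj_id:
                self.hits[i] += 1
            predictor.record(obj_id)


meta = MetaPredictor([StridePredictor(), RepeatPredictor()])
for obj_id in [10, 20, 30, 40, 50]:  # made-up strided access trace
    meta.record(obj_id)
next_prefetch = meta.predict()  # stride predictor wins on this trace
```

On a strided trace the stride predictor accumulates hits and is selected; on a trace that revisits the same sequence of IDs, the repeat predictor would take over once its hit count is higher.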
This article describes a parallel implementation of an algorithm for simulating mixed convective flow over a three-dimensional backward-facing step. A FORTRAN90 code was developed and parallelized using OpenMP directives for distributed shared memory (DSM) multiprocessors. Numerical experiments conducted on an IBM p5-575 multiprocessor show that the code achieves significant speed-up on up to 16 processors. Superlinear speed-up was also observed in some cases as a result of efficient cache utilization on the multiprocessor.
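For readers unfamiliar with the term, speed-up on p processors is the serial run time divided by the parallel run time, and it is called superlinear when it exceeds p. The short sketch below makes that definition concrete; the timings are invented for illustration and are not results from the article.

```python
def speedup(t_serial, t_parallel):
    """Speed-up S = T_1 / T_p."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Parallel efficiency E = S / p; E > 1 indicates superlinear speed-up."""
    return speedup(t_serial, t_parallel) / processors


def is_superlinear(t_serial, t_parallel, processors):
    """True when the speed-up exceeds the processor count."""
    return speedup(t_serial, t_parallel) > processors


# Hypothetical timings in seconds, not measurements from the article.
t1, t16 = 160.0, 8.0
s = speedup(t1, t16)         # 20.0
e = efficiency(t1, t16, 16)  # 1.25
```

Superlinear speed-up of this kind is usually attributed to the per-processor working set shrinking until it fits in cache, which matches the explanation given in the abstract.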
In this paper, we present a framework to formally describe and study the interconnection of distributed shared memory systems. Using it allows us to classify the consistency models in two groups, depending on whether they are fast or not. In the case of non-fast consistency models, we show that they cannot be interconnected in any way. In contrast, in the case of fast consistency models we provide protocols to interconnect some of them. (C) 2008 Elsevier Inc. All rights reserved.
Despite the continuous advances of recent years in grid computing, programming paradigms are dominated by the message passing concept. There is little support for other paradigms such as shared data or associative programming. In this paper, we analyse why previous attempts did not have a significant impact in the grid computing community. We start by assessing the landscape of grid programming solutions with a focus on shared data concepts. Next, we introduce an original idea to attack shared data programming on the grid by making use of both relaxed consistency models and user-specified type consistency in an object-oriented model. Last but not least, we present a prototype architecture together with experimental results.
ISBN: 1932415610 (print)
A large number of tasks in distributed systems can be traced down to the fundamental problem of attaining a consistent global view of a distributed computation. This problem has been addressed by a number of studies which focus on systems with message passing as their only means of interprocess communication. In the paper at hand we extend this restricted system model by additionally accounting for an abstract memory to be shared by the processes. We specify necessary and sufficient conditions for constructing a consistent global view of such systems and present helpful definitions, which are meant to be a solid formal base for further studies.
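As background for the message-passing model that the abstract takes as its starting point, the classic consistency condition for a global view (a cut) is that every message received inside the cut must also have been sent inside it. The sketch below checks only that baseline condition; it does not attempt the shared-memory extension, whose conditions the abstract does not state. The event-index representation and the function name `is_consistent_cut` are assumptions.

```python
def is_consistent_cut(cut, messages):
    """Check the message-passing consistency condition for a cut.

    cut:      dict mapping process name -> number of its events in the cut
    messages: iterable of (sender, send_index, receiver, recv_index),
              with 1-based event indices on each process
    """
    for sender, send_index, receiver, recv_index in messages:
        received = recv_index <= cut[receiver]
        sent = send_index <= cut[sender]
        if received and not sent:
            # The cut records a receive whose send it has not seen.
            return False
    return True


# Hypothetical run: p's 2nd event sends a message that is q's 1st event.
messages = [("p", 2, "q", 1)]
ok = is_consistent_cut({"p": 2, "q": 1}, messages)
bad = is_consistent_cut({"p": 1, "q": 1}, messages)
```

The second cut is rejected because it includes the receive on `q` but stops `p` before the matching send.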
The DSM-IV implicitly assumes that development is uniform across ability domains, which implies that relationships between ability measures do not differ across development. We assessed whether correlations between measures of nine ability constructs differed across samples of children aged 3-5 (n = 117), 6-8 (n = 116), 9-11 (n = 124) and 12-14 years (n = 92). LISREL analyses show that correlations in each age group differ from those of each other age group. Parallel analyses indicate that the latent structure of ability differs across age groups. We conclude that shared maturational processes, including changes in the connectivity of neural systems, are responsible for decreasingly and increasingly strong relationships between some ability measures.