ISBN: 9780769552071 (print)
Software Transactional Memory (STM) systems are emerging as a promising alternative to traditional locking algorithms for implementing generic concurrent applications. To achieve generality, STM systems add overheads to the normal sequential execution path, including those due to spin locking, validation (or invalidation), and commit/abort routines. We propose a new STM algorithm called Remote Invalidation (or RInval) that reduces these overheads and improves STM performance. RInval's main idea is to execute commit and invalidation routines on remote server threads that run on dedicated cores, and to use cache-aligned communication between the application's transactional threads and the server routines. Through remote execution of commit and invalidation routines and cache-aligned communication, RInval reduces the overhead of spin locking and of cache misses on shared locks. Because commit and invalidation run on separate cores, they become independent of each other, which increases commit concurrency. We implemented RInval in the Rochester STM framework. Our experimental studies on micro-benchmarks and the STAMP benchmark reveal that RInval outperforms InvalSTM, the corresponding non-remote invalidation algorithm, by as much as an order of magnitude. Additionally, RInval performs competitively with validation-based STM algorithms such as NOrec, yielding up to a 2x performance improvement.
ISBN: 9781665497473 (print)
Reconfigurable hardware operating systems provide software-like abstractions for hardware accelerators. In particular, abstractions that view hardware accelerators as threads and integrate them into a multi-threaded environment have gained popularity. However, such abstractions are not yet available for the latest platform FPGAs. In this paper, we present ReconOS(64), a reconfigurable hardware operating system for modern 64-bit platform FPGAs. We discuss the architecture and the build flow and report on a number of experiments that evaluate the performance of the system. In particular, we compare its performance to that of a previous 32-bit ReconOS system. The evaluation shows that the step to 64-bit is not only necessary to make hardware operating system support available for modern platform FPGAs, but also improves the performance of operating system calls and memory accesses for hardware threads.
ISBN: 9780769546766 (print)
Persistent memory modules with performance similar to that of standard DRAM are becoming commercially available. In addition to the potential decrease in cost and/or capacity, the nonvolatility of data stored in them is opening new doors to improved performance and new capabilities in a wide range of applications. In this paper we address issues involved in providing software support for the effective use of such memory systems. In particular, we discuss the API to be used for accessing persistent memory and propose a novel scheme to provide atomic and in-order memory access. Ensuring that memory accesses are performed in order and, if need be, atomically has turned out to be a significant challenge in the presence of one or more levels of volatile caches. The solutions offered to date require either disabling caching, significant changes in the hardware, or frequent use of cache flushes and memory fences. All of these requirements have a significant impact on system performance. In contrast, Eucalyptus, presented in this paper, is a software-only solution which relaxes these requirements by separating consistency and persistency of data. Preliminary results obtained from our prototype are presented as well.
ISBN: 9781728112466 (print)
Energy efficiency and energy conservation are among the most crucial constraints for meeting the 20 MW power envelope desired for exascale systems. To this end, most research in this area has focused on the use of user-controllable hardware switches, such as per-core dynamic voltage and frequency scaling (DVFS) and software-controlled clock modulation, at the application level. In this paper, we present a tuning plugin for the Periscope Tuning Framework which integrates fine-grained autotuning at the region level with DVFS and uncore frequency scaling (UFS). The tuning is based on a feed-forward neural network which is formulated using Performance Monitoring Counters (PMC) supported by x86 systems and trained using standardized benchmarks. Experiments on five standardized hybrid benchmarks show an energy improvement of 16.1% on average when the applications are tuned according to our methodology, compared to 7.8% for static tuning.
ISBN: 9781479941162 (print)
Self-adaptive clouds extend regular cloud platforms with special autonomy features dedicated to handling increasing workload and service failures. Identifying such features is not necessarily an easy task. Sometimes they are stated explicitly in QoS requirements or in preliminary material available to requirements engineers. Often, though, they are implicit, so the autonomy features have to be captured deliberately. This paper elaborates on a methodology for capturing autonomy requirements for self-adaptive clouds with ARE, the Autonomy Requirements Engineering approach. In this approach, autonomy features are detected as special self-* objectives backed up by different capabilities and quality characteristics.
ISBN: 9781538643686 (print)
Data versioning and renaming is a technique to enforce true dependencies and eliminate false dependencies in concurrent out-of-order execution. By extending memory addressing to support both a location and a version number, the memory system can match loads with the appropriate stores. With multiple versions of data for a single memory location, Write-after-Read and Write-after-Write dependencies are avoided. In this paper, we present architectural support for O-structures, which provide memory versioning and renaming. We describe a microarchitectural implementation of an O-structure in the cache hierarchy of a multicore processor and demonstrate the need for each feature provided by O-structures. Our evaluation shows that O-structures can be effective in supporting a range of parallel workloads, including irregular, pointer-heavy code.
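The idea of addressing memory by both a location and a version number can be illustrated with a toy software model. This is only a sketch of the general concept, not the O-structure design from the paper: the class name `VersionedMemory`, its methods, and the choice to raise an error on a not-yet-written version (where hardware would presumably stall or defer the load) are all assumptions made for illustration.

```python
class VersionedMemory:
    """Toy memory addressed by (location, version). Hypothetical model."""

    def __init__(self):
        # location -> {version: value}
        self._cells = {}

    def store(self, location, version, value):
        # Each store creates a new version instead of overwriting the old
        # one, so a later write never clobbers data that a slower reader
        # still needs (no Write-after-Read or Write-after-Write hazard).
        self._cells.setdefault(location, {})[version] = value

    def load(self, location, version):
        # A load names the exact version it depends on, which matches it
        # with the store that produced that version (true dependency).
        versions = self._cells.get(location, {})
        if version not in versions:
            # Assumption: we simply signal that the producer has not run.
            raise LookupError(f"{location!r} version {version} not written yet")
        return versions[version]


mem = VersionedMemory()
mem.store("x", 1, 10)
mem.store("x", 2, 20)      # a later writer runs ahead of a slow reader
old = mem.load("x", 1)     # the slow reader still sees version 1
new = mem.load("x", 2)
```

Because the two stores target different versions of `x`, they can complete in either order without changing what each load observes.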
ISBN: 0780321754 (print)
Xgrid Technical Preview 2 is a distributed computing software technology from Apple Computer, Inc. Xgrid leverages the UNIX-based capabilities of Mac OS X to enable the rapid aggregation of Macintosh systems into a simple yet powerful computational grid which can run a wide range of standard and custom solutions with minimal code changes. To demonstrate the technology, the software is run on a networked rack of Xserve G5 servers and a PowerBook G4 laptop. The demonstration shows how Xgrid is utilized to distribute long-running batch and parallel jobs to a local grid of Mac OS X-based computers.
ISBN: 9781538694435 (print)
This paper presents the design, development and evaluation of a software tool to assist the localisation of root causes of test case failures in distributed embedded systems, specifically vehicle systems controlled by a network of electronic control units (ECUs). We use data visualisation to extract meaningful information from a large number of test execution logs produced by large-scale software integration testing under a continuous integration process. Our goal is to allow more efficient root-cause identification of failures and to foster a continuous feedback loop in the fault localisation process. We evaluate our solution in situ at the Research and Development division of Volvo Car Corporation (VCC). Our prototype supports failure debugging by presenting clear and concise data and by allowing stakeholders to filter and control which information is displayed. Moreover, it encourages a systematic and continuous analysis of the current state of testing by aggregating and categorising historical data from test harnesses to identify patterns and trends in test results.
Rejuvenation is a technique expected to mitigate failures in HPC systems by replacing, repairing, or resetting system components. Because of the small overhead required by software rejuvenation, we primarily focus on ...
To support multimedia applications in mobile environments, it will be necessary for applications to be aware of the underlying environmental conditions, and also to be able to adapt their behaviour and that of the underlying platform as such conditions change. Many existing distributed systems platforms support such adaptation only in a rather ad hoc manner. This paper presents a principled approach to supporting adaptation through the use of reflection. More specifically, the paper introduces a language-independent, component-based reflective architecture featuring a per-component meta-space, the use of meta-models to structure meta-space, and a consistent use of component graphs to represent composite components. The paper also reports on a quality of service management framework, providing sophisticated support for monitoring and adaptation functions. Finally, the paper describes a prototype implementation of this architecture using the object-oriented programming language Python.
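As a rough illustration of what a per-component meta-space holding a component-graph meta-model might look like in Python, consider the sketch below. It is not the prototype described in the abstract: the names `Component`, `CompositionMetaModel`, `bind`, and `replace`, and the codec-swapping scenario, are hypothetical and chosen only to make the idea concrete.

```python
class Component:
    """A component that carries its own (per-component) meta-space."""

    def __init__(self, name):
        self.name = name
        # Meta-space: meta-model name -> meta-model object (hypothetical layout).
        self.meta_space = {}


class CompositionMetaModel:
    """Hypothetical meta-model exposing a composite's internals as a graph."""

    def __init__(self):
        # Node name -> set of node names it is bound to.
        self.graph = {}

    def add(self, name):
        self.graph.setdefault(name, set())

    def bind(self, src, dst):
        self.add(src)
        self.add(dst)
        self.graph[src].add(dst)

    def replace(self, old, new):
        """Swap one node for another while preserving its bindings."""
        self.graph[new] = self.graph.pop(old)
        for targets in self.graph.values():
            if old in targets:
                targets.remove(old)
                targets.add(new)


# A composite "player" whose internal structure is visible via reflection.
player = Component("player")
composition = CompositionMetaModel()
player.meta_space["composition"] = composition
composition.bind("source", "hq_codec")
composition.bind("hq_codec", "sink")

# Adaptation: conditions degrade, so swap in a cheaper codec.
player.meta_space["composition"].replace("hq_codec", "lq_codec")
```

The adaptation step touches only the meta-model, which is the point of the reflective approach: the composite's structure is inspected and changed through its meta-space rather than through ad hoc hooks.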