ISBN: 9780769536422 (print)
The development of multi-core processor technology has made parallel programming increasingly popular. As with serial programs on single-core platforms, the locality optimization of parallel programs is and will remain a research hot spot owing to the memory wall problem. In this paper we extend the well-known data reuse theory to the parallel domain and propose a parallel data reuse theory for OpenMP applications. The parallel data reuse theory further classifies the reuse in parallel programs, expanding the four classes to eight. This paper systematically discusses intra-/inter-iteration reuse and intra-/inter-processor reuse in OpenMP programs, and gives a method for identifying and solving each reuse class. In addition, the paper presents a case study and analysis of the SPEComp2001 benchmarks using the parallel data reuse theory. We believe that the parallel data reuse theory will have a big impact on the locality optimization of parallel applications.
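The intra-/inter-processor distinction can be illustrated with a small sketch. It assumes a loop of the form `b[i] = a[i] + a[i-1]` whose iterations are split among threads in contiguous blocks, similar in spirit to a static OpenMP schedule. The functions `block_owner` and `classify_reuse` are hypothetical names used only for illustration; this is not the judging method from the paper.

```python
def block_owner(i, lo, hi, num_threads):
    """Thread that runs iteration i when [lo, hi) is split into contiguous blocks."""
    n = hi - lo
    chunk = -(-n // num_threads)  # ceiling division: block size per thread
    return (i - lo) // chunk


def classify_reuse(lo, hi, num_threads):
    """Classify the inter-iteration reuse in b[i] = a[i] + a[i-1].

    The element a[i-1] read in iteration i was already read as a[i] in
    iteration i-1. The reuse is intra-processor if both iterations run on
    the same thread, and inter-processor otherwise.
    """
    counts = {"intra-processor": 0, "inter-processor": 0}
    for i in range(lo + 1, hi):
        same = block_owner(i, lo, hi, num_threads) == block_owner(i - 1, lo, hi, num_threads)
        counts["intra-processor" if same else "inter-processor"] += 1
    return counts


# 16 iterations (i = 1..16) on 4 threads: reuse crosses a thread
# boundary only at the three block edges.
result = classify_reuse(1, 17, 4)
print(result)  # {'intra-processor': 12, 'inter-processor': 3}
```

Under this assumed block partitioning, only the reuses that straddle a block boundary are inter-processor, so larger blocks keep more of the reuse on one thread.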
ISBN: 9781424452910 (print)
This paper proposes a data race prevention scheme that prevents data races in the View-Oriented Parallel Programming (VOPP) model. VOPP is a novel shared-memory, data-centric parallel programming model that uses views to bundle mutual exclusion with data access. We have implemented the data race prevention scheme with a memory protection mechanism. Experimental results show that the extra overhead of memory protection is negligible in our applications. We also present a new VOPP implementation, Maotai 2.0, which has advanced features such as deadlock avoidance, producer/consumer views and system queues, in addition to the data race prevention scheme. The performance of Maotai 2.0 is evaluated and compared with modern programming models such as OpenMP and Cilk.
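The idea of bundling mutual exclusion with data access can be sketched as follows. The `View` class is a hypothetical illustration, not the Maotai API, and it substitutes a runtime ownership check for the memory protection mechanism described in the abstract: any access made without holding the view raises an error instead of racing silently.

```python
import threading


class View:
    """Hypothetical view: shared data reachable only while the view is held."""

    def __init__(self, data):
        self._data = data
        self._lock = threading.Lock()
        self._owner = None  # thread id of the current holder, if any

    def __enter__(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._owner = None
        self._lock.release()
        return False

    def _check(self):
        # Stand-in for memory protection: reject access outside acquisition.
        if self._owner != threading.get_ident():
            raise RuntimeError("view accessed without being acquired")

    def get(self, key):
        self._check()
        return self._data[key]

    def set(self, key, value):
        self._check()
        self._data[key] = value


def worker(view, n):
    for _ in range(n):
        with view:  # acquiring the view grants exclusive access to its data
            view.set("count", view.get("count") + 1)


counter = View({"count": 0})
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with counter:
    total = counter.get("count")
print(total)  # 4000
```

Because every read and write goes through the view, an unprotected access fails immediately, which is the property a data race prevention scheme aims for.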
ISBN: 9781424452910 (print)
Nowadays, more and more supercomputers are built on multi-core processors with shared caches. However, conflicting accesses to the shared cache from different threads or processes become a performance bottleneck for parallel applications. Cache partitioning can allocate cache resources exclusively to different processes according to their demands. Conflicting accesses are avoided by restricting each process's cache accesses to a distinct private part of the shared cache. This paper studies shared cache partitioning for balanced MPI parallel applications on CMP architectures, presenting a performance-oriented cache partitioning framework that includes Spatial-Level Cache Partitioning (SLCP) and Time-Level Cache Partitioning (TLCP), together with their evaluation. We evaluate SLCP and TLCP on a quad-core simulator. Experiments show that SLCP and TLCP outperform the traditional LRU cache replacement policy in IPC throughput and miss rate. Specifically, for large workloads, TLCP outperforms LRU by up to 20% and by 8.7% on average.
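A toy simulation can show why exclusive partitions avoid conflicting accesses. The sketch below assumes a fully associative LRU cache and a synthetic trace in which process 0 loops over four lines while process 1 streams through lines it never reuses. It illustrates plain space partitioning only; it is not the SLCP or TLCP algorithm, and `LRUCache` and `run` are hypothetical helpers.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache with least-recently-used replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        """Return True on a hit; on a miss, install the line and return False."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[addr] = None
        return False


def run(trace, caches):
    """Count hits per process. caches maps process id -> the cache it may use."""
    hits = {}
    for pid, addr in trace:
        hit = caches[pid].access((pid, addr))
        hits[pid] = hits.get(pid, 0) + int(hit)
    return hits


# Process 0 reuses 4 lines; process 1 touches 2 fresh lines per step.
trace = []
for step in range(400):
    trace.append((0, step % 4))
    trace.append((1, 2 * step))
    trace.append((1, 2 * step + 1))

shared = LRUCache(8)
shared_hits = run(trace, {0: shared, 1: shared})          # one 8-line cache for both
part_hits = run(trace, {0: LRUCache(4), 1: LRUCache(4)})  # 4 lines each

print(shared_hits[0], part_hits[0])  # 0 396
```

In the shared cache the streaming process evicts the looping process's lines before they are reused, so process 0 never hits; with a private four-line partition it misses only on the four cold accesses.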
ISBN: 9789897584244 (print)
In recent years, the cloud has become an attractive execution environment for parallel applications, which introduces novel opportunities for versatile optimizations. Particularly promising in this context is the elasticity characteristic of cloud environments. While elasticity is well established for client-server applications, it is a fundamentally new concept for parallel applications. However, existing elasticity mechanisms for client-server applications can be applied to parallel applications only to a limited extent. Efficient exploitation of elasticity for parallel applications requires novel mechanisms that take into account the particular runtime characteristics and resource requirements of this application type. To tackle this issue, we propose an elasticity description language. This language enables users to define elasticity policies, which specify the elasticity behavior at both the cloud infrastructure level and the application level. Elasticity at the application level is supported by an adequate programming and execution model, as well as abstractions that comply with the dynamic availability of resources. We present the underlying concepts and mechanisms, as well as the architecture and a prototypical implementation. Furthermore, we illustrate the capabilities of our approach through real-world scenarios.
ISBN: 9789897581397 (print)
We study the traffic characteristics of parallel and high performance computing applications in this paper. Applications that utilize multiple cores are more and more common nowadays due to the emergence of multicore processors. However, the design of single-threaded and multi-threaded applications can vary significantly. Furthermore, the on-chip communication profile of multicore systems should be analysed and modelled for characterization and simulation purposes. We investigate several applications running in a full-system simulation environment. The on-chip communication traces are gathered and analysed. We study the detailed low-level profiles of these applications. The applications are categorized into different groups according to various parallel programming paradigms. We discover that the trace data follow a power-law model with different parameters. The parameter estimation problem is solved by applying least-squares linear regression. We propose a generic synthetic traffic model based on the analysis results.
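One common way to estimate power-law parameters with least-squares linear regression is to fit a straight line in log-log space. The sketch below assumes a model of the form y = c · x^(−alpha) and uses synthetic data; the model form, the function name `fit_power_law`, and the data are assumptions for illustration, not details taken from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = c * x**(-alpha) by ordinary least squares on (log x, log y).

    Taking logs gives log y = log c - alpha * log x, a straight line whose
    slope and intercept have closed-form least-squares estimates.
    Returns (c, alpha).
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic, noise-free data generated with c = 100 and alpha = 1.5.
xs = [1, 2, 4, 8, 16, 32]
ys = [100.0 * x ** -1.5 for x in xs]
c, alpha = fit_power_law(xs, ys)
print(round(c, 3), round(alpha, 3))
```

On noise-free data the fit recovers the generating parameters up to floating-point error; on real traces the residuals in log space indicate how well a single power law describes the data.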
ISBN: 9781450351492 (print)
The setup of distributed computing clusters and the installation of data analysis frameworks can be cumbersome and require a great deal of knowledge in a plenitude of fields. We have developed DISCO, a service that relieves the data scientist of these hurdles. This paper demonstrates the competitiveness of DISCO with an existing solution.
ISBN: 9781424452910 (print)
Fault tolerance is an important requirement for long-running parallel programs. This paper presents a different approach to fault-tolerance support in message-passing parallel programs, based on their structural and behavioral characteristics, commonly known as patterns. A classification of these patterns and their applicable fault-tolerance strategies is intended to help an application developer incorporate appropriate fault-tolerance strategies into an application. Fault-tolerance strategies for two of the patterns are discussed, and one specific strategy is elaborated and analyzed. The presented strategies have been incorporated into a fault-tolerance support framework called FT-PAS. One objective of the framework is to separate the fault-tolerance-related details from an application developer's main objectives (separation of concerns). The paper presents the additional key features of the framework and concludes with a discussion of current and future research directions.
ISBN: 9781424452910 (print)
Topology embedding enables us to execute a protocol designed for a specific (virtual) topology on another (real) topology by embedding the virtual topology on the real topology. In this paper, we propose a self-stabilizing emulation technique that provides reliable communication on a virtual topology in the presence of transient faults. The proposed protocol improves the execution slowdown of previous protocols [7], [8] and provides adaptive message delivery delay on the emulated channels, which is a new type of adaptability against transient faults.
ISBN: 9781467394734 (print)
Large-scale localization has gained interest in the last few years together with wireless sensor networks. Applications such as a swarm of drones or a sensor network to control an environment already exist but do not cover scales up to thousands of nodes. In this paper, a method for large-scale distributed localization is developed based on existing technologies, including received signal strength and the mass-spring model (MSM). First, we created a theoretical model based on specific indoor and outdoor tests. Based on this model, a scaled-up simulation was done using an MSM-based algorithm. In this simulation, the average error was up to 7.66 m in a field of 100 m by 100 m.
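The mass-spring idea can be sketched in a few lines: every distance measurement acts as a spring whose rest length is the measured distance, and each non-anchor node moves along the net spring force until the layout relaxes. The example below assumes exact distance measurements to four anchors with known positions, whereas the paper derives distances from received signal strength. The function `mass_spring_step`, the step size, and the node layout are illustrative assumptions, not the algorithm evaluated in the paper.

```python
import math


def mass_spring_step(pos, anchors, dist, step=0.1):
    """One relaxation step of a mass-spring layout.

    pos:     node -> (x, y) current position estimate
    anchors: nodes with known, fixed positions
    dist:    (i, j) -> measured distance between nodes i and j
    """
    force = {n: [0.0, 0.0] for n in pos}
    for (i, j), d in dist.items():
        dx = pos[j][0] - pos[i][0]
        dy = pos[j][1] - pos[i][1]
        cur = math.hypot(dx, dy)
        if cur == 0.0:
            continue  # direction undefined; skip this spring for this step
        stretch = cur - d  # > 0: spring too long, so pull the two ends together
        fx, fy = stretch * dx / cur, stretch * dy / cur
        force[i][0] += fx
        force[i][1] += fy
        force[j][0] -= fx
        force[j][1] -= fy
    return {
        n: pos[n] if n in anchors
        else (pos[n][0] + step * force[n][0], pos[n][1] + step * force[n][1])
        for n in pos
    }


# Four anchors at the corners of a 100 m by 100 m field, one unknown node.
anchors = {"a": (0.0, 0.0), "b": (100.0, 0.0), "c": (0.0, 100.0), "d": (100.0, 100.0)}
true_x = (30.0, 70.0)
dist = {("x", k): math.dist(true_x, p) for k, p in anchors.items()}

pos = dict(anchors, x=(50.0, 50.0))  # start the unknown node at the field center
for _ in range(300):
    pos = mass_spring_step(pos, anchors, dist)

error = math.dist(pos["x"], true_x)
print(error)
```

With exact distances the estimate relaxes to the true position; with noisy RSS-derived distances the springs cannot all be satisfied at once, and the residual shows up as localization error.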
ISBN: 9780769536422 (print)
This paper analyzes what technology integration involves in a regional land resources security and control system and, on this basis, explains the key supporting technologies. A theoretical framework is constructed that takes GIS technology as the platform for technology integration and the organization and management of data and models as its center, in order to realize the specific functions of a regional land resources security and control system. Some key technical issues in the building process are also explored. Based on a "service model layer" and drawing on integration platform ideas, the overall integration framework of the regional land resources security and control system is established.