ISBN: 9781424452910 (print)
With the advent of multi-core processors, desktop application developers must finally face parallel computing and its challenges. A large portion of the computational load in a program rests within iterative computations. In object-oriented languages these are commonly handled using iterators, which are inadequate for parallel programming. Consequently, the powerful Parallel Iterator concept was developed. This paper presents various developments of the Parallel Iterator, such as parallel traversal of complex collections with partial ordering (such as a tree). Other features include reductions, parallel remove semantics and exception handling. Along with the ease of use, the results reveal great speedup in comparison to traditional Java parallelism approaches.
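The idea of several threads drawing elements from one shared iterator, followed by a reduction, can be illustrated with a minimal Python sketch. It is not the paper's Java Parallel Iterator API: the names `SharedIterator` and `parallel_sum_of_squares` are hypothetical, and the lock-based design is only one assumed way to make a sequential iterator safe to share.

```python
import threading


class SharedIterator:
    """Hypothetical wrapper: lets several threads pull from one iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator.
        # StopIteration propagates to every thread once the data runs out.
        with self._lock:
            return next(self._it)


def parallel_sum_of_squares(items, num_threads=4):
    """Each thread accumulates a private partial result (a reduction)."""
    shared = SharedIterator(items)
    partials = [0] * num_threads

    def worker(tid):
        for x in shared:
            partials[tid] += x * x  # thread-private slot, no lock needed

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)  # combine the per-thread partial results
```

Each element is handed to exactly one thread, so the combined result matches the sequential sum regardless of how the elements were distributed.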
ISBN: 9781424452910 (print)
With parallel computing systems scaling up, system reliability drastically decreases, so parallel applications running on such systems must tolerate hardware failures. Checkpointing, which periodically saves the state of computation to stable storage, is widely used in the domain of large-scale parallel computing. This produces non-negligible fault tolerance overhead. The traditional speedup only measures the performance of a failure-free system. In this paper, we first propose a speedup metric that takes checkpointing overhead into account. The new metric unifies the performance and reliability measures, and evaluates the practical speedup of a parallel application with checkpointing. Furthermore, this paper classifies and analyzes existing parallel systems according to the proposed speedup metric, and makes suggestions on improving system design and fault tolerance techniques. Finally, we validate the analysis of this new speedup metric by experiment. The experimental results indicate that the proposed speedup for parallel applications with checkpointing is an effective metric.
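To make the notion of a speedup that accounts for checkpointing concrete, the sketch below uses a simple first-order model. It is not the metric proposed in the paper, whose definition is not given in the abstract. The assumptions are: the wasted fraction of time is approximated as `C / tau + tau / (2 * M)` (checkpoint cost `C`, checkpoint interval `tau`, mean time between failures `M`), and the interval defaults to Young's approximation `sqrt(2 * C * M)`. The function names are hypothetical.

```python
from math import sqrt


def young_interval(checkpoint_cost, mtbf):
    """Young's first-order approximation of the optimal checkpoint interval."""
    return sqrt(2.0 * checkpoint_cost * mtbf)


def effective_speedup(t_serial, t_parallel, checkpoint_cost, mtbf,
                      interval=None):
    """Speedup after discounting checkpoint and rework overhead.

    Assumed model (illustrative only): a fraction
    waste = C/tau + tau/(2*M) of the run does no useful work.
    """
    tau = interval if interval is not None else young_interval(
        checkpoint_cost, mtbf)
    waste = checkpoint_cost / tau + tau / (2.0 * mtbf)
    if waste >= 1.0:
        return 0.0  # the model predicts no forward progress
    t_effective = t_parallel / (1.0 - waste)
    return t_serial / t_effective
```

For example, with a failure-free speedup of 10 (`t_serial=1000`, `t_parallel=100`), a checkpoint cost of 2 and an MTBF of 100 in the same time unit, this model wastes 20% of the run and the effective speedup drops to 8.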
ISBN: 9781424452910 (print)
Many data mining techniques have been proposed for parallel application performance analysis, the most interesting being clustering analysis. In most cases, clustering has been used to detect processors with similar behavior. In previous work, we presented a different approach: clustering was used to detect the computation structure of the applications and how these different computation phases behave. In this paper, we present a method to evaluate the accuracy of this structure detection. This new method is based on the Single Program Multiple Data (SPMD) paradigm exhibited by real parallel programs. Assuming an SPMD structure, we expect that all tasks of a parallel application execute the same operation sequence. Using a Multiple Sequence Alignment (MSA) algorithm, we check the sequence ordering of the detected clusters to evaluate the quality of the clustering results.
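The SPMD check described above can be approximated with a small sketch. It does not implement a Multiple Sequence Alignment algorithm. As a simpler stand-in, it compares the per-task sequences of cluster IDs pairwise with Python's `difflib.SequenceMatcher` and averages the similarity. The function name `spmd_agreement` and the input layout (one list of cluster IDs per task) are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def spmd_agreement(task_sequences):
    """Mean pairwise similarity (0.0 to 1.0) of per-task cluster sequences.

    A score of 1.0 means every task executed the detected clusters in
    exactly the same order, as an SPMD program is expected to do.
    """
    pairs = list(combinations(task_sequences, 2))
    if not pairs:
        return 1.0  # zero or one task: nothing to disagree with
    total = sum(
        SequenceMatcher(None, a, b, autojunk=False).ratio()
        for a, b in pairs
    )
    return total / len(pairs)
```

A score noticeably below 1.0 suggests that the clustering split or merged phases inconsistently across tasks, which is the kind of quality signal the abstract describes.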
ISBN: 9780769536422 (print)
The development of multi-core processor technology makes parallel programming more and more popular. Similar to serial programs on single-core platforms, the locality optimization of parallel programs is and will be a hot spot of research owing to the memory wall problem. In this paper we extend the well-known data reuse theory to the parallel domain and propose a parallel data reuse theory for OpenMP applications. The parallel data reuse theory further classifies the reuse in parallel programs, from four classes to eight. This paper systematically discusses the intra-/inter-iteration reuse and intra-/inter-processor reuse in OpenMP programs, and gives a method for identifying and solving each reuse class. In addition, this paper presents a case study and analysis of the SPEComp2001 benchmarks using our parallel data reuse theory. We believe that parallel data reuse theory will have a big impact on the locality optimization of parallel applications.
ISBN: 9781424452910 (print)
This paper proposes a data race prevention scheme, which can prevent data races in the View-Oriented Parallel Programming (VOPP) model. VOPP is a novel shared-memory data-centric parallel programming model, which uses views to bundle mutual exclusion with data access. We have implemented the data race prevention scheme with a memory protection mechanism. Experimental results show that the extra overhead of memory protection is trivial in our applications. We also present a new VOPP implementation, Maotai 2.0, which has advanced features such as deadlock avoidance, producer/consumer view and system queues, in addition to the data race prevention scheme. The performance of Maotai 2.0 is evaluated and compared with modern programming models such as OpenMP and Cilk.
ISBN: 9781424452910 (print)
Nowadays, more and more supercomputers are built on multi-core processors with shared caches. However, conflicting accesses to the shared cache from different threads or processes become a performance bottleneck for parallel applications. Cache partitioning can be used to allocate cache resources to different processes exclusively according to the demands of the processes. Conflicting accesses are avoided by restricting cache accesses to distinct private parts of the shared cache. This paper studies the problem of shared cache partitioning for balanced MPI parallel applications on CMP architectures, presenting a performance-oriented cache partitioning framework, including Spatial-Level Cache Partitioning (SLCP) and Time-Level Cache Partitioning (TLCP), and their evaluation. We evaluate SLCP and TLCP on a quad-core simulator. Experiments show that SLCP and TLCP outperform the traditional LRU cache replacement policy in IPC throughput and miss rate. Specifically, for large workloads, TLCP outperforms LRU by up to 20% and by 8.7% on average.
ISBN: 9780769536422 (print)
The content of technology integration in a regional land resources security and control system is analyzed in this paper, and on this basis, the key supporting technologies are explained. A theoretical framework is constructed that uses GIS technology as the platform for technology integration and takes the organization and management of data and models as its center, in order to realize specific functions for the regional land resources security and control system. Some key technical issues in the building process are also explored. Based on the "service model layer" and drawing on integration platform ideas, the overall integration framework of the regional land resources security and control system is established.
ISBN: 9781424452910 (print)
Fault tolerance is an important requirement for long-running parallel programs. This paper presents a different approach to fault-tolerance support in message-passing parallel programs based on their structural and behavioral characteristics, commonly known as patterns. A classification of these patterns and their applicable fault-tolerance strategies is intended to help an application developer incorporate appropriate fault-tolerance strategies into an application. Fault-tolerance strategies for two of the patterns are discussed, and one specific strategy is elaborated and analyzed. The presented strategies have been incorporated into a fault-tolerance support framework called FT-PAS. One objective of the framework is to separate the fault-tolerance-related details from an application developer's main objectives (separation of concerns). The paper presents the additional key features of the framework, and concludes with a discussion of current and future research directions.
ISBN: 9781424452910 (print)
Topology embedding enables us to execute a protocol designed for a specific (virtual) topology on another (real) topology by embedding the virtual topology on the real topology. In this paper, we propose a self-stabilizing emulation technique that provides reliable communication on a virtual topology in the presence of transient faults. The proposed protocol improves the execution slowdown of previous protocols [7], [8] and provides adaptive message delivery delay on the emulated channels, which is a new type of adaptability against transient faults.
ISBN: 9781424452910 (print)
Parallel programming is notoriously difficult. This becomes even more critical as multicore processors bring parallel computing into the mainstream. In order to ease the difficulty, tools have been designed that help the programmer with some aspects of parallelisation. Unfortunately, the programmer is mostly left alone when it comes to the difficult task of dependence analysis among the subtasks to be executed concurrently. This paper presents a new visual tool that supports the programmer with the dependence analysis in loops. This is very useful in combination with an automatically parallelising compiler or when loops are parallelised with OpenMP. The tool displays on-the-fly the dependences between the statements of the loop nest on which the developer is currently working. To maximise the usefulness of the tool, it is unobtrusive, customisable and flexible, and based on dependence analysis theory. A prototype was implemented for the Eclipse IDE as a plug-in that seamlessly integrates into the normal development process. The evaluation of the tool, including an evaluation against cognitive dimensions, demonstrates the usability and usefulness of the tool.
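As an illustration of the kind of loop dependence analysis such a tool builds on, the sketch below implements the classic GCD test for two affine array subscripts. The abstract does not say which tests the tool uses, so this is only an assumed example from dependence analysis theory, and the function name `gcd_test` is hypothetical. The test ignores loop bounds, so a `True` result means a dependence cannot be ruled out, not that one exists.

```python
from math import gcd


def gcd_test(a, b, c, d):
    """GCD test for accesses A[a*i + b] and A[c*j + d] in a loop.

    The equation a*i + b == c*j + d has integer solutions only if
    gcd(a, c) divides (d - b). Returns False when a dependence is
    impossible and True when it may exist.
    """
    g = gcd(a, c)
    if g == 0:
        # Both subscripts are constants: they clash only if equal.
        return b == d
    return (d - b) % g == 0
```

For instance, a loop that writes `A[2*i]` and reads `A[2*i + 1]` touches only even and only odd elements respectively, so `gcd_test(2, 0, 2, 1)` reports no dependence and the iterations could run in parallel. A write to `A[i]` and a read of `A[i - 1]` gives `gcd_test(1, 0, 1, -1)`, which cannot rule out a dependence.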