ISBN (digital): 9781665415613
ISBN (print): 9781665415620
Achieving fault tolerance is one of the significant challenges of exascale computing due to projected increases in soft/transient failures. While past work on software-based resilience techniques typically focused on traditional bulk-synchronous parallel programming models, we believe that Asynchronous Many-Task (AMT) programming models are better suited to enabling resiliency, since they provide explicit abstractions of data and tasks that contribute to increased asynchrony and latency tolerance. In this paper, we extend our past work on enabling application-level resilience in single-node AMT programs by integrating the capability to perform asynchronous MPI communication, thereby enabling resiliency across multiple nodes. We also enable resilience against fail-stop errors, where our runtime manages all re-execution of tasks and communication without user intervention. Our results show that we can add communication operations to resilient programs with low overhead by offloading communication to dedicated communication workers, and that we can recover from fail-stop errors transparently, thereby enhancing productivity.
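The idea of a runtime re-executing failed tasks without user intervention can be illustrated with a minimal Python sketch. This is not the runtime described in the abstract: `TaskFailure` and `run_with_replay` are hypothetical names introduced here, and a failure is simulated by raising an exception rather than by losing a node.

```python
class TaskFailure(Exception):
    """Hypothetical signal that a task (or the worker running it) failed."""


def run_with_replay(task, *args, max_attempts=3):
    """Run task(*args), re-executing it when it raises TaskFailure.

    Assumption: the task is side-effect free, so replaying it is safe.
    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task(*args)
        except TaskFailure:
            if attempt == max_attempts:
                raise  # give up: the caller sees the final failure


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_square(x):
        # Simulated transient fault: fail on the first call only.
        attempts["count"] += 1
        if attempts["count"] == 1:
            raise TaskFailure("simulated transient failure")
        return x * x

    print(run_with_replay(flaky_square, 7))
```

The caller only sees the final result; the retry loop is hidden inside the wrapper, which is the sense in which recovery is transparent in this sketch.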
ISBN (digital): 9781728149882
ISBN (print): 9781728149899
Image processing underpins many of today's technological advancements. A central concern when performing image processing operations is the time taken to apply different routines to the images, so time is an important criterion for the efficiency of such systems. Given this, the idea is to hand the images to the processors so that, depending on the code, either all cores work on a single image together or the images are distributed among the cores, with each core performing the operations on its own images. This is parallel programming, i.e., the use of all available computing resources, which here are the cores. The paper focuses on implementing different image-enhancing techniques integrated into a system that executes them on a single core as well as on multiple cores. The image processing operations implemented both sequentially and in parallel in this paper are Image Blurring, Edge Detection, Contrast Stretching, and Image Negation; the average speeds obtained for these operations when executed on multiple cores are 9.94, 9.54, 11.12, and 11.21, respectively.
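The "one image per core" strategy can be sketched in Python using only the standard library. This is an illustrative sketch, not the paper's implementation: images are assumed to be 8-bit grayscale and are represented as nested lists, and the helper names (`negate`, `stretch_contrast`, `process_batch`) are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor


def negate(image):
    """Image negation: map each 8-bit pixel value p to 255 - p."""
    return [[255 - p for p in row] for row in image]


def stretch_contrast(image):
    """Linear contrast stretching of the pixel range [lo, hi] onto [0, 255]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A flat image has no range to stretch; return an unchanged copy.
        return [row[:] for row in image]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in image]


def process_batch(images, operation, executor):
    """Apply `operation` to every image, one image per worker task."""
    return list(executor.map(operation, images))


if __name__ == "__main__":
    # Eight small synthetic 4x4 images stand in for real input data.
    images = [[[10 * i + j for j in range(4)] for _ in range(4)]
              for i in range(8)]
    # By default the pool starts one worker process per available core.
    with ProcessPoolExecutor() as pool:
        negatives = process_batch(images, negate, pool)
    print(len(negatives), "images processed")
```

Passing the executor as a parameter keeps the per-image operations independent of how work is distributed, so the same functions can be run sequentially for a timing baseline.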
A new general OpenMP-based parallel programming method, MPMC (multi-node paralleling model based on multiprocessor devices), is proposed and implemented, based on data separation, to accelerate the Super-Resolution (SR) task. ...
Concurrency bugs are difficult to diagnose and fix, due to the nature of the bugs and how they manifest themselves during execution. Traditional approaches for diagnosing concurrency bugs attempt to reproduce the exac...
Explicit parallel programming for shared and distributed memory architectures is an efficient way to deal with data-intensive computations. However, approaches such as explicit threads or MPI remain difficult solutions...
Parallel computing is one of the top priorities in computer science. The main means of parallel information processing is a distributed computing system (CS), a composition of elementary machines that interact through ...
One of the barriers to the adoption of parallel computing is the inherent complexity of its programming. The Open Multi-Processing (OpenMP) Application Programming Interface (API) facilitates such implementations, pro...
A major driving force behind the increasing popularity of data science is the increasing need for data-driven analytics fuelled by massive amounts of complex data. Increasingly, parallel processing has become a cost-e...
ISBN (digital): 9781728165820
ISBN (print): 9781728165837
This work aims at distilling a systematic methodology to modernize existing sequential scientific codes with limited redesign effort, turning an old codebase into modern code, i.e., parallel and robust code. We propose an automatable methodology to parallelize scientific applications designed with a purely sequential programming mindset, thus possibly using global variables, aliasing, random number generators, and stateful functions. We demonstrate the methodology by way of an astrophysical application in which we simultaneously model the kinematic profiles of 30 disk galaxies with a Monte Carlo Markov Chain (MCMC), which is sequential by definition. The parallel code exhibits a 12-fold speedup on a 48-core platform.
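One common obstacle named above, a shared random number generator, can be illustrated with a small Python sketch. It is not the methodology proposed in the paper. It assumes that each chain is independent, so that each can own a private generator and run on a separate worker while every individual chain remains sequential. The target distribution (a standard normal) and the names `run_chain` and `run_all` are hypothetical choices made for the example.

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor


def run_chain(seed, steps=2000, step_size=1.0):
    """Metropolis sampler for a standard normal target; returns the sample mean.

    Each chain owns a private random.Random(seed) instead of using the
    module-level generator, so chains share no hidden global state.
    """
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for _ in range(steps):
        proposal = x + rng.uniform(-step_size, step_size)
        # log of p(proposal) / p(x) for p = N(0, 1); the proposal is symmetric.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        total += x
    return total / steps


def run_all(seeds, executor=None):
    """Run one chain per seed, sequentially or through the given executor."""
    if executor is None:
        return [run_chain(seed) for seed in seeds]
    return list(executor.map(run_chain, seeds))


if __name__ == "__main__":
    seeds = list(range(30))  # e.g. one chain per modeled object
    with ProcessPoolExecutor() as pool:
        parallel = run_all(seeds, pool)
    # Private, seeded generators make the parallel run reproduce the
    # sequential one exactly.
    print(parallel == run_all(seeds))
```

Because the stateful generator is now an explicit local object, the result of a chain depends only on its seed and not on the order in which workers are scheduled.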
Modern parallel platforms, such as clouds or servers, are often shared among many different jobs. However, existing parallel programming runtime systems are designed and optimized for running a single parallel job, so it is generally hard to use them directly to schedule multiple parallel jobs without incurring high overhead and inefficiency. In this work, we develop AMCilk (Adaptive Multiprogrammed Cilk), a novel runtime system framework designed to support multiprogrammed parallel workloads. AMCilk has a client-server architecture in which users can dynamically submit parallel jobs to the system. AMCilk has a single runtime system that runs these jobs while dynamically reallocating cores, last-level cache, and memory bandwidth among them according to the scheduling policy. AMCilk exposes an interface to the system designer that allows the designer to easily build different scheduling policies meeting the requirements of various application scenarios and performance metrics, while AMCilk transparently (to designers) enforces the scheduling policy. The primary feature of AMCilk is its low-overhead and responsive preemption mechanism, which allows fast reallocation of cores between jobs. Our empirical evaluation indicates that AMCilk incurs small overheads and provides significant benefits on application-specific criteria for a set of 4 practical applications, owing to its fast and low-overhead core reallocation mechanism.
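To make the notion of a pluggable scheduling policy concrete, the following Python sketch shows one possible policy: a weighted proportional split of cores that is recomputed whenever the set of jobs changes. It does not use or reproduce AMCilk's actual interface; the function `allocate_cores` and the weight-based policy are assumptions made for illustration only.

```python
def allocate_cores(total_cores, weights):
    """Split total_cores among jobs in proportion to their positive weights.

    Uses the largest-remainder method so the allocations are integers
    that sum exactly to total_cores.
    """
    if not weights:
        return {}
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    shares = {job: total_cores * w / total_weight for job, w in weights.items()}
    allocation = {job: int(share) for job, share in shares.items()}
    leftover = total_cores - sum(allocation.values())
    # Hand the remaining cores to the jobs with the largest fractional parts.
    by_remainder = sorted(weights, key=lambda job: shares[job] - allocation[job],
                          reverse=True)
    for job in by_remainder[:leftover]:
        allocation[job] += 1
    return allocation


if __name__ == "__main__":
    jobs = {"job_a": 1}
    print(allocate_cores(8, jobs))   # a single job receives every core
    jobs["job_b"] = 3                # a second, heavier job arrives
    print(allocate_cores(8, jobs))   # cores are reallocated between the two
```

In a real runtime the policy would only decide the target allocation; enforcing it requires a preemption mechanism that takes cores away from running jobs, which this sketch does not model.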