Physical synthesis engines need to embrace all available parallelism to cope with the increasing complexity of modern designs while still offering high quality of results. To achieve this goal, the involved algorithms need to be expressed in a way that facilitates fast execution across a range of computing platforms. In this work, we introduce a task-based parallel programming template that can be used to speed up timing and power optimization. This approach utilizes all available parallelism and enables significant speedup relative to custom multithreaded approaches. Task-based parallelism is applied to all parts of the optimization engine, including parts that are traditionally executed serially to preserve maximum timing accuracy. Using Taskflow as the parallel programming and execution engine, we achieved a speedup of 1.7x to 2.8x for gate sizing optimizations on the ISPD13 benchmarks with marginal extra leakage power relative to state-of-the-art multithreaded gate sizers. This result was supported by two dynamic heuristics that restrict the number of examined gate sizes and simplify local timing updates. Both heuristics trade additional runtime reduction for marginal leakage power increases.
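For readers unfamiliar with Taskflow, the following minimal C++ sketch shows the kind of task-graph expression the abstract refers to; the pass names and dependency structure are illustrative placeholders, not the authors' actual optimization engine.

```cpp
#include <taskflow/taskflow.hpp>  // Taskflow: header-only C++ task-parallel library

int main() {
  tf::Executor executor;            // pool of worker threads with work stealing
  tf::Taskflow taskflow("resize");

  // Illustrative pipeline: candidate-size selection must finish before the
  // local timing update, which must finish before committing the new sizes.
  auto pick   = taskflow.emplace([] { /* choose candidate gate sizes */ });
  auto timing = taskflow.emplace([] { /* incremental local timing update */ });
  auto commit = taskflow.emplace([] { /* accept or reject the resize */ });

  pick.precede(timing);    // pick -> timing
  timing.precede(commit);  // timing -> commit

  executor.run(taskflow).wait();  // schedule the graph and block until done
}
```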
A popular approach to programming scalable irregular applications is Asynchronous Many-Task (AMT) programming. Here, programs define tasks according to task models such as dynamic independent tasks (DIT) or nested fork-join (NFJ). We consider cluster AMTs, in which a runtime system maps the tasks to worker threads in multiple processes. In this setting, dynamic load balancing can be achieved via cooperative work stealing, coordinated work stealing, or work sharing. A well-performing cooperative work stealing variant is the lifeline scheme. While previous implementations of this scheme are restricted to single-worker processes, a recent hybrid extension combines it with intra-process work sharing between multiple workers. The hybrid scheme, which was proposed for both DIT and NFJ, comes at the price of higher complexity. This paper investigates whether this complexity is indispensable for multi-worker processes by contrasting the hybrid scheme with a novel pure work stealing extension of the lifeline scheme to multiple workers. We independently implemented the extension for DIT and NFJ. In experiments based on four benchmarks, we observed the pure scheme to be on a par with or even outperform the hybrid one by up to 18% for DIT and up to 5% for NFJ. Building on this main result, we studied a modification of the pure scheme, which prefers local over global victims, and more heavily loaded over less loaded ones. The modification improves the performance of the pure scheme by up to 15%. Finally, we explored whether the lifeline scheme can profit from a change to coordinated work stealing. We developed a coordinated multi-worker implementation for DIT and observed a performance improvement over the cooperative scheme by up to 17%.
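The lifeline scheme mentioned above can be summarized by its victim-selection order: a few random steal attempts, then the preconfigured lifeline buddies, then idle waiting until a lifeline delivers work. The C++ sketch below illustrates only this control flow; try_steal, the attempt count, and the lifeline topology are hypothetical stand-ins.

```cpp
#include <cstdio>
#include <random>
#include <vector>

constexpr int kRandomAttempts = 2;  // lifeline schemes typically try a few random victims first

// Stub standing in for an actual (possibly remote) steal attempt.
bool try_steal(int victim) {
  std::printf("steal attempt at victim %d\n", victim);
  return false;  // stub: no work found
}

bool acquire_work(int self, int num_workers, const std::vector<int>& lifelines) {
  std::mt19937 rng(self);
  std::uniform_int_distribution<int> pick(0, num_workers - 1);

  for (int i = 0; i < kRandomAttempts; ++i) {  // 1) random victims
    int victim = pick(rng);
    if (victim != self && try_steal(victim)) return true;
  }
  for (int buddy : lifelines) {                // 2) lifeline buddies; a failed
    if (try_steal(buddy)) return true;         //    request is recorded, and the
  }                                            //    buddy later pushes work to us
  return false;                                // 3) caller goes idle
}

int main() {
  // Worker 0 of 8, with lifelines derived from a hypercube-like topology.
  std::vector<int> lifelines = {1, 2, 4};
  if (!acquire_work(0, 8, lifelines)) std::puts("going idle, waiting on lifelines");
}
```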
Current high-performance computer systems used for scientific computing typically combine shared memory computational nodes in a distributed memory environment. Extracting high performance from these complex systems requires tailored approaches. Task-based parallel programming has been successful both in simplifying programming and in exploiting the available hardware parallelism on shared memory systems. In this paper we focus on how to extend task-parallel programming to distributed memory systems. We use a hierarchical decomposition of tasks and data in order to accommodate the different levels of hardware. We test the proposed programming model on two different applications, a Cholesky factorization and a solver for the shallow water equations. We also compare the performance of our implementation with that of other frameworks for distributed task-parallel programming, and show that it is competitive.
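As a rough illustration of hierarchical decomposition (using Taskflow subflows, not the framework from the paper): a coarse task per distributed-level block dynamically spawns finer child tasks for the shared-memory level.

```cpp
#include <taskflow/taskflow.hpp>

int main() {
  tf::Executor executor;
  tf::Taskflow taskflow;

  for (int block = 0; block < 4; ++block) {    // coarse, node-level decomposition
    taskflow.emplace([block](tf::Subflow& sf) {
      for (int tile = 0; tile < 8; ++tile) {   // fine, core-level decomposition
        sf.emplace([block, tile] { /* factorize/update one tile of this block */ });
      }
      sf.join();  // child tasks must complete before the parent task finishes
    });
  }
  executor.run(taskflow).wait();
}
```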
ISBN: 9781450384414 (Print)
Resource elasticity allows the resources of running jobs to be changed dynamically, which may significantly improve the throughput on supercomputers. Elasticity requires support from both job schedulers and user applications. Whereas the adaptation of traditional programs requires additional programmer effort, task-based programs can be made elastic in a transparent way. In this paper, we propose a corresponding technique for implementation in a runtime system. We refer to a work stealing-based runtime for clusters, which uses the lifeline scheme for victim selection, combines inter-node work stealing with intra-node work sharing, and handles dynamic independent tasks, i.e., tasks that may spawn child tasks but do not otherwise cooperate. We experimentally assess the elasticity overhead of our scheme and find that adding/releasing up to 64 nodes takes less than 0.5 seconds. This value is determined with the help of a new formula that estimates the overhead-free running time of work-stealing programs with a changing number of workers. Using this result, we then quantify the gain of deploying elastic jobs. For that, we simulate the execution of job sets that contain some percentage of elastic jobs on two hypothetical supercomputers. We use an existing elastic job scheduler, which we concretize, e.g., with a new heuristic to determine the minimum, maximum, and preferred number of nodes for a job. Results show that the makespan can be reduced by up to 20% if most jobs are elastic.
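The abstract does not give the formula, but one plausible way to estimate the overhead-free running time of a run whose worker count changes is to split the run into phases with a constant worker count and charge each phase its work divided by its workers. The sketch below implements exactly that assumption; it is an illustration, not necessarily the paper's formula.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical estimator: w[i] is the aggregate work (core-seconds) performed
// during phase i on p[i] workers; an overhead-free execution would have
// needed w[i] / p[i] wall-clock seconds for that phase.
struct Phase { double work_core_seconds; int workers; };

double overhead_free_time(const std::vector<Phase>& phases) {
  double t = 0.0;
  for (const auto& ph : phases) t += ph.work_core_seconds / ph.workers;
  return t;
}

int main() {
  // 64 workers, then the job is grown to 128 workers for the second phase.
  std::vector<Phase> phases = {{6400.0, 64}, {12800.0, 128}};
  std::printf("ideal time: %.1f s\n", overhead_free_time(phases));  // 100 + 100
}
```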
ISBN: 9781728196664 (Print)
Irregularity in parallel programs is often tackled by dynamic load balancing. Two well-known techniques, namely work sharing and work stealing, were recently combined into a hybrid scheme, which uses the former for shared-memory and the latter for distributed-memory load balancing. Nested fork-join programs are an important type of task-based parallel programs, in which tasks generate child tasks, giving rise to an execution tree. The hybrid scheme was originally designed for a different task model. We transfer the hybrid scheme to nested fork-join programs and compare different variants of the scheme, called load balancing policies. Experimental results show that the considered policies perform well on the benchmarks used, with no significant difference between them.
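A nested fork-join program in miniature, for readers unfamiliar with the task model: every task may fork children and joins their results, so the execution forms a tree. The sketch uses std::async as a stand-in for an AMT runtime's task spawn; the serial cutoff is a common idiom, not part of the paper.

```cpp
#include <cstdio>
#include <future>

long fib(int n) {
  if (n < 2) return n;
  if (n < 16) return fib(n - 1) + fib(n - 2);  // serial cutoff bounds thread count
  auto child = std::async(std::launch::async, fib, n - 1);  // fork a child task
  long right = fib(n - 2);                                  // compute locally
  return child.get() + right;                               // join
}

int main() { std::printf("fib(24) = %ld\n", fib(24)); }
```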
ISBN: 9781665435772 (Print)
With the advent of exascale computing, issues such as application irregularity and permanent hardware failure are growing in importance. Irregularity is often addressed by task-based parallel programming coupled with work stealing. At the task level, resilience can be provided by two principal approaches, namely checkpointing and supervision. For both, particular algorithms have been worked out recently. They perform local recovery and continue the program execution on a reduced set of resources. The checkpointing algorithms regularly save task descriptors explicitly, while the supervision algorithms exploit their natural duplication during work stealing and may be coupled with steal tracking to minimize the number of task re-executions. Thus far, the two groups of algorithms have been targeted at different task models: checkpointing algorithms at dynamic independent tasks, and supervision algorithms at nested fork-join programs. This paper transfers the most advanced supervision algorithm to the dynamic independent tasks model, thus enabling a comparison between checkpointing and supervision. Our comparison includes experiments and running time predictions. Results consistently show typical resilience overheads below 1% for both approaches. The overheads are lower for supervision in practically relevant cases, but checkpointing takes over at the scale of millions of processes.
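The following sketch illustrates the checkpointing idea only in outline: a worker periodically saves the descriptors of its still-open tasks so that another process can re-adopt them after a failure. The interval, data structures, and in-memory "store" are assumptions, not the published algorithm.

```cpp
#include <cstdio>
#include <deque>
#include <string>

struct TaskDescriptor { long id; std::string payload; };

class Worker {
 public:
  void push(TaskDescriptor t) { deque_.push_back(t); }

  void run() {
    while (!deque_.empty()) {
      TaskDescriptor t = deque_.front(); deque_.pop_front();
      process(t);
      if (++steps_ % kInterval == 0) checkpoint();  // periodic, local checkpoint
    }
  }

 private:
  static constexpr int kInterval = 100;

  void checkpoint() {
    // A real system would write to a resilient store (e.g., a replicated
    // in-memory key-value map), not stdout.
    std::printf("checkpoint: %zu open tasks\n", deque_.size());
  }
  void process(const TaskDescriptor&) { /* execute task; may push children */ }

  std::deque<TaskDescriptor> deque_;
  long steps_ = 0;
};

int main() {
  Worker w;
  for (long i = 0; i < 250; ++i) w.push({i, "work"});
  w.run();  // checkpoints after the 100th and 200th task
}
```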
ISBN: 9781538649756 (Print)
Today, large-scale simulation applications are becoming common in research and industry. A significant fraction of them run on multi-core clusters. Current parallel simulation kernels use multiple processes and threads to exploit inter-node and intra-node parallelism on multi-core clusters. We exploit task-based parallelism in parallel discrete event simulation (PDES) kernels, which is more fine-grained than thread-level and process-level parallelism. In our system, every simulation event is wrapped in a task. A work-stealing task scheduling scheme is applied to achieve dynamic load balancing among the cores, and a graph partitioning approach is applied to partition simulation entities among the cluster nodes. Experimental results show that our PDES kernel outperforms existing PDES kernels by fully exploiting task parallelism.
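To make the "event wrapped in a task" idea concrete, here is a minimal C++ sketch: events carry a timestamp and an action, and the kernel pops them in timestamp order. In the described system the popped event would be submitted to a work-stealing task pool rather than executed inline; all names here are illustrative.

```cpp
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

struct Event {
  double time;
  std::function<void(double)> action;  // the task body wrapped around the event
  bool operator>(const Event& o) const { return time > o.time; }
};

int main() {
  // Future event list, ordered by smallest timestamp first.
  std::priority_queue<Event, std::vector<Event>, std::greater<Event>> fel;
  fel.push({2.0, [](double t) { std::printf("event at %.1f\n", t); }});
  fel.push({1.0, [](double t) { std::printf("event at %.1f\n", t); }});

  while (!fel.empty()) {
    Event e = fel.top(); fel.pop();
    e.action(e.time);  // in the PDES kernel: enqueue as a task for work stealing
  }
}
```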
ISBN: 9781118028124 (Digital), 9781118546031 (Print)
Expert guidance for those programming today's dual-core processor PCs. As PC processors explode from one or two to now eight cores, there is an urgent need for programmers to master concurrent programming. This book dives deep into the latest technologies available to programmers for creating professional parallel applications using C#, .NET 4, and Visual Studio 2010. The book covers task-based programming, coordination data structures, PLINQ, thread pools, the asynchronous programming model, and more. It also teaches other parallel programming techniques, such as SIMD and vectorization. Teaches programmers professional-level, task-based parallel programming with C#, .NET 4, and Visual Studio 2010. Covers concurrent collections, coordination data structures, PLINQ, thread pools, the asynchronous programming model, Visual Studio 2010 debugging, and parallel testing and tuning. Explores vectorization, SIMD instructions, and additional parallel libraries. Master the tools and technology you need to develop thread-safe concurrent applications for multi-core systems with Professional Parallel Programming with C#.