Search Results - Inner Mongolia University Library

IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW)

Authors: Samuel Collinson, Allan Bai, Oliver Sinnen. Affiliation: Department of Electrical, Computer and Software Engineering, Parallel and Reconfigurable Computing Lab, University of Auckland, New Zealand

ISBN: (digital) 9798350364606

ISBN: (print) 9798350364613

Priority queues (PQs) are an essential data structure for many important algorithms. Hardware implementations of priority queues can accelerate such algorithms significantly. Usually a choice has to be made between very fast (fixed-size) hardware PQs and scalable PQs. Fast hardware queues can provide single-cycle push and pop operations, but are limited in size. Scalable hardware queues use embedded memory to achieve scalability, but thereby compromise on speed. This paper proposes a fast and scalable hardware priority queue for general-purpose use. It is based on a modular and flexible hybrid design, using a shift register queue combined with a heap-based queue. In the best case, i.e. when only the first queue is needed, push and pop operations take only a single cycle. In an experimental evaluation on a Xilinx FPGA we demonstrate that the hybrid queue maintains a good balance between performance and scalability, even for applications where the size of the working set of data can have large variance. We further propose an optimisation of the heap-based queue to support workloads that repeat several consecutive push operations followed by a single pop operation, which is the workload, for instance, in state space search or ray tracing.

Keywords: Distributed processing, Scalability, Sociology, Ray tracing, Shift registers, Data structures, Hardware
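The two-tier idea in the abstract above (a small, fast front queue backed by a scalable heap) can be illustrated in software. The following is only a minimal sketch, not the paper's hardware design: the class name `HybridPriorityQueue`, the parameter `front_capacity`, the spill and refill policy, and the convention that smaller values have higher priority are all assumptions made for illustration.

```python
import bisect
import heapq


class HybridPriorityQueue:
    """Software analogy of a hybrid queue: sorted front tier plus heap overflow.

    Assumption: smaller values mean higher priority.
    """

    def __init__(self, front_capacity=4):
        if front_capacity < 1:
            raise ValueError("front_capacity must be at least 1")
        self.front_capacity = front_capacity
        self.front = []     # kept sorted; stands in for the fixed-size shift register queue
        self.overflow = []  # binary heap; stands in for the memory-backed scalable queue

    def push(self, item):
        # Insert into the sorted front tier first.
        bisect.insort(self.front, item)
        if len(self.front) > self.front_capacity:
            # Front tier is over capacity: spill its largest element to the heap.
            heapq.heappush(self.overflow, self.front.pop())

    def pop(self):
        if not self.front:
            raise IndexError("pop from empty queue")
        item = self.front.pop(0)
        if self.overflow:
            # Refill from the heap; its minimum is >= every front element,
            # so appending keeps the front tier sorted.
            self.front.append(heapq.heappop(self.overflow))
        return item

    def __len__(self):
        return len(self.front) + len(self.overflow)


if __name__ == "__main__":
    q = HybridPriorityQueue(front_capacity=3)
    for value in [5, 1, 9, 3, 7, 2, 8]:
        q.push(value)
    print([q.pop() for _ in range(len(q))])
```

As long as the working set fits in the front tier, the heap is never touched, which loosely mirrors the best case described in the abstract; cycle counts and hardware behaviour are not modelled here.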


A Guaranteed Approximation Algorithm for Scheduling Fork-Joins with Communication Delay


International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Pierre-François Dutot, Yeu-Shin Fu, Nikhil Prasad, Oliver Sinnen. Affiliations: CNRS, Inria, Grenoble INP*, LIG, Univ. Grenoble Alpes, Grenoble, France; Dept. of Electrical, Computer & Software Engineering, Parallel and Reconfigurable Computing Lab, University of Auckland, Auckland, New Zealand

Scheduling task graphs with communication delay is a widely studied NP-hard problem. Many heuristics have been proposed, but there is no constant approximation algorithm for this classic model. In this paper, we focus on the scheduling of the important class of fork-join task graphs (describing many types of common computations) on homogeneous processors. For this sub-case, we propose a guaranteed algorithm with a $\left(1 + \frac{m}{m-1}\right)$-approximation factor, where $m$ is the number of processors. The algorithm is not only the first constant approximation for an important sub-domain of the classic scheduling problem, it is also a practical algorithm that can obtain shorter makespans than known heuristics. To demonstrate this, we propose adaptations of known scheduling heuristics for the specific fork-join structure. In an extensive evaluation, we then implemented these algorithms and scheduled many fork-join graphs with up to thousands of tasks and various computation time distributions on up to hundreds of processors. Comparing the obtained results demonstrates the competitive nature of the proposed approximation algorithm.
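To get a feel for the guarantee quoted above, the factor $1 + \frac{m}{m-1}$ can be evaluated for a few processor counts. This is a small sketch of the arithmetic only, not of the scheduling algorithm itself; the function name `approximation_factor` and the guard for $m < 2$ (where the expression is undefined or meaningless) are assumptions added for illustration.

```python
from fractions import Fraction


def approximation_factor(m):
    """Return 1 + m/(m-1) exactly, for m >= 2 processors."""
    if m < 2:
        # The expression divides by m - 1, so it is undefined for m = 1.
        raise ValueError("the factor 1 + m/(m-1) requires m >= 2")
    return 1 + Fraction(m, m - 1)


if __name__ == "__main__":
    for m in (2, 4, 100):
        factor = approximation_factor(m)
        print(m, factor, float(factor))
```

The factor equals 3 for two processors and decreases towards 2 as the number of processors grows.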


SkyCastle: A Resource-Aware Multi-Loop Scheduler for High-Level Synthesis


IEEE International Conference on Field-Programmable Technology (FPT)

Authors: Julian Oppermann, Lukas Sommer, Lukas Weber, Melanie Reuter-Oppermann, Andreas Koch, Oliver Sinnen. Affiliations: Embedded Systems and Applications Group, Technische Universität Darmstadt, Germany; Discrete Optimization and Logistics Group, Karlsruhe Institute of Technology, Germany; Parallel and Reconfigurable Computing Lab, University of Auckland, New Zealand

A common optimisation problem in the high-level synthesis (HLS) of FPGA-based accelerators is to find a microarchitecture that maximises the performance while keeping the utilisation of the device's low-level resources below certain limits. We propose to tackle it directly as part of the HLS scheduler. To that end, we formalise a general, integrated scheduling and allocation problem for HLS kernels, and present SkyCastle, a novel resource-aware multi-loop scheduler using integer linear programming to solve it for a subclass of kernels composed of multiple, nested loops. In order to demonstrate the practical applicability of the approach, we model the scheduler in such a way as to be plug-in compatible with the Xilinx Vivado HLS engine, allowing the computed solutions to be fed back into its synthesis flow. We evaluate SkyCastle for three non-trivial kernels from the machine learning, signal processing, and physical simulation domains, on two FPGA devices. Additionally, we investigate the replication of slightly slower, but smaller accelerators as a means to further boost the overall performance. In contrast to Vivado HLS' default settings, which aim at maximum performance but may fail in later synthesis steps, the solutions computed by our scheduler always result in synthesisable designs.


Work in Progress: GeMS: A Generator for Modulo Scheduling Problems


Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES)

Authors: Julian Oppermann, Sebastian Vollbrecht, Melanie Reuter-Oppermann, Oliver Sinnen, Andreas Koch. Affiliations: Embedded Systems and Applications Group, Technische Universität Darmstadt, Germany; Discrete Optimization and Logistics Group, Karlsruhe Service Research Institute, Karlsruhe Institute of Technology, Germany; Parallel and Reconfigurable Computing Lab, University of Auckland, New Zealand

GeMS is a customisable, open-source toolkit for generating random, yet constrained, modulo scheduling problems with a known optimal initiation interval. These can then be used to evaluate the behavior of different sch...

Keywords: Processor scheduling, Schedules, Generators, Kernel, Computer architecture, Runtime, Embedded systems


RedLib: Nestable Reductions for Collections in Java


IEEE International Conference on High Performance Computing and Communications

Authors: Mostafa Mehrabi, Xing Fan, Nasser Giacaman, Oliver Sinnen. Affiliation: Department of Electrical and Computer Engineering, The University of Auckland, Parallel and Reconfigurable Computing Lab

ISBN: (print) 9781509042982

A reduction is a parallel programming mechanism for combining two or more elements into one. Many parallel programming languages, tools and frameworks (e.g., OpenMP, MPI) directly support simple forms of reductions (e.g., building a total sum out of partial sums). Some of those tools and frameworks allow more complex reductions to be implemented as custom reductions. However, the success of network-based application frameworks like Hadoop has shown that there is a strong need for reductions of aggregate data structures, such as the union of sets or maps. Usually parallel programming frameworks on shared-memory systems do not support these types of complex reductions directly, and a user needs to implement them manually. To address this gap, this paper proposes an object-oriented reduction framework that supports reductions of aggregate types, and proposes the nesting of reduction objects for flexible extensions of reduction operations. Based on the proposed framework, a reduction library (RedLib) has been developed for Java with direct support for many common reduction operations on collections and maps. Furthermore, the paper studies the usage of the framework for common and complex cases and evaluates its performance, based on operations found in standard benchmarks.

Keywords: object oriented reductions, shared memory reductions, nestable reductions, aggregate reductions, collections, maps
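The notion of nestable reductions over aggregate types from the abstract above can be sketched with composable reducer functions. This is an illustrative Python sketch, not the RedLib Java API: the names `sum_reducer`, `set_union_reducer` and `map_reducer` are hypothetical, and the per-thread partial results are combined sequentially with `functools.reduce` rather than by a parallel runtime.

```python
from functools import reduce


def sum_reducer(a, b):
    """Combine two numbers into their sum."""
    return a + b


def set_union_reducer(a, b):
    """Combine two sets into their union."""
    return a | b


def map_reducer(value_reducer):
    """Build a reducer for dicts; values under the same key are merged
    with value_reducer, which may itself be another map_reducer (nesting)."""
    def merge(left, right):
        result = dict(left)  # copy so the inputs are left unmodified
        for key, value in right.items():
            if key in result:
                result[key] = value_reducer(result[key], value)
            else:
                result[key] = value
        return result
    return merge


if __name__ == "__main__":
    # Hypothetical per-thread partial results: map of key -> set of ids.
    partial_sets = [{"a": {1, 2}, "b": {3}}, {"a": {2, 4}}, {"b": {5}, "c": {6}}]
    print(reduce(map_reducer(set_union_reducer), partial_sets))

    # Two levels of nesting: map of document -> map of word -> count.
    partial_counts = [{"doc1": {"x": 1}}, {"doc1": {"x": 2, "y": 1}}]
    print(reduce(map_reducer(map_reducer(sum_reducer)), partial_counts))
```

Passing one reducer as the value reducer of another is what makes the composition nestable: a map-of-maps reduction needs no extra code beyond wrapping `map_reducer` twice.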


Pulsar Searches with the SKA


Proceedings of the International Astronomical Union, 2017, Vol. 13, No. S337, pp. 171-174

Authors: Levin, L.; Armour, W.; Baffa, C.; Barr, E.; Cooper, S.; Eatough, R.; Ensor, A.; Giani, E.; Karastergiou, A.; Karuppusamy, R.; Keith, M.; Kramer, M.; Lyon, R.; Mackintosh, M.; Mickaliger, M.; Van Nieuwpoort, R.; Pearson, M.; Prabu, T.; Roy, J.; Sinnen, O.; Spitler, L.; Spreeuw, H.; Stappers, B.W.; Van Straten, W.; Williams, C.; Wang, H.; Wiesner, K. Affiliations: Jodrell Bank Centre for Astrophysics, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom; University of Oxford, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH, United Kingdom; INAF Osservatorio Astrofisico di Arcetri, Largo E. Fermi 5, Firenze 50125, Italy; Max-Planck-Institut für Radioastronomie, Auf dem Hügel 69, Bonn D-53121, Germany; Institute for Radio Astronomy and Space Research, Auckland University of Technology, Private Bag 92006, Auckland 1142, New Zealand; Science and Technology Facilities Council, Polaris House, North Star Avenue, Swindon SN2 1SZ, United Kingdom; Netherlands Institute for Radio Astronomy (ASTRON), Postbus 2, Dwingeloo NL-7990 AA, Netherlands; NCRA-TIFR, Pune University Campus, Pune 411007, India; Parallel and Reconfigurable Computing (PARC) Lab, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand

The Square Kilometre Array will be an amazing instrument for pulsar astronomy. While the full SKA will be sensitive enough to detect all pulsars in the Galaxy visible from Earth, already with SKA1, pulsar searches will discover enough pulsars to increase the currently known population by a factor of four, no doubt including a range of amazing unknown sources. Real time processing is needed to deal with the 60 PB of pulsar search data collected per day, using a signal processing pipeline required to perform more than 10 POps. Here we present the suggested design of the pulsar search engine for the SKA and discuss challenges and solutions to the pulsar search venture. © 2018 International Astronomical Union.

Keywords: (stars:) pulsars: general, methods: data analysis, telescopes

