ISBN: 9783031226977 (print)
The proceedings contain 12 papers. The special focus of this conference is on Job Scheduling Strategies for Parallel Processing. The topics include: Optimization of Execution Parameters of Moldable Ultrasound Workflows Under Incomplete Performance Data; Scheduling of Elastic Message Passing Applications on HPC Systems; Preface; On the Feasibility of Simulation-Driven Portfolio Scheduling for Cyberinfrastructure Runtime Systems; Improving Accuracy of Walltime Estimates in PBS Professional Using Soft Walltimes; Re-making the Movie-Making Machine; Using Kubernetes in Academic Environment: Problems and Approaches; AI-Job Scheduling on Systems with Renewable Power Sources; Toward Building a Digital Twin of Job Scheduling and Power Management on an HPC System; Encoding for Reinforcement Learning Driven Scheduling.
ISBN: 9781538634080 (print)
The proceedings contain 173 papers. The topics discussed include: portable implementation of advanced driver-assistance algorithms on heterogeneous architectures; improving CPU performance through dynamic GPU access throttling in CPU-GPU heterogeneous processors; alternative processor within threshold: flexible scheduling on heterogeneous systems; preemptive resource management for dynamically arriving tasks in an oversubscribed heterogeneous computing system; modeling of applications and hardware to explore task mapping and scheduling strategies on a heterogeneous micro-server system; consumer-and-provider-oriented efficient IaaS resource allocation; a pipelined and scalable dataflow implementation of convolutional neural networks on FPGA; on-chip memory based binarized convolutional deep neural network applying batch normalization free technique on an FPGA; automatic flow selection and quality-of-result estimation for FPGA placement; exploiting decoupled OpenCL work-items with data dependencies on FPGAs: a case study; ReEP: a toolset for generation and programming of reconfigurable datapaths for event processing; a generic approach to the development of coprocessors for elliptic curve cryptosystems; a hardware acceleration for surface EMG non-negative matrix factorization; and on-FPGA real-time processing of biological signals from high-density MEAs: a design space exploration.
ISBN: 9798350311990 (print)
This paper describes two active learning strategies to teach and review fundamental parallel and distributed computing (PDC) concepts in early computer science courses. Questions were created based on eight PDC concept categories. In the first phase, flashcards were created for students to review the concepts. In the second phase, a card game called PDC Quest was created to allow groups of students to engage collaboratively in learning and reviewing the concepts.
ISBN: 9781665481069 (print)
The proceedings contain 123 papers. The topics discussed include: challenges and opportunities in designing high-performance and scalable middleware for HPC and AI: past, present, and future; HTS: a threaded multilevel sparse hybrid solver; a scalable adaptive-matrix SpMV for heterogeneous architectures; direct solution of larger coupled sparse/dense linear systems using low-rank compression on single-node multi-core machines in an industrial context; distributed-memory sparse kernels for machine learning; FAM-Graph: graph analytics on disaggregated memory; scalable multi-versioning ordered key-value stores with persistent memory support; in-memory indexed caching for distributed data processing; Landau collision operator in the CUDA programming model applied to thermal quench plasmas; exploiting reduced precision for GPU-based time series mining; and MICCO: an enhanced multi-GPU scheduling framework for many-body correlation functions.
ISBN: 9798350311990 (print)
The presentation of Peachy Parallel Assignments at parallel and distributed computing education workshops is an effort to promote the reuse of high-quality assignments, both saving precious faculty time and improving the quality of course assignments. These assignments must have been used in class and are selected for being easy to adopt by other instructors and for being "cool and inspirational" so that students spend time on them and talk about them with others. The assignments and their materials are also archived on the Peachy Parallel Assignments website. In this paper, we present two new assignments. The first has students implement the Mandelbrot set in Python, combining an interesting image with Python's ease of use. The second assignment is a substantial project to implement a programming contest judge. It requires that students use many parallel and distributed computing concepts, with the added benefit of solving a "real problem" and creating software with which students may have personally interacted.
ISBN: 9798350311990 (print)
This paper presents an algorithm for detecting attributed high-degree node isomorphism. High-degree isomorphic nodes seldom happen by chance and often represent duplicated entities or data processing errors. By definition, isomorphic nodes are topologically indistinguishable and can be problematic in graph ML tasks. The algorithm employs a parallel, "degree-bounded" approach that fingerprints each node's local properties through a hash, which constrains the search to nodes within hash-defined buckets, thus minimising the number of comparisons. This method scales on graphs with billions of nodes and edges. Finally, we provide isomorphic node oddities identified in real-world data.
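The bucketing idea in this abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes that two nodes count as isomorphic when they share the same attribute and the same neighbor set, it runs serially (each bucket is independent, so buckets could be checked in parallel), and every name in it (`fingerprint`, `isomorphic_groups`, `min_degree`, the toy graph) is hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(node, adj, attrs):
    """Cheap hash of a node's local properties (assumed: attribute, degree, neighbors)."""
    acc = 0
    for nb in adj[node]:
        # XOR of per-neighbor digests is independent of iteration order.
        digest = hashlib.blake2b(repr(nb).encode(), digest_size=8).digest()
        acc ^= int.from_bytes(digest, "big")
    return (attrs.get(node), len(adj[node]), acc)


def isomorphic_groups(adj, attrs, min_degree=2):
    """Return groups of high-degree nodes with equal attribute and neighbor set."""
    # Step 1: bucket only nodes at or above the degree bound by fingerprint.
    buckets = defaultdict(list)
    for node, nbrs in adj.items():
        if len(nbrs) >= min_degree:
            buckets[fingerprint(node, adj, attrs)].append(node)

    # Step 2: exact comparison, restricted to nodes sharing a bucket.
    groups = []
    for bucket in buckets.values():
        exact = defaultdict(list)
        for node in bucket:
            exact[(attrs.get(node), frozenset(adj[node]))].append(node)
        groups.extend(sorted(g) for g in exact.values() if len(g) > 1)
    return sorted(groups)


# Toy bipartite graph: "a" and "b" are duplicates, as are "x" and "y".
adj = {
    "a": {"x", "y", "z"}, "b": {"x", "y", "z"}, "c": {"x", "y"},
    "x": {"a", "b", "c"}, "y": {"a", "b", "c"}, "z": {"a", "b"},
}
attrs = {"a": "user", "b": "user", "c": "user",
         "x": "item", "y": "item", "z": "item"}

print(isomorphic_groups(adj, attrs, min_degree=3))  # [['a', 'b'], ['x', 'y']]
```

The fingerprint only narrows the candidates. The exact check in the second step guards against hash collisions, so a collision costs extra comparisons but never produces a false match.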
ISBN: 9798350311990 (print)
Given the ubiquity of parallel computing hardware, we introduced parallel programming with pictures to the block-based Snap! environment and called it pSnap!, short for parallel Snap! We then created an accessible curriculum for students of all ages to learn how to program serially and then how to program with explicit parallelism. This paper presents a new and innovative extension to our curriculum on parallel programming with pSnap!, one that broadens its appeal to the masses by teaching the application of parallel programming as a "choose your own learning adventure" activity, inspired by the Choose Your Own Adventure book series of the 1980s and 1990s. Specifically, after students learn the basics of parallel programming with pictures, they are ready to choose their next learning adventure, which applies their newfound parallel programming skills to create a video game of their choice, i.e., Missile Command or Do You Want to Build a Snowman?
ISBN: 9798350311990 (print)
Functional, memory-managed parallel languages (FMPLs) are a recent innovative approach to shared-memory parallel programming. Despite their rising prevalence in other areas, FMPLs have yet to gain traction in HPC. In this work, we explore the utility of FMPLs for HPC by re-implementing the NAS Parallel Benchmarks in an FMPL. For this study, we ported the benchmarks into the Parallel ML language. We discuss the advantages and disadvantages of using Parallel ML for HPC applications based on our development experience. We compare the performance of our Parallel ML implementation to the existing C/OpenMP version. The FMPL implementations are 1.02x to 5.76x slower than OpenMP. Our positive development experience, combined with some competitive performance results, suggests that FMPLs have the potential to become a viable choice for HPC applications. We conclude by describing our future work to automatically manage distributed memory within an FMPL, creating a compelling new programming model for HPC.
ISBN: 9798350311990 (print)
Fractal-based decomposition is a flexible framework representing a family of optimization algorithms based on a hierarchical decomposition of the search space. We have built a software package called Zellij, in which we were able to instantiate popular decomposition-based algorithms. Our goal is to tackle optimization problems characterized by computationally expensive objective functions and high-dimensional search spaces. In this paper, we propose a generic asynchronous parallel methodology for fractal-based optimization algorithms in multi-node, multi-CPU distributed environments. Experimental results show that the asynchronous version significantly reduces computation time compared with the mono-threaded version. The obtained results are also analyzed according to the various search components, such as tree search, exploration, and exploitation strategies.