ISBN: 9783030449131 (print)
The proceedings contain 27 papers. The special focus in this conference is on Programming. The topics include: Exploring Type-Level Bisimilarity towards More Expressive Multiparty Session Types; Verifying Visibility-Based Weak Consistency; Local Reasoning for Global Graph Properties; Aneris: A Mechanised Logic for Modular Reasoning about Distributed Systems; Continualization of Probabilistic Programs With Correction; Semantic Foundations for Deterministic Dataflow and Stream Processing; Connecting Higher-Order Separation Logic to a First-Order Outside World; Modular Inference of Linear Types for Multiplicity-Annotated Arrows; RustHorn: CHC-Based Verification for Rust Programs; A First-Order Logic with Frames; Runners in Action; Proving the Safety of Highly-Available Distributed Objects; Solving Program Sketches with Large Integer Values; Modular Relaxed Dependencies in Weak Memory Concurrency; ARMv8-A System Semantics: Instruction Fetch in Relaxed Architectures; Higher-Ranked Annotation Polymorphic Dependency Analysis; ConSORT: Context- and Flow-Sensitive Ownership Refinement Types for Imperative Programs; Mixed Sessions; Higher-Order Spreadsheets with Spilled Arrays; On the Versatility of Open Logical Relations: Continuity, Automatic Differentiation, and a Containment Theorem; Constructive Game Logic; Optimal and Perfectly Parallel Algorithms for On-demand Data-Flow Analysis; Concise Read-Only Specifications for Better Synthesis of Programs with Pointers; Soundness Conditions for Big-Step Semantics; Liberate Abstract Garbage Collection from the Stack by Decomposing the Heap; SMT-Friendly Formalization of the Solidity Memory Model.
ISBN: 9781450345934 (print)
The proceedings contain 45 papers. The topics discussed include: randomized composable coresets for matching and vertex cover; almost optimal streaming algorithms for coverage problems; bicriteria distributed submodular maximization in a few rounds; on energy conservation in data centers; asymptotically optimal approximation algorithms for coflow scheduling; online flexible job scheduling for minimum span; minimizing total weighted flow time with calibrations; brief announcement: scheduling parallelizable jobs online to maximize throughput; brief announcement: a new improved bound for coflow scheduling; and a communication-avoiding parallel algorithm for the symmetric eigenvalue problem.
ISBN: 9781450345934 (print)
Discovering which code sections in a sequential program can be made to run in parallel is the first step in parallelizing it, and programmers routinely struggle with this step. Most current parallelism discovery techniques focus on specific language constructs when trying to identify such code sections. In contrast, we propose to concentrate on the computations performed by a program. In our approach, a program is treated as a collection of computations communicating with one another through a number of variables. Each computation is represented as a Computational Unit (CU). A CU contains the inputs and outputs of a computation and its three phases: read, compute, and write. Based on the notion of the CU, we present a unified framework to identify both loop and task parallelism in sequential programs.
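The CU idea can be illustrated with a minimal sketch. The abstract does not say how CUs are built or how dependences between them are analyzed, so everything below is an assumption made for illustration: the `ComputationalUnit` class, the `depends_on` test (a plain read/write set intersection), and the `independent_pairs` helper are hypothetical names, not the authors' framework. The pairwise check also ignores transitive dependences.

```python
from dataclasses import dataclass


@dataclass
class ComputationalUnit:
    """Hypothetical CU record: variables read on entry and written on exit."""
    name: str
    reads: frozenset   # inputs consumed in the read phase
    writes: frozenset  # outputs produced in the write phase


def depends_on(later, earlier):
    """True if `later` must stay ordered after `earlier`.

    Assumed test: any read-after-write, write-after-read, or
    write-after-write overlap on a shared variable.
    """
    raw = earlier.writes & later.reads
    war = earlier.reads & later.writes
    waw = earlier.writes & later.writes
    return bool(raw or war or waw)


def independent_pairs(cus):
    """Name pairs of CUs (in program order) with no direct data dependence."""
    pairs = []
    for i, first in enumerate(cus):
        for second in cus[i + 1:]:
            if not depends_on(second, first):
                pairs.append((first.name, second.name))
    return pairs


# Three CUs for: t1 = f(x); t2 = g(y); z = h(t1, t2)
example = [
    ComputationalUnit("a", frozenset({"x"}), frozenset({"t1"})),
    ComputationalUnit("b", frozenset({"y"}), frozenset({"t2"})),
    ComputationalUnit("c", frozenset({"t1", "t2"}), frozenset({"z"})),
]
print(independent_pairs(example))
```

In this example only `a` and `b` share no variables, so they are candidates for task parallelism, while `c` has to wait for both.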
ISBN: 9781538675571 (print)
The proceedings contain 16 papers. The topics discussed include: synthesis of full hardware implementation of RTOS-based systems; estimating the impact of architectural and software design choices on dynamic allocation of heterogeneous memories; constraint specification language (CSL) use in the test program generator; rapid prototyping of parameterized rotated and cyclic Q delayed constellations demapper; assessment of a BeagleBone Black high sampling rate digital waveform generator; semantics-directed prototyping of hardware runtime monitors; prototyping energy harvesting powered systems with nonvolatile processor; rapid prototyping of embedded vision systems: embedding computer vision applications into low-power heterogeneous architectures; and ambient intelligence for the internet of things through context-awareness.
Embedded vision is a disruptive new technology in the vision industry. It is a revolutionary concept with far reaching implications, and it is opening up new applications and shaping the future of entire industries. I...
ISBN: 9781450342100 (print)
The proceedings contain 52 papers. The topics discussed include: randomized approximate nearest neighbor search with limited adaptivity; encoding short ranges in TCAM without expansion: efficient algorithm and applications; extending the nested parallel model to the nested dataflow model with provably efficient schedulers; latency-hiding work stealing: scheduling interacting parallel computations with work stealing; provably good and practically efficient parallel race detection for fork-join programs; dynamic determinacy race detection for task parallelism with futures; RUBIC: online parallelism tuning for co-located transactional memory applications; investigating the performance of hardware transactions on a multi-socket machine; parallel algorithms for asymmetric read-write costs; general profit scheduling and the power of migration on heterogeneous machines; the power of migration in online machine minimization; fair online scheduling for selfish jobs on heterogeneous machines; scheduling parallelizable jobs online to minimize the maximum flow time; clairvoyant dynamic bin packing for job scheduling with minimum server usage time; parallel equivalence class sorting: algorithms, lower bounds, and distribution-based analysis; and parallel approaches to the string matching problem on the GPU.
ISBN: 9781538648193 (print)
The proceedings contain 19 papers. The topics discussed include: energy consumption improvement of shared-cache multicore clusters based on explicit simultaneous multithreading; performance and energy analysis of OpenMP runtime systems with dense linear algebra algorithms; a case study of performance optimization in a heterogeneous environment; tuning up TVD HOPMOC method on Intel MIC Xeon Phi architectures with Intel parallel studio tools; comparing performance of C compilers optimizations on different multicore architectures; HPSM: a programming framework for multi-CPU and multi-GPU systems; assessing sparse triangular linear system solvers on GPUs; automatic partitioning of stencil computations on heterogeneous systems; strategies to improve the performance of a geophysics model for different manycore systems; parallel algorithm for dynamic community detection; efficient in-situ quantum computing simulation of Shor's and Grover's algorithms; a parallel algorithm for minimum spanning tree on GPU; acceleration of cellular automata through parallel computing with OpenCL; a dataflow implementation of region growing method for cracks segmentation; automatic scan parallelization in OpenMP; impact of version management for transactional memories on phase-change memories; efficient pathfinding co-processors for FPGAs; and a communication protocol for fog computing based on network coding applied to wireless sensors.
ISBN: 9781509012336 (print)
The proceedings contain 23 papers. The topics discussed include: extending OmpSs for OpenCL kernel co-execution in heterogeneous systems; data coherence analysis and optimization for heterogeneous computing; exploring heterogeneous mobile architectures with a high-level programming model; scalability of CPU and GPU solutions of the prime elliptic curve discrete logarithm problem; overcoming memory-capacity constraints in the use of ILUPACK on graphics processors; exploiting data compression to mitigate aging in GPU register files; SEDEA: a sensible approach to account DRAM energy in multicore systems; a user-level scheduling framework for BoT applications on private clouds; GC-CR: a decentralized garbage collector component for checkpointing in clouds; towards a deterministic fine-grained task ordering using multi-versioned memory; FGSCM: a fine-grained approach to transactional lock elision; a machine learning approach for performance prediction and scheduling on heterogeneous CPUs; object placement for high bandwidth memory augmented with high capacity memory; accelerating graph analytics on CPU-FPGA heterogeneous platform; online multimedia similarity search with response time-aware parallelism and task granularity auto-tuning; a publish/subscribe system using causal broadcast over dynamically built spanning trees; global snapshot of a distributed system running on virtual machines; and resource-management study in HPC runtime-stacking context.
ISBN: 9781450335881 (print)
The proceedings contain 42 papers. The topics discussed include: sorting with asymmetric read and write costs; myths and misconceptions about threads; new streaming algorithms for parameterized maximal matching & beyond; efficient approximation algorithms for computing k disjoint restricted shortest paths; fast and better distributed MapReduce algorithms for k-center clustering; fair adaptive parallelism for concurrent transactional memory applications; towards a universal approach for the finite departure problem in overlay networks; a compiler-runtime application binary interface for pipe-while loops; the Cilkprof scalability profiler; efficiently detecting races in Cilk programs that use reducer hyperobjects; ThreadScan: automatic and scalable memory reclamation; temporal fairness of round robin: competitive analysis for Lk-norms of flow time; and space and time efficient parallel graph decomposition, clustering, and diameter approximation.
This paper presents a parallel Motion Estimation (ME) solution for video coding on a heterogeneous System-on-Chip (SoC), with two implementation versions: an OpenCL-based version targeting embedded GPGPUs and a hardware design targeting an embedded FPGA device. The current work considers a heterogeneous SoC composed of a variety of distinct processing units such as CPU, DSP, memory, GPGPU, and FPGA, where the FPGA component supports dynamic reconfiguration. The two versions implement a parallelism-oriented algorithm and provide two performance/energy operating points, allowing flexibility for dynamic power management according to runtime scenarios. The solution presented in this paper uses a scheme to reduce the number of operations required to compute the Sum of Absolute Differences (SAD) for the evaluated candidate blocks. This scheme is based on the accumulation of previously calculated SADs, taking the 8×8 Prediction Units (PUs) as base blocks, to generate the SAD for larger PUs. The proposed solution was evaluated on two platforms: (1) an Odroid XU-3, with a Samsung Exynos 5422 SoC, featuring a 64-core Mali-T628 MP6 GPGPU, and (2) an FPGA device. The performance and energy consumption results show that the FPGA implementation is able to process 49 HD 1080p frames per second with a 1000× increase in energy efficiency compared to the GPGPU implementation.
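The SAD accumulation idea can be sketched as follows. This is a minimal illustration, not the paper's OpenCL or FPGA design: it assumes square blocks only, frame dimensions that are multiples of the block sizes involved, and a single candidate displacement already applied to the reference frame. The function names `sad_block`, `base_sads`, and `merge_sads` are hypothetical. It relies only on the fact that, for a fixed displacement, the SAD of a 16×16 block equals the sum of the SADs of its four 8×8 quadrants.

```python
def sad_block(cur, ref, y0, x0, size=8):
    """Directly computed SAD of the size x size block with top-left (y0, x0)."""
    return sum(
        abs(cur[y][x] - ref[y][x])
        for y in range(y0, y0 + size)
        for x in range(x0, x0 + size)
    )


def base_sads(cur, ref, base=8):
    """Grid of SADs, one per base x base block of the frame."""
    height, width = len(cur), len(cur[0])
    return [
        [sad_block(cur, ref, y, x, base) for x in range(0, width, base)]
        for y in range(0, height, base)
    ]


def merge_sads(grid):
    """Add each 2x2 group of SADs to get the SADs of blocks twice as large.

    No pixel is revisited: only previously computed SADs are accumulated.
    """
    return [
        [
            grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
            for c in range(0, len(grid[0]), 2)
        ]
        for r in range(0, len(grid), 2)
    ]


# Synthetic 32x32 frames (arbitrary values, for illustration only).
cur = [[(3 * y + 5 * x) % 17 for x in range(32)] for y in range(32)]
ref = [[(7 * y + x) % 13 for x in range(32)] for y in range(32)]

sad8 = base_sads(cur, ref)   # 4x4 grid of 8x8 SADs
sad16 = merge_sads(sad8)     # 2x2 grid of 16x16 SADs
sad32 = merge_sads(sad16)    # 1x1 grid holding the 32x32 SAD
```

Each merge step costs three additions per larger block, instead of one absolute difference and one addition per pixel when the larger SAD is recomputed from scratch.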