The proceedings contain 21 papers from the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPoPP). Topics discussed include data parallel programs; data libraries; data caches; data access; distributed and shared memory multiprocessors; dataflow analysis; scheduling; optimization; and synchronization.
ISBN: 9798400714436 (print)
The proceedings contain 49 papers. The topics discussed include: Semi-StructMG: a fast and scalable semi-structured algebraic multigrid; LibRTS: a spatial indexing library by ray tracing; high-performance visual semantics compression for AI-driven science; COMPSO: optimizing gradient compression for distributed training with second-order optimizers; TurboFFT: co-designed high-performance and fault-tolerant fast Fourier transform on GPUs; Helios: efficient distributed dynamic graph sampling for online GNN inference; triangle counting on tensor cores; AC-Cache: a memory-efficient caching system for small objects via exploiting access correlations; magneto: accelerating parallel structures in DNNs via co-optimization of operators; and FlashSparse: minimizing computation redundancy for fast sparse matrix multiplications on tensor cores.
The proceedings contain 25 papers. Topics discussed include data and task parallelism, irregular applications, coherence protocols, shared memory, compilers, and performance issues.
ISBN: 0897915895 (print)
The symposium materials contain 26 papers covering the spectrum from models of parallel computing to implementation techniques, and from compilation algorithms to application development tools and case studies, thus satisfying the goal of broadly covering the active areas of parallel programming research.
ISBN: 9781450392044 (print)
The proceedings contain 46 papers. The topics discussed include: stream processing with dependency-guided synchronization; mashup: making serverless computing useful for HPC workflows via hybrid execution; parallel block-delayed sequences; near-optimal sparse Allreduce for distributed deep learning; Vapro: performance variance detection and diagnosis for production-run parallel applications; interference relation-guided SMT solving for multi-threaded program verification; extending the limit of molecular dynamics with ab initio accuracy to 10 billion atoms; scaling graph traversal to 281 trillion edges with 40 million cores; asymmetry-aware scalable locking; the performance power of software combining in persistence; and multi-queues can be state-of-the-art priority schedulers.
ISBN: 9781450382946 (print)
The proceedings contain 48 papers. The topics discussed include: efficient algorithms for persistent transactional memory; investigating the semantics of futures in transactional memory systems; constant-time snapshots with applications to concurrent data structures; reasoning about recursive tree traversals; synthesizing optimal collective algorithms; scaling implicit parallelism via dynamic control replication; efficiently reclaiming memory in concurrent search data structures while bounding wasted memory; are dynamic memory managers on GPUs slow? a survey and benchmarks; improving communication by optimizing on-node data movement with data layout; and Sparta: high-performance, element-wise sparse tensor contraction on heterogeneous memory.
ISBN: 9798400700156 (print)
The proceedings contain 43 papers. The topics discussed include: provably good randomized strategies for data placement in distributed key-value stores; provably fast and space-efficient parallel biconnectivity; practically and theoretically efficient garbage collection for multiversioning; fast and scalable channels in Kotlin coroutines; high-performance GPU-to-CPU transpilation and optimization via high-level parallel constructs; lifetime-based optimization for simulating quantum circuits on a new Sunway supercomputer; merchandiser: data placement on heterogeneous memory for task-parallel HPC applications with load-balance awareness; visibility algorithms for dynamic dependence analysis and distributed coherence; Block-STM: scaling blockchain execution by turning ordering curse to a performance blessing; TDC: towards extremely efficient CNNs on GPUs via hardware-aware Tucker decomposition; and improving energy saving of one-sided matrix decompositions on CPU-GPU heterogeneous systems.
The proceedings contain 46 papers. The topics discussed include: kite: efficient and available release consistency for the datacenter; Oak: a scalable off-heap allocated key-value map; optimizing batched Winograd convolution on GPUs; taming unbalanced training workloads in deep learning with partial collective operations; scalable top-K retrieval with Sparta; waveSZ: a hardware-algorithm co-design of efficient lossy compression for scientific data; scaling concurrent queues by using HTM to profit from failed atomic operations; a wait-free universal construction for large objects; using sample-based time series data for automated diagnosis of scalability losses in parallel programs; scaling out speculative execution of finite-state machines with parallel merge; and detecting and reproducing error-code propagation bugs in MPI implementations.
ISBN: 9798400704352 (print)
The proceedings contain 44 papers. The topics discussed include: FastFold: optimizing AlphaFold training and inference on GPU clusters; liger: interleaving intra- and inter-operator parallelism for distributed large model inference; optimizing collective communications with error-bounded lossy compression for GPU clusters; OsirisBFT: say no to task replication for scalable byzantine fault tolerant analytics; RELAX: durable data structures with swift recovery; a row decomposition-based approach for sparse matrix multiplication on GPUs; Tetris: accelerating sparse convolution by exploiting memory reuse on GPU; scaling up transactions with slower clocks; towards scalable unstructured mesh computations on shared memory many-cores; AGAthA: fast and efficient GPU acceleration of guided sequence alignment for long read mapping; and shared memory-contention-aware concurrent DNN execution for diversely heterogeneous system-on-chips.
ISBN: 9781450368186 (print)
The proceedings contain 46 papers. The topics discussed include: kite: efficient and available release consistency for the datacenter; oak: a scalable off-heap allocated key-value map; taming unbalanced training workloads in deep learning with partial collective operations; scalable top-k retrieval with sparta; waveSZ: a hardware-algorithm co-design of efficient lossy compression for scientific data; scaling concurrent queues by using HTM to profit from failed atomic operations; a wait-free universal construction for large objects; universal wait-free memory reclamation; and using sample-based time series data for automated diagnosis of scalability losses in parallel programs.