ISBN: 9798400714436 (print)
Molecular dynamics simulation has emerged as an important area in which HPC+AI helps investigate physical properties, using machine-learning interatomic potentials (MLIPs). General-purpose machine-learning (ML) tools have been leveraged in MLIPs, but the two are not perfectly matched: ML tools miss many optimization opportunities in MLIPs. This inefficiency arises because HPC+AI applications involve far more computational complexity than pure AI scenarios. This paper develops an MLIP, named TensorMD, independently of any ML tool. TensorMD has been evaluated on two supercomputers and scaled to 51.8 billion atoms, approximately 3x the state of the art.
The proceedings contain 21 papers from the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPoPP). Topics discussed include data parallel programs; data libraries; data caches; data access; distributed and shared memory multiprocessors; dataflow analysis; scheduling; optimization; and synchronization.
The proceedings contain 25 papers. Topics discussed include data and task parallelism, irregular applications, coherence protocols, shared memory, compilers, and performance issues.
ISBN: 9781595939609 (print)
The proceedings contain 42 papers. The topics discussed include: automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories; type inference for locality analysis of distributed data structures; quasi-static scheduling for safe futures; scalable packet classification using interpreting: a cross-platform multi-core solution; FastForward for efficient pipeline parallelism: a cache-optimized concurrent lock-free queue; matrix product on heterogeneous master-worker platforms; high performance dense linear algebra on a spatially distributed processor; optimization principles and application performance evaluation of a multithreaded GPU using CUDA; a case study in SIMD text processing with parallel bit streams: UTF-8 to UTF-16 transcoding; programming with tiles; design and implementation of a high-performance MPI for C# and the common language infrastructure; and a portable runtime interface for multi-level memory hierarchies.
ISBN: 0897915895 (print)
The symposium materials contain 26 papers covering the spectrum from models of parallel computing to implementation techniques, and from compilation algorithms to application development tools and case studies, thus satisfying the goal of broadly covering the active areas of parallel programming research.
The proceedings contain 14 papers from the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). Topics discussed include: reference idempotency analysis: a framework for optimizing speculative execution; pointer and escape analysis for multithread programs; language support for Morton-order matrices; efficient load balancing for wide-area divide-and-conquer applications; scalable queue-based spin locks with timeout; contention elimination by replication of sequential sections in distributed shared memory programs; and accurate data redistribution cost estimation in software distributed shared memory systems.
ISBN: 0897915895 (print)
The proceedings contain 26 papers. The topics discussed include: LogP: towards a realistic model of parallel computation; exploiting task and data parallelism on a multicomputer; ActorSpace: an open distributed programming paradigm; experiences using the ParaScope editor: an interactive parallel programming tool; perturbation analysis of high level instrumentation for SPMD programs; integrating message-passing and shared-memory: early experience; using scheduler information to achieve optimal barrier synchronization performance; and a concurrent copying garbage collector for languages that distinguish (im)mutable data.
ISBN: 9781450392044 (print)
The proceedings contain 46 papers. The topics discussed include: stream processing with dependency-guided synchronization; mashup: making serverless computing useful for HPC workflows via hybrid execution; parallel block-delayed sequences; near-optimal sparse Allreduce for distributed deep learning; Vapro: performance variance detection and diagnosis for production-run parallel applications; interference relation-guided SMT solving for multi-threaded program verification; extending the limit of molecular dynamics with ab initio accuracy to 10 billion atoms; scaling graph traversal to 281 trillion edges with 40 million cores; asymmetry-aware scalable locking; the performance power of software combining in persistence; and multi-queues can be state-of-the-art priority schedulers.
ISBN: 9781450326568 (print)
The proceedings contain 43 papers. The topics discussed include: predator: predictive false sharing detection; concurrency testing using schedule bounding: an empirical study; trace driven dynamic deadlock detection and reproduction; efficient search for inputs causing high floating-point errors; portable, MPI-interoperable coarray Fortran; eliminating global interpreter locks in Ruby through hardware transactional memory; leveraging hardware message passing for efficient thread synchronization; well-structured futures and cache locality; time-warp: lightweight abort minimization in transactional memory; beyond parallel programming with domain specific languages; a decomposition for in-place matrix transposition; in-place transposition of rectangular matrices on accelerators; and parallelizing dynamic programming through rank convergence.
ISBN: 9781450382946 (print)
The proceedings contain 48 papers. The topics discussed include: efficient algorithms for persistent transactional memory; investigating the semantics of futures in transactional memory systems; constant-time snapshots with applications to concurrent data structures; reasoning about recursive tree traversals; synthesizing optimal collective algorithms; scaling implicit parallelism via dynamic control replication; efficiently reclaiming memory in concurrent search data structures while bounding wasted memory; are dynamic memory managers on GPUs slow? a survey and benchmarks; improving communication by optimizing on-node data movement with data layout; and Sparta: high-performance, element-wise sparse tensor contraction on heterogeneous memory.