ISBN: 9781450358910 (print)
The proceedings contain 8 papers. The topics discussed include: extreme computing for extreme adaptive optics: the key to finding life outside our solar system;the CLAW DSL: abstractions for performance portable weather and climate models;a parallel solver for graph Laplacians;abstractions and directives for adapting wavefront algorithms to future architectures;distributed, shared-memory parallel triangle counting;MRG8 random number generation for the Exascale era;a massively parallel algorithm for the approximate calculation of inverse p-th roots of large sparse matrices;and balanced graph partition refinement using the graph p-Laplacian.
ISBN: 9781450367707 (print)
The proceedings contain 17 papers. The topics discussed include: accelerating high-resolution weather models with deep-learning hardware;preconditioning nonlinear conjugate gradient with diagonalized quasi-Newton;towards data-driven dynamic surrogate models for ocean flow;a discontinuous Galerkin fast spectral method for multi-species full Boltzmann on streaming multi-processors;numerical simulation of the quantum cascade laser dynamics on parallel architectures;analytical PAW projector functions for reduced bandwidth requirements;towards continuous benchmarking: an automated performance evaluation framework for high performance software;and prediction of time-to-solution in material science simulations using deep learning.
ISBN: 9781450379939 (print)
The proceedings contain 16 papers. The topics discussed include: automatic generation of efficient linear algebra programs;extreme-scale task-based Cholesky factorization toward climate and weather prediction applications;benchmarking of state-of-the-art HPC clusters with a production CFD code;evaluating the influence of hemorheological parameters on circulating tumor cell trajectory and simulation time;load balancing in large scale Bayesian inference;deploying scientific AI networks at petaflop scale on secure large scale HPC production systems with containers;hardware locality-aware partitioning and dynamic load-balancing of unstructured meshes for large-scale scientific applications;performance evaluation of a two-dimensional flood model on heterogeneous high-performance computing architectures;urgent supercomputing of earthquakes: use case for civil protection;and k-dispatch: a workflow management system for the automated execution of biomedical ultrasound simulations on remote computing resources.
ISBN: 9781450385633 (print)
The proceedings contain 17 papers. The topics discussed include: ensuring statistical reproducibility of ocean model simulations in the age of hybrid computing;simulation of droplet dispersion in COVID-19 type pandemics on Fugaku;refactoring the MPS/University of Chicago radiative MHD (MURaM) model for GPU/CPU performance portability using OpenACC directives;stream-AI-MD: streaming AI-driven adaptive molecular simulations for heterogeneous computing platforms;performance optimization and load-balancing modeling for superparametrization by 3D LES;progress towards accelerating the unified model on hybrid multi-core systems;in-situ assessment of device-side compute work for dynamic load balancing in a GPU-accelerated PIC code;and fast scalable implicit solver with convergence of equation-based modeling and data-driven learning: earthquake city simulation on low-order unstructured finite element.
ISBN: 9781450394109 (print)
The proceedings contain 17 papers. The topics discussed include: communication bounds for convolutional neural networks;reducing communication in the conjugate gradient method: a case study on high-order finite elements;MINT: a library that conserves vorticity or lateral fluxes when interpolating vector fields;towards data-driven inference of stencils for discrete differential operators;a task programming implementation for the particle in cell code SMILEI;flow field prediction on large variable sized 2D point clouds with graph convolution;applications of flexible spatial and temporal discretization techniques to a numerical weather prediction model;scalable low-rank factorization using a task-based runtime system with distributed memory;and ChASE - a distributed hybrid CPU-GPU eigensolver for large-scale Hermitian eigenvalue problems.
ISBN: 9798400701900 (print)
The proceedings contain 26 papers. The topics discussed include: causal discovery and optimal experimental design for genome-scale biological network recovery;scaling resolution of gigapixel whole slide images using spatial decomposition on convolutional neural networks;longitudinal effects on plant species involved in agriculture and pandemic emergence undergoing changes in abiotic stress;lessons learned from a performance analysis and optimization of a multiscale cellular simulation;streaming generalized canonical polyadic tensor decompositions;scalable multi-FPGA design of a discontinuous Galerkin shallow-water model on unstructured meshes;hardware-agnostic interactive exascale in situ visualization of particle-in-cell simulations;exploiting symmetries for preconditioning Poisson’s Equation in CFD simulations;FourCastNet: accelerating global high-resolution weather forecasting using adaptive Fourier neural operators;and a massively parallel multi-scale FE2 Framework for multi-trillion degrees of freedom simulations.
ISBN: 9781450350624 (print)
The proceedings contain 13 papers. The topics discussed include: towards the virtual rheometer: high performance computing for the red blood cell microstructure;a computational framework to assess the influence of changes in vascular geometry on blood flow;increasing the efficiency of sparse matrix-matrix multiplication with a 2.5D algorithm and one-sided MPI;evaluation of a directive-based GPU programming approach for high-order unstructured mesh computational fluid dynamics;asynchronous task-based parallelization of algebraic multigrid;scheduling finite difference approximations for DAG-modeled large scale applications;ABCPY: a user-friendly, extensible, and parallel library for approximate Bayesian computation;and parallelized dimensional decomposition for large-scale dynamic stochastic economic models.
ISBN: 9781450341264 (print)
The proceedings contain 13 papers. The topics discussed include: automatic global multiscale seismic inversion: insights into model, data, and workflow management;SWIFT: using task-based parallelism, fully asynchronous communication, and graph partition-based domain decomposition for strong scaling on more than 100,000 cores;performance analysis and optimization of nonhydrostatic icosahedral atmospheric model (NICAM) on the K-computer and TSUBAME2.5;approximate Bayesian computation for granular and molecular dynamics simulations;extreme-scale multigrid components within PETSc;on the robustness and prospects of adaptive BDDC methods for finite element discretizations of elliptic PDEs with high-contrast coefficients;massively parallel hybrid total FETI (HTFETI) solver;an efficient compressible multicomponent flow solver for heterogeneous CPU/GPU architectures;adaptive optics simulation for the world's largest telescope on multicore architectures with multiple GPUs;benefits of SMT and of parallel transpose algorithm for the large-scale GYSELA application;a generic C++ library for multilevel quasi-Monte Carlo;and context matters: distributed graph algorithms and runtime systems.
Based on a study of the HDFS architecture, a solution for a Hadoop-based technology resource big data platform is proposed. This paper introduces the overall framework, design and implementation, and technology resou...
We develop a user-friendly, high-performance Material Point Method (MPM) solver, ***, using the Julia language. The dual storage and interaction between material particles and the background grid limits the use of high-resolution models, and few solutions offer both performance and ease of use. Our backend-agnostic solver leverages GPU computing, enabling high-performance implementations across various platforms with a single codebase. We propose an effective memory throughput approach to evaluate GPU efficiency. The accuracy and efficiency of the solver are validated by four numerical examples, showing a 3.5x speedup on a single-threaded CPU and a 19.4x speedup on a 20-threaded CPU compared to optimized vectorized MATLAB code. Among four tested GPUs, it achieves a maximum average memory bandwidth utilization of 67%, reaching up to 78% (1453 GiB/s) in 2D tests. On the GH200 platform, the GPU shows a 21.3x speedup over its multi-threaded CPU. Our solver can handle approximately 19.7 million double-precision material particles on a consumer-grade GPU with 10 GB memory, while on the GH200, it can manage around 190 million material particles. This study introduces advanced software capabilities along with backend-agnostic framework expertise to the field of geotechnical engineering. By utilizing parallel algorithms specifically designed for MPM and integrating a newly developed performance evaluation approach, it ensures rapid prototyping and seamless transition to production. This implementation empowers researchers and practitioners to promote the widespread adoption of high-resolution MPM models in geotechnical engineering.
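The figures quoted in this abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not define its effective memory throughput metric, so the function `effective_throughput_gib_s` assumes the common definition (bytes that must be read and written, divided by elapsed time). All function names are hypothetical, and "10 GB" is assumed to mean 10^9-byte gigabytes.

```python
GIB = 2**30  # bytes per GiB


def effective_throughput_gib_s(n_read, n_write, n_elems, bytes_per_elem, seconds):
    """Assumed metric: bytes that must be read and written, divided by elapsed time."""
    total_bytes = (n_read + n_write) * n_elems * bytes_per_elem
    return total_bytes / seconds / GIB


def implied_peak_gib_s(measured_gib_s, utilization):
    """Peak bandwidth implied by a measured throughput and a utilization fraction."""
    return measured_gib_s / utilization


def bytes_per_particle(memory_bytes, n_particles):
    """Average memory footprint per material particle."""
    return memory_bytes / n_particles


if __name__ == "__main__":
    # Figures from the abstract: 78% utilization at 1453 GiB/s.
    print(f"implied peak: {implied_peak_gib_s(1453.0, 0.78):.1f} GiB/s")
    # Figures from the abstract: about 19.7 million particles in 10 GB
    # (10^9-byte gigabytes assumed here).
    print(f"per particle: {bytes_per_particle(10e9, 19.7e6):.1f} bytes")
    # Hypothetical kernel: 2 arrays read, 1 written, 1e6 doubles, 1 ms runtime.
    print(f"T_eff: {effective_throughput_gib_s(2, 1, 1_000_000, 8, 1e-3):.2f} GiB/s")
```

Under these assumptions the quoted 78% utilization corresponds to a peak of roughly 1860 GiB/s, and each particle occupies about 500 bytes, or around 63 double-precision values.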