Simulations of reacting multiphase flows tend to display an inhomogeneously distributed computational intensity over the spatial and temporal domains. The time-to-solution of chemical reaction rates can span multiple orders of magnitude due to the emergence of combustible kernels and thin turbulent reaction zones. Similarly, the time to solve the equation of state (EoS) for non-ideal fluid mixtures varies substantially between grid cells. These effects result in a performance profile that is unbalanced and, for transient simulations, rapidly changing, and therefore beyond the capabilities of traditional (quasi-)static mesh partitioning methods. We analyse this loss of parallel efficiency for large-eddy simulations of the ECN Spray-A benchmark with the multi-physics solver INCA and propose to mitigate the problem by introducing two independent repartitioning stages in addition to the classic domain decomposition for fluid transport: one for the EoS and one for chemical reactions. We explore various scalable repartitioning strategies in this context and observe that rebalancing the computational load yields a significant speedup that is robust across mesh resolutions and process counts. The dynamic multistage load balancing thus effectively removes obstacles to good parallel scaling of INCA and similar solvers for reacting and/or multiphase flows.
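The repartitioning idea can be illustrated with a minimal sketch. It assumes a measured per-cell cost (for example, the time spent on chemistry in the previous step) and redistributes cells with the greedy longest-processing-time heuristic. This heuristic, the function names `rebalance` and `imbalance`, and the cost values are illustrative assumptions; they are not the strategies implemented in INCA.

```python
import heapq

def rebalance(costs, n_ranks):
    """Assign each cell to a rank, heaviest cells first, always choosing
    the currently least-loaded rank (greedy LPT heuristic; an assumption,
    not the method from the abstract)."""
    heap = [(0.0, rank) for rank in range(n_ranks)]  # (load, rank)
    heapq.heapify(heap)
    owner = [0] * len(costs)
    for cell in sorted(range(len(costs)), key=lambda c: -costs[c]):
        load, rank = heapq.heappop(heap)
        owner[cell] = rank
        heapq.heappush(heap, (load + costs[cell], rank))
    return owner

def imbalance(costs, owner, n_ranks):
    """Ratio of the maximum rank load to the mean rank load (1.0 is ideal)."""
    loads = [0.0] * n_ranks
    for cost, rank in zip(costs, owner):
        loads[rank] += cost
    return max(loads) / (sum(loads) / n_ranks)

# Hypothetical per-cell costs: four expensive reacting cells, four cheap ones.
costs = [10, 10, 10, 10, 1, 1, 1, 1]
static_owner = [0, 0, 0, 0, 1, 1, 1, 1]  # contiguous (static) block partition
print(imbalance(costs, static_owner, 2))
print(imbalance(costs, rebalance(costs, 2), 2))
```

In this toy case the static block partition leaves one rank with almost all of the work, whereas the cost-aware assignment spreads the expensive cells evenly. A production solver would also have to account for the communication cost of moving cell data, which this sketch ignores.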
Designing a cloud computing environment involves a tradeoff between performance, cost, and the way services are provided, which the system of virtualized networks may affect. In order to alleviate this, abstractions have to be deve...
Leakage in water distribution networks causes both water wastage and the ingress of pollutants. The localization of leaks, a formidable challenge within water demand management, has spurred an examination of hydraulic simulation-based methodologies as a more economically feasible and time-efficient alternative to conventional methods. This paper introduces a framework for precisely determining the location of leaks within a water distribution network, leveraging the Grasshopper Optimization Algorithm. The approach compares simulated data with pressure field measurements. Acknowledging the intrinsic uncertainties in hydraulic model parameters (such as elevations, nodal base demand, and pipe roughness coefficients) in real-world water distribution networks, the developed method incorporates perturbation analysis for judicious parameter selection. Monte Carlo simulation is then employed to apply these parameters systematically in the simulation process. The efficacy of the method is demonstrated by applying it to benchmark water distribution networks (specifically, Poulakis and Balerma) under various leakage scenarios, achieving accuracy levels of up to 99%. Introducing uncertainty into the simulation process reduces the accuracy of the method by at most 20%. A real-world implementation localizes leakage accurately, affirming the practical applicability of the proposed method for water utilities.
This special issue is dedicated to examining the rapidly evolving fields of artificial intelligence, mathematical modeling, and optimization, with particular emphasis on their growing importance in computational science. It features the most notable papers from the "Mathematical Modeling and Problem Solving" workshop at PDPTA'24, the 30th International Conference on Parallel and Distributed Processing Techniques and Applications. The issue showcases pioneering research in areas such as natural language processing, system optimization, and high-performance computing. The nine selected studies include novel AI-driven methods for chemical compound generation, historical text recognition, and music recommendation, along with advancements in hardware optimization through reconfigurable accelerators and vector register sharing. Additionally, evolutionary and hyper-heuristic algorithms are explored for sophisticated problem-solving in engineering design, and innovative techniques are introduced for high-speed numerical methods in large-scale systems. Collectively, these contributions demonstrate the significance of AI, supercomputing, and advanced algorithms in driving the next generation of scientific discovery.
The rotating machinery of automobiles, such as gears and motors, involves complicated interactions between fluids and structures, resulting in flow phenomena such as free surfaces, moving boundaries, and thermal conduction. Smoothed Particle Hydrodynamics (SPH), due to its Lagrangian nature, is preferred for simulating such phenomena. The complexity of automobile structures requires small particle spacing, and therefore a large number of particles is needed to discretize both fluid and structure. A small time step is also required because of the intense flow splashing caused by high-speed moving boundaries. Both points lead to a large amount of computation during SPH simulation. In this paper, a parallelism framework for weakly compressible SPH (WCSPH) is proposed to accelerate SPH simulation on a high-performance computing cluster. A hybrid parallelism strategy, combining the Message Passing Interface (MPI) and Intel Threading Building Blocks (TBB), is used to reduce the total number of processes and thereby the latency of communication across the cluster network. METIS is used to decompose the computational domain, enabling dynamic domain decomposition and load balancing. The oil motion inside a gearbox is successfully simulated using the proposed framework, showing that it is applicable to complex industrial applications and can accelerate SPH simulation efficiently.
ISBN: (print) 9783031814037; 9783031814044
In high-performance computing (HPC), multi-threaded applications using OpenMP face complex challenges in identifying hidden performance issues, often due to resource conflicts, software inefficiencies, and hardware anomalies. These subtle issues can significantly degrade performance and reduce system reliability. This paper introduces an innovative approach designed to address these concealed issues in OpenMP multi-threaded applications. The proposed method integrates a Random Forest classifier with anthropomorphic diagnosis to effectively identify and diagnose performance-affecting problems. The approach has demonstrated a remarkable ability to detect 90% of performance-affecting issues that are often obscured within complex HPC environments.
Heterogeneous computing and the exploitation of integrated CPU-GPU architectures have become a clear trend since the flattening of Moore's Law. In this work, we propose a numerical and algorithmic re-design of a p-adaptive quadrature-free discontinuous Galerkin (DG) method for the shallow water equations. Our new approach separates the computations of the non-adaptive (lower-order) and adaptive (higher-order) parts of the discretization from each other. We can thereby overlap computations of the lower-order and the higher-order DG solution components. Furthermore, we investigate execution times of the main computational kernels and use automatic code generation to optimize their distribution between the CPU and GPU. Several setups, including a prototype of a tsunami simulation in a tide-driven flow scenario, are investigated, and the results show that significant performance improvements can be achieved in suitable setups.
This paper proposes a research scheme for efficient processing algorithm of engineering cost data based on cloud computing platform, aiming to improve data processing efficiency by utilizing the high-performance compu...
The ferroelectric semiconductor transistor is a newly proposed device that uses ferroelectric semiconductors as channel materials for integrated memory and computation. Currently, the main challenge in advancing ferroelectric semiconductor transistors (FeS-FETs) is finding ferroelectric channel materials that balance high performance with industrial production feasibility. In this work, we predict the performance of alpha-GeTe, a quasi-two-dimensional ferroelectric semiconductor with excellent compatibility with Si-based substrates, as a FeS-FET by ab initio quantum transport simulation. When taking negative capacitance technology and an underlap structure into account, we find that alpha-GeTe ferroelectric semiconductor transistors with a 5-nm channel length can meet the International Technology Roadmap for Semiconductors high-performance standards for industrial-grade chip logic operations and achieve a ferroelectric switch ratio of 228 at zero gate voltage. The memory window (0.9 V) of the 5-nm gate-length monolayer alpha-GeTe FeS-FET is three times larger than that (0.3 V) of the alpha-In2Se3 ferroelectric semiconductor transistor. Our work suggests that alpha-GeTe is a strong candidate for the future industrial fabrication of FeS-FETs.
ISBN: (print) 9783031800832; 9783031800849
High-performance computing is pivotal for processing large datasets and executing complex simulations, ensuring faster and more accurate results. Improving the performance of software and scientific workflows in such environments requires careful analysis of their computational behavior and energy consumption. Therefore, maximizing computational throughput in these environments, through adequate software configuration and resource allocation, is essential for improving performance. The work presented in this paper focuses on leveraging regression-based machine learning and decision trees to analyze and optimize resource allocation in high-performance computing environments based on an application's performance and energy metrics. Applied to a bioinformatics case study, these models enable informed decision-making by selecting the appropriate computing resources to enhance the performance of a phylogenomics software package. Our contribution is to better explore and understand the efficient resource management of supercomputers, namely Santos Dumont. We show that the predictions of the application's execution time using the proposed method are accurate for various numbers of computing nodes, while energy consumption predictions are less precise. The application parameters most relevant for this work are identified, and the relative importance of each application parameter to the accuracy of the prediction is analysed.
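As a rough illustration of regression-based execution-time prediction, the sketch below fits the model t = a + b / n to measured runtimes by ordinary least squares, where n is the number of nodes. The model form, the function names `fit_runtime_model` and `predict`, and the sample timings are all assumptions made for illustration; the abstract does not specify the features or models actually used.

```python
def fit_runtime_model(nodes, times):
    """Least-squares fit of t = a + b / n, where a stands for a
    non-scaling (serial) part and b for perfectly parallel work.
    This model form is an assumption, not taken from the abstract."""
    xs = [1.0 / n for n in nodes]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_t = sum(times) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times))
    b = sxt / sxx
    a = mean_t - b * mean_x
    return a, b

def predict(a, b, n):
    """Predicted execution time on n nodes."""
    return a + b / n

# Hypothetical timings (seconds) on 1, 2, 4, and 8 nodes.
nodes = [1, 2, 4, 8]
times = [90.0, 50.0, 30.0, 20.0]
a, b = fit_runtime_model(nodes, times)
print(predict(a, b, 16))
```

Such a fitted model could be used to pick the smallest node count that meets a runtime target. Energy consumption would need its own model, and the abstract notes that those predictions are less precise.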