ISBN: (print) 9781665472838
Computations on structured grids using standard multidimensional array layouts can incur substantial data movement costs through the memory hierarchy. This paper explores the benefits of using a framework (Bricks) to separate the complexity of data layout and optimized communication from the functional representation. To that end, we provide three novel contributions and evaluate them on several kernels taken from GENE, a phase-space fusion tokamak simulation code. We extend Bricks to support 6-dimensional arrays and kernels that operate on complex data types, and integrate Bricks with cuFFT. We demonstrate how to optimize Bricks for data reuse, spatial locality, and GPU hardware utilization, achieving up to a 2.67× speedup on a single A100 GPU. We conclude with insights on how to rearchitect memory subsystems.
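To illustrate why a blocked layout can reduce data movement compared with a standard array layout, here is a minimal sketch. It is not the Bricks API: the functions `brick_offset`, `row_major_offset`, and `span`, the 3D grid, and the sizes `n` and `b` are all assumptions made for illustration (the abstract itself covers up to 6 dimensions). The sketch compares how far apart in memory the points of one small tile end up under each layout.

```python
def brick_offset(i, j, k, n, b):
    """Flat offset of point (i, j, k) when an n^3 grid is stored as b^3 blocks.

    Hypothetical helper; assumes b divides n.
    """
    nb = n // b  # number of blocks along each dimension
    block_id = ((i // b) * nb + (j // b)) * nb + (k // b)
    local = ((i % b) * b + (j % b)) * b + (k % b)
    return block_id * b**3 + local


def row_major_offset(i, j, k, n):
    """Flat offset of point (i, j, k) in a standard row-major n^3 array."""
    return (i * n + j) * n + k


def span(offsets):
    """Length of the smallest contiguous memory range covering all offsets."""
    return max(offsets) - min(offsets) + 1


n, b = 64, 4
# All points of the b x b x b tile at the grid origin.
tile = [(i, j, k) for i in range(b) for j in range(b) for k in range(b)]

brick_span = span([brick_offset(i, j, k, n, b) for i, j, k in tile])
array_span = span([row_major_offset(i, j, k, n) for i, j, k in tile])
print(brick_span, array_span)
```

In the blocked layout the tile occupies one contiguous run of `b**3` elements, while in the row-major layout the same points are spread over a far larger address range, which is the kind of locality difference the abstract refers to.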
Author:
Skala, Vaclav
The University of West Bohemia, Faculty of Applied Sciences, Department of Computer Science and Engineering, Center of Computer Graphics and Visualization, Univerzitni 8, Plzen, CZ 30614, Czech Republic
Acceleration of algorithms is becoming a crucial problem if larger data sets are to be processed. Evaluation of algorithms is mostly done using a computational geometry approach and evaluation of computational compl...
Deadlock resolution strategies based on siphon control are widely *** computational efficiency largely depends on siphon ***. Mixed-integer programming (MIP) can be utilized for the computation of an emptiable siphon in a Petri net (PN). Based on it, deadlock resolution strategies can be designed without requiring complete siphon enumeration, which has exponential complexity. For this reason, various MIP methods have been proposed for various subclasses of PNs. This work proposes an innovative MIP method to compute an emptiable minimal siphon (EMS) for a subclass of PNs named S^4PR. In particular, many structural characteristics of EMSs in S^4PR are formalized as constraints, which greatly reduces the solution *** results show that the proposed MIP method has higher computational ***, the proposed method allows one to determine the liveness of an ordinary S^4PR.
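For readers unfamiliar with siphons, the following minimal sketch shows the standard definition and the brute-force enumeration that MIP approaches aim to avoid. It is not the MIP method of the abstract and does not test emptiability. The net representation (`pre` and `post` dictionaries), the function names, and the three-place example net are all hypothetical. A non-empty place set S is a siphon if every transition that puts tokens into S also takes tokens from S.

```python
from itertools import combinations


def is_siphon(S, pre, post):
    """True if S is non-empty and every transition with an output place in S
    also has an input place in S."""
    S = set(S)
    return bool(S) and all(pre[t] & S for t in pre if post[t] & S)


def minimal_siphons(places, pre, post):
    """Enumerate minimal siphons by checking every subset of places.

    The number of candidate subsets is exponential in the number of places.
    """
    found = []
    for r in range(1, len(places) + 1):
        for cand in combinations(sorted(places), r):
            c = set(cand)
            # Skip supersets of siphons already found, so only minimal ones remain.
            if is_siphon(c, pre, post) and not any(m <= c for m in found):
                found.append(c)
    return found


# Hypothetical net: t1 moves a token from p1 to p2 and consumes resource r;
# t2 moves it back and releases r.
places = {"p1", "p2", "r"}
pre = {"t1": {"p1", "r"}, "t2": {"p2"}}
post = {"t1": {"p2"}, "t2": {"p1", "r"}}

siphons = minimal_siphons(places, pre, post)
print(siphons)
```

Because the loop visits every subset of places, its cost doubles with each added place, which is why methods that find a single siphon without full enumeration are attractive.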
In recent years, the impressive growth of new wireless technologies, together with the appearance of new requirements in applications and services, is progressively changing the use of networks. Due to the high mobili...
The accurate classification of Motor Imagery (MI) electroencephalography (EEG) signals is crucial for advancing Brain-Computer Interface (BCI) technologies, particularly for individuals with disabilities. In this stud...
LiDAR semantic segmentation plays a vital role in autonomous driving. Existing voxel-based methods for LiDAR semantic segmentation apply uniform partition to the 3D LiDAR point cloud to form a structured representat...
Large language models (LLMs) are a transformational capability at the frontier of artificial intelligence and machine learning that can support decision-makers in addressing pressing societal challenges such as extrem...
This paper studies the problem of solving systems of nonlinear equations. We propose the Gram-reduced Levenberg-Marquardt method, which reuses the Gram matrix. Our method has a global convergence guarantee without ...
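The idea of reusing a Gram matrix can be illustrated with a generic lazy-update sketch. This is not the algorithm from the abstract, whose details are cut off here. It is an assumption-laden illustration in which the Jacobian J and its Gram matrix JᵀJ are recomputed only every `refresh` iterations, while the residual is evaluated at every step. The function name `gram_reuse_lm`, the damping rule (damping equal to `c` times the residual norm), and the test system are all hypothetical.

```python
import numpy as np


def gram_reuse_lm(F, J, x0, refresh=3, iters=60, c=1.0):
    """Levenberg-Marquardt-style iteration that refreshes J and G = J^T J
    only every `refresh` steps (illustrative sketch, no convergence guarantee)."""
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        if k % refresh == 0:
            Jk = J(x)        # fresh Jacobian
            G = Jk.T @ Jk    # Gram matrix, reused until the next refresh
        r = F(x)             # residual is always evaluated at the current point
        lam = c * np.linalg.norm(r)  # assumed damping rule
        d = np.linalg.solve(G + lam * np.eye(x.size), -Jk.T @ r)
        x = x + d
    return x


# Hypothetical test system: a circle of radius 2 intersected with the line x0 = x1.
def F(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]])


def J(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])


root = gram_reuse_lm(F, J, x0=[1.0, 2.0])
print(root, F(root))
```

Skipping the Jacobian and Gram-matrix recomputation on most iterations trades some per-step accuracy for cheaper steps; whether that pays off depends on how expensive J is relative to F.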
ISBN: (digital) 9798350388404
ISBN: (print) 9798350388411
Detecting abnormal velocity targets is critical, especially in public surveillance for crowd activity monitoring. Although this issue has been explored before, traditional methods still perform poorly in complex scenes. In this paper, a novel visual neural network is investigated to perceive abnormal velocity targets in moving crowds, based on the latest neurophysiological findings revealed in locusts' vision systems. The proposed neural network contains two neural counterparts, i.e., the presynaptic and the postsynaptic networks. The former receives visual signals and processes them to capture the motion cues of different targets, and the latter filters the salience energies to perceive abnormal velocity targets in the field of view. Numerical experiments show that the proposed neural network can effectively detect abnormal velocity targets in moving crowds. This study is an important step towards dynamic visual information processing in crowd behavior analysis.
The world is moving towards clean and renewable energy sources, such as wind energy, in an attempt to reduce greenhouse gas emissions that contribute to global warming. To enhance the analysis and storage of wind data...