Search results - Inner Mongolia University Library

An accelerated explicit method with GPU parallel computing for thermal stress and welding deformation of large structure models

INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2016, Vol. 87, No. 5-8, pp. 2195-2211

Author: Ma, Ninshu Osaka Univ 11-1 Mihogaoka Osaka 5670047 Japan JSOL Corp Nishi Ku 2-2-4 Tosabori Osaka 5500001 Japan

To simulate welding-induced transient thermal stress and deformation of large-scale FE models, an accelerated explicit method (ACEXP) and a graphics processing unit (GPU) parallel computing program of the finite element method (FEM) were developed. In the accelerated explicit method, a two-stage computation scheme is employed. The first computation stage is based on a dynamic explicit method considering the characteristics of the welding mechanical process by controlling both the temperature increment and time scaling parameter. In the second computation stage, a static equilibrium computation scheme is implemented after dynamic thermal loading to obtain a static solution of transient thermal stress and welding deformation. It has been demonstrated that the developed GPU parallel computing program has good scalability for large-scale models of more than 20 million degrees of freedom. The validity of the accelerated explicit method is verified by comparing the transient thermal stress and deformation with those computed by an implicit FEM. Finally, welding deformation and residual stress in a structure model assembled from nine high-strength steel plates and 26 weld lines were efficiently analyzed by ACEXP and GPU parallel computing within 45 h. The computed welding deformation agreed well with measured results, and good accuracy was obtained.

Keywords: Accelerated explicit method; GPU parallel computing; Thermal stress; Welding deformation; Large scale models; Welded structure
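
The two-stage idea in the abstract above (a dynamic explicit stage under loading, then iteration to a static equilibrium) can be illustrated on a toy problem. The sketch below is not the ACEXP implementation. It assumes a 1D chain of identical springs fixed at its left end, a ramped end force standing in for the thermal load, and damped explicit iteration (dynamic relaxation) for the equilibrium stage. All names and parameter values (two_stage, damping, load_steps and so on) are hypothetical.

```python
def residual(u, k, f_ext):
    """Out-of-balance force r = f_ext - f_int for a chain fixed at its left end."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0      # fixed support on the left
        f_int = k * (u[i] - left)
        if i + 1 < n:
            f_int -= k * (u[i + 1] - u[i])
        r.append(f_ext[i] - f_int)
    return r


def step(u, v, r, mass, damping, dt):
    """One damped explicit update (semi-implicit Euler), in place."""
    for i in range(len(u)):
        v[i] = (1.0 - damping * dt) * v[i] + dt * r[i] / mass
        u[i] += dt * v[i]


def two_stage(n=5, k=1.0, load=1.0, mass=1.0, damping=0.5, dt=0.5,
              load_steps=50, tol=1e-10, max_iters=10000):
    """Assumed toy setup: ramp the load explicitly, then relax to equilibrium."""
    u, v = [0.0] * n, [0.0] * n
    # Stage 1: dynamic explicit steps while the end load is ramped up.
    for s in range(1, load_steps + 1):
        f_ext = [0.0] * (n - 1) + [load * s / load_steps]
        step(u, v, residual(u, k, f_ext), mass, damping, dt)
    # Stage 2: hold the load and iterate until forces balance and motion stops.
    f_ext = [0.0] * (n - 1) + [load]
    iters = 0
    for iters in range(1, max_iters + 1):
        r = residual(u, k, f_ext)
        if max(abs(x) for x in r) < tol and max(abs(x) for x in v) < tol:
            break
        step(u, v, r, mass, damping, dt)
    return u, iters
```

For this chain the static solution is u_i = (i + 1) * load / k, so the result of two_stage() can be compared with the closed form.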

Accelerating population balance-Monte Carlo simulation for coagulation dynamics from the Markov jump model, stochastic algorithm and GPU parallel computing

JOURNAL OF COMPUTATIONAL PHYSICS, 2015, Vol. 281, pp. 844-863

Authors: Xu, Zuwei Zhao, Haibo Zheng, Chuguang Huazhong Univ Sci & Technol State Key Lab Coal Combust Wuhan 430074 Peoples R China

This paper proposes a comprehensive framework for accelerating population balance-Monte Carlo (PBMC) simulation of particle coagulation dynamics. By combining a Markov jump model, a weighted majorant kernel, and GPU (graphics processing unit) parallel computing, a significant gain in computational efficiency is achieved. The Markov jump model constructs a coagulation-rule matrix of differentially-weighted simulation particles, so as to capture the time evolution of the particle size distribution with low statistical noise over the full size range and to reduce the number of time loops as far as possible. Here three coagulation rules are highlighted, and it is found that constructing an appropriate coagulation rule provides a route to a compromise between the accuracy and cost of PBMC methods. Further, in order to avoid double looping over all simulation particles when considering two-particle events (typically, particle coagulation), the weighted majorant kernel is introduced to estimate the maximum coagulation rates used for acceptance-rejection processes by single-looping over all particles, and meanwhile the mean time-step of a coagulation event is estimated by summing the coagulation kernels of rejected and accepted particle pairs. The computational load of these fast differentially-weighted PBMC simulations (based on the Markov jump model) is reduced greatly, to be proportional to the number of simulation particles in a zero-dimensional system (single cell). Finally, for a spatially inhomogeneous multi-dimensional (multi-cell) simulation, the proposed fast PBMC is performed in each cell, and multiple cells are processed in parallel by multiple cores on a GPU that can implement the massively threaded data-parallel tasks to obtain a remarkable speedup ratio (compared with CPU computation, the speedup ratio of GPU parallel computing is as high as 200 in a case of 100 cells with 10000 simulation particles per cell). These accelerating approaches of PBMC are demonstrated

Keywords: Population balance; Monte Carlo; Coagulation; Markov jump; Majorant kernel; GPU parallel computing
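
The acceptance-rejection step mentioned in the abstract above can be sketched in a much simplified form. This is not the paper's differentially-weighted scheme: it assumes equally weighted particles, a hypothetical sum kernel K(vi, vj) = vi + vj, and a constant upper bound in place of the weighted majorant kernel. What it does show is how an upper bound found in a single pass over the particles avoids evaluating the kernel for every pair. All function names are illustrative.

```python
import random


def sum_kernel(vi, vj):
    """Hypothetical coagulation kernel K(vi, vj) = vi + vj (an assumption)."""
    return vi + vj


def majorant_bound(volumes):
    """Upper bound on K over all pairs, found with one pass over the particles."""
    vmax = max(volumes)      # single loop, no pair enumeration
    return 2.0 * vmax        # vi + vj <= 2 * vmax for every pair


def select_pair(volumes, rng):
    """Pick a coagulating pair by acceptance-rejection against the bound."""
    k_max = majorant_bound(volumes)
    n = len(volumes)
    tries = 0
    while True:
        tries += 1
        i, j = rng.sample(range(n), 2)   # two distinct particles, uniformly
        # Accept the candidate pair with probability K(vi, vj) / k_max.
        if rng.random() * k_max <= sum_kernel(volumes[i], volumes[j]):
            return i, j, tries


def coagulate(volumes, i, j):
    """Merge particle j into particle i; total volume is conserved."""
    merged = list(volumes)
    merged[i] += merged[j]
    del merged[j]
    return merged
```

A call such as select_pair([1.0, 2.0, 3.0, 4.0], random.Random(0)) returns the chosen indices and the number of candidate pairs tried. Passing the result to coagulate() reduces the particle count by one.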

An Accelerated Explicit Method and GPU Parallel Computing for Thermal Stress and Welding Deformation of Automotive Parts

INTERNATIONAL JOURNAL OF APPLIED MECHANICS, 2016, Vol. 8, No. 2

Authors: Ma, Ninshu Yuan, Shijian Osaka Univ Joining & Welding Res Inst 11-1 Mihogaoka Osaka 5670047 Japan JSOL Corp Engn Technol Div Nishi Ku 2-2-4 Tosabori Osaka 5500001 Japan Harbin Inst Technol Sch Mat Sci & Technol 92 West Dazhi St Harbin 15001 Peoples R China

An accelerated explicit method and a GPU parallel computing program of the finite element method (FEM) are developed for simulating transient thermal stress and welding deformation in large-scale models. In the accelerated explicit method, a two-stage computation scheme is employed. The first computation stage is based on a dynamic explicit method considering the characteristics of the welding mechanical process by controlling both the temperature increment and time scaling parameter. In the second computation stage, a static equilibrium computation scheme is implemented after thermal loading to obtain a static solution of transient thermal stress and welding deformation. It has been demonstrated that the developed GPU parallel computing program has good scalability for large-scale models of more than 20 million degrees of freedom (DOFs). The validity of the accelerated explicit method is verified by comparing the transient thermal deformation and residual stresses with those computed by the implicit FEM and experimental measurements. Finally, the thermal stress and strain in an automotive engine cradle model with more than 12 million DOFs were efficiently computed and the results are discussed.

Keywords: Accelerated explicit method; GPU parallel computing; large scale models; thermal stress; welding deformation; automotive parts

Performance Enhancement of GPU Parallel Computing Using Memory Allocation Optimization

14th International Conference on Ubiquitous Information Management and Communication (IMCOM)

Authors: Lin, Chu-Hsing Liu, Jung-Chun Yang, Po-Kai Tunghai Univ Dept Comp Sci Taichung Taiwan

ISBN: (print) 9781728154534

The Fourier transform converts a signal from its original domain to a representation in the frequency domain. Applications of the Fourier transform are far-reaching, spanning fields such as intelligent information processing, machine vision, physics, mathematics, medical science, and telecommunications; hence, its applications have become an indispensable part of our daily life. Therefore, it is essential to construct efficient and high-reliability schemes to guarantee smooth performance of the systems using Fourier transforms. This study compares the performance of fast Fourier transforms (FFTs) on a host CPU, with GPU parallel computing, and with GPU parallel computing plus memory allocation optimization. From the experimental results, GPU parallel computing is proven to be effective in enhancing the computation speed of the FFT; the speedup ratio of GPU parallel computing over the CPU can reach 48 when operating on 32678 8-byte complex input data. In addition, by optimizing GPU memory allocation, the computation speed of the FFT can be further enhanced; the speedup ratio of GPU parallel computing with memory allocation optimization over the CPU can reach 114.7 when operating on 32678 8-byte complex input data.

Keywords: intelligent information processing; parallel FFT; GPU parallel computing; memory optimization; speedup ratio

Petroleum Geoscience Big Data and GPU Parallel Computing

IEEE First International Conference on Multimedia Big Data

Authors: Han, Fei Sun, Sam Z. China Univ Petr Lab Integrat Geol & Geophys LIGG Beijing Peoples R China

Petroleum geoscience big data is defined in this paper. A CPU/GPU hybrid system is used to try to accelerate the computing speed of petroleum geoscience big data using chaotic quantum particle swarm optimization (CQPSO) inv...

ISBN: (print) 9781479986880

Keywords: Petroleum geoscience big data; GPU parallel computing

Object Detection Based on GPU Parallel Computing for RoboCup Middle Size League

IEEE International Conference on Robotics and Biomimetics (ROBIO)

Authors: Luo, Sha Yao, Weijia Yu, Qinghua Xiao, Junhao Lu, Huimin Zhou, Zongtan Natl Univ Def Technol Coll Mechatron Engn & Automat Changsha Peoples R China

ISBN: (print) 9781538637425

The RoboCup Middle Size League (MSL) robot soccer competition is a standard test platform for distributed multi-robot systems. There are many challenges in the vision system for MSL soccer robots. For example, the huge amount of data from the Kinect v2 sensor leads to a heavy computation burden for the robot's onboard industrial computer, the obstacle-detection algorithm is mainly dependent on the obstacles' colors, and the omnidirectional vision system is not able to detect the ball above the camera or get the objects' height information. In this paper, we propose an algorithm for object detection based on GPU parallel computing, employing Kinect v2 and Jetson TX1 as the hardware platform. Parallel computing is utilized throughout all the steps of the object detection algorithm, so the speed and accuracy of the algorithm are greatly improved. We test the real-time performance and the accuracy of the algorithm using our NuBot soccer robots. The experimental results show that objects can be detected and their 3-D information can be obtained accurately, satisfying the real-time requirements of the MSL competition and decreasing the CPU burden on the robot's onboard computer. In addition, the proposed algorithm for obstacle detection is not dependent on a specific color.

Keywords: object detection; GPU parallel computing; RoboCup MSL; Jetson TX1; Kinect v2

An Efficient BP Algorithm Based on TSU-ICSI Combined with GPU Parallel Computing

REMOTE SENSING, 2023, Vol. 15, No. 23, p. 5529

Authors: Li, Ziya Qiu, Xiaolan Yang, Jun Meng, Dadi Huang, Lijia Song, Shujie Chinese Acad Sci Aerosp Informat Res Inst Beijing 100094 Peoples R China Chinese Acad Sci Key Lab Geospatial Informat Proc & Applicat Syst Beijing 100190 Peoples R China Univ Chinese Acad Sci Sch Elect Elect & Commun Engn Beijing 100049 Peoples R China Suzhou Key Lab Microwave Imaging Proc & Applicat T Suzhou 215124 Peoples R China Suzhou Aerosp Informat Res Inst Suzhou 215124 Peoples R China

High resolution remains a primary goal in the advancement of synthetic aperture radar (SAR) technology. The backprojection (BP) algorithm, which does not introduce any approximation throughout the imaging process, is broadly applicable and effectively meets the demands for high-resolution imaging. Nonetheless, the BP algorithm necessitates substantial interpolation during point-by-point processing, and the precision and effectiveness of current interpolation methods limit the imaging performance of the BP algorithm. This paper proposes a TSU-ICSI (Time-shift Upsampling-Improved Cubic Spline Interpolation) interpolation method that integrates time-shift upsampling with improved cubic spline interpolation. This method is applied to the BP algorithm, and an efficient implementation in conjunction with the GPU architecture is presented. TSU-ICSI not only maintains the accuracy of BP imaging processing but also significantly boosts performance. The effectiveness of the BP algorithm based on TSU-ICSI is confirmed through simulation experiments and by processing measured data collected from both airborne SAR and spaceborne SAR.

Keywords: synthetic aperture radar (SAR); backprojection algorithm (BPA); GPU parallel computing; improved cubic spline interpolation; upsampling

Fast Simulation of Large-Scale Floods Based on GPU Parallel Computing

WATER, 2018, Vol. 10, No. 5, p. 589

Authors: Liu, Qiang Qin, Yi Li, Guodong Xian Univ Technol Minist Educ Key Lab North West Water Resources & Ecol Environ Xian 710048 Shaanxi Peoples R China

Computing speed is a significant issue for large-scale flood simulations used in real-time response for disaster prevention and mitigation. Even today, most large-scale flood simulations are run on supercomputers due to the massive amounts of data and computation required. In this work, a two-dimensional shallow water model based on an unstructured Godunov-type finite volume scheme was proposed for flood simulation. To realize a fast simulation of large-scale floods on a personal computer, a Graphics Processing Unit (GPU)-based, high-performance computing method using the OpenACC application was adopted to parallelize the shallow water model. An unstructured data management method was presented to control the data transportation between the GPU and CPU (Central Processing Unit) with minimum overhead, and then both computation and data were offloaded from the CPU to the GPU, which exploited the computational capability of the GPU as much as possible. The parallel model was validated using various benchmarks and real-world case studies. The results demonstrate that speed-ups of up to one order of magnitude can be achieved in comparison with the serial model. The proposed parallel model provides a fast and reliable tool with which to quickly assess flood hazards in large-scale areas and, thus, has a bright application prospect for dynamic inundation risk identification and disaster assessment.

Keywords: flood modeling; shallow water equations; finite volume scheme; GPU parallel computing; numerical simulation
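
As a rough illustration of the kind of finite volume update that the abstract above parallelizes, the sketch below advances the shallow water equations by one time step. It is not the paper's model: it assumes one dimension instead of an unstructured two-dimensional mesh, a flat frictionless bed, periodic boundaries, and a Rusanov (local Lax-Friedrichs) flux as a simple stand-in for the Godunov-type solver. Function names and the CFL value are illustrative. Every face flux and every cell update is independent of the others, which is what makes this loop structure suitable for GPU offloading.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def flux(h, hu):
    """Physical flux of the 1D shallow water equations for state (h, hu)."""
    u = hu / h
    return hu, hu * u + 0.5 * G * h * h


def rusanov(left, right):
    """Rusanov numerical flux across the face between two cell states."""
    hl, hul = left
    hr, hur = right
    fl, fr = flux(hl, hul), flux(hr, hur)
    # Largest wave speed |u| + sqrt(g h) on either side of the face.
    s = max(abs(hul / hl) + math.sqrt(G * hl),
            abs(hur / hr) + math.sqrt(G * hr))
    return (0.5 * (fl[0] + fr[0]) - 0.5 * s * (hr - hl),
            0.5 * (fl[1] + fr[1]) - 0.5 * s * (hur - hul))


def advance(cells, dx, cfl=0.4):
    """Advance all cells by one CFL-limited step; returns (new_cells, dt)."""
    n = len(cells)
    smax = max(abs(hu / h) + math.sqrt(G * h) for h, hu in cells)
    dt = cfl * dx / smax
    # Face i sits between cell i-1 and cell i (periodic wrap-around).
    faces = [rusanov(cells[i - 1], cells[i]) for i in range(n)]
    new = []
    for i in range(n):
        f_in, f_out = faces[i], faces[(i + 1) % n]
        h, hu = cells[i]
        new.append((h - dt / dx * (f_out[0] - f_in[0]),
                    hu - dt / dx * (f_out[1] - f_in[1])))
    return new, dt
```

Starting from a dam-break state such as ten cells of depth 2.0 next to ten cells of depth 1.0, repeated calls to advance() keep the total water volume constant up to rounding, because each face flux leaves one cell and enters its neighbor.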

A Rayleigh Wave Globally Optimal Full Waveform Inversion Framework Based on GPU Parallel Computing

Journal of Geoscience and Environment Protection, 2023, Vol. 11, No. 3, pp. 327-338

Authors: Zhao Le Wei Zhang Xin Rong Yiming Wang Wentao Jin Zhengxuan Cao School of Geophysics and Geomatics China University of Geosciences Wuhan China Wuhan Geo-Detection Technology Co. Ltd. Wuhan China Zhejiang Design Institute of Water Conservancy & Hydro-Electric Power Co. Ltd. Hangzhou China

Conventional gradient-based full waveform inversion (FWI) is a local optimization, which is highly dependent on the initial model and prone to becoming trapped in local minima. Globally optimal FWI that can overcome this limitation is particularly attractive, but is currently limited by the huge amount of calculation. In this paper, we propose a globally optimal FWI framework based on GPU parallel computing, which greatly improves the efficiency and is expected to make globally optimal FWI more widely used. In this framework, we simplify and recombine the model parameters, and optimize the model iteratively. Each iteration contains hundreds of individuals; each individual is independent of the others and involves forward modeling and a cost function calculation. The framework is suitable for a variety of globally optimal algorithms, and we test it using the particle swarm optimization algorithm as an example. Both the synthetic and field examples achieve good results, indicating the effectiveness of the framework.

Keywords: Full Waveform Inversion; Finite-Difference Method; Globally Optimal Framework; GPU parallel computing; Particle Swarm Optimization
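
The structure described in the abstract above, in which each iteration evaluates many mutually independent individuals, can be sketched with a plain particle swarm optimizer. This is a minimal illustration and not the paper's framework. The misfit function is a hypothetical stand-in for forward modeling plus the cost function calculation, and the swarm parameters (inertia, c_personal, c_global, search bounds) are assumed textbook-style values. The list comprehensions that call cost() mark the step a GPU version would distribute across threads.

```python
import random


def misfit(model):
    """Stand-in cost: squared distance to a hypothetical true model (1.0, -2.0)."""
    target = (1.0, -2.0)
    return sum((m - t) ** 2 for m, t in zip(model, target))


def pso(cost, dim, n_particles=30, n_iters=100, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5, lo=-5.0, hi=5.0):
    """Plain particle swarm optimization; returns (best_model, best_cost)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]       # independent evaluations
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]
    for _ in range(n_iters):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (inertia * vel[k][d]
                             + c_personal * r1 * (best_pos[k][d] - pos[k][d])
                             + c_global * r2 * (g_pos[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
        costs = [cost(p) for p in pos]       # independent evaluations
        for k, c in enumerate(costs):
            if c < best_cost[k]:
                best_cost[k], best_pos[k] = c, pos[k][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[k][:]
    return g_pos, g_cost
```

Calling pso(misfit, 2) searches a two-parameter model space. Because the best cost is only ever replaced by a lower value, the result is never worse than the best member of the initial swarm.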

Application of optical super-resolution imaging based on GPU parallel computing in AI motion training system

Optical and Quantum Electronics, 2024, Vol. 56, No. 4, p. 682

Authors: Yu, Haiyang Wang, Li School of Physical Education Fuyang Normal University Fuyang 236037 China School of Economics Fuyang Normal University Fuyang 236037 China

In AI sports training systems, traditional optical imaging technology limits the resolution of the image. Therefore, the use of optical super-resolution imaging technology to improve image resolution can promote the further development of AI motion training systems. In this study, high-resolution images and corresponding low-resolution images are collected as training data, and an optical super-resolution imaging model based on deep learning is established. GPU parallel computing technology is used to accelerate the training and inference process of the model. Finally, the optimized high-resolution image is applied to the AI sports training system, and an experimental evaluation is carried out. The experimental results show that optical super-resolution imaging technology based on GPU parallel computing can significantly improve the resolution and clarity of the image. Compared with traditional optical imaging technology, the image processed by optical super-resolution imaging performs better in detail and at edges. Through the test of the motion capture system, it is observed that the images processed by optical super-resolution imaging can detect and analyze the motion details more accurately. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: AI; GPU parallel computing; Optical superresolution imaging; Sports training
