As engineering technology develops, engineering practice places higher demands on the accuracy and scale of simulation calculations, and the computational efficiency of traditional serial programs can no longer meet them. Reducing the run time of the temperature control simulation program therefore matters in practice: it enables real-time simulation of the temperature field and stress field, which in turn supports more reasonable temperature control and crack prevention measures. To address this, GPU parallel computing is introduced into the temperature control simulation program for massive concrete, and the program is then optimized. An improved indicator formula for analyzing GPU parallel algorithms is proposed that accounts for GPU clock rate, number of cores, parallel overhead, and the parallel region, making up for the shortcoming of traditional formulas that consider only time. According to this formula, when there are enough threads, the parallel effect is limited by the size of the parallel region, and when the parallel region is large enough, efficiency is limited by the parallel overhead and the clock rate. The optimal kernel execution configuration is studied. Shared memory is used to improve memory access efficiency by 155%. After bank conflicts were resolved, a speedup of 437.5x was achieved in the solver's matrix transpose subroutine. Data access and logical operations are run asynchronously in parallel on the GPU using CUDA streams, which overlaps part of the data access time. On top of GPU parallelism, asynchronous parallelism can double the computing efficiency. Compared with the serial program, the GPU asynchronous parallel program achieves a speedup of 61.42x for inner product matrix multiplication. This study further proposed a theoretical formula of data access overlap rate to guide the selection of the number of CUDA streams to achieve th
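The abstract does not say how the bank conflicts in the matrix transpose were resolved. A common approach, shown here only as an illustrative sketch, is to pad each row of the shared-memory tile by one word. The short Python model below counts how many threads of a warp land in the same bank when a warp reads one tile column. The 32 banks, the 4-byte word size, and the 32x32 tile are assumptions typical of NVIDIA GPUs, not values taken from the paper, and `max_conflict_degree` is a hypothetical helper name.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 shared-memory banks, one 4-byte word wide
TILE = 32       # assumption: 32x32 float tile, one warp reads one column


def max_conflict_degree(row_stride):
    """Worst-case number of threads in a warp that hit the same bank
    when the 32 threads read one column of a tile stored row-major
    with the given row stride (in words)."""
    worst = 0
    for col in range(TILE):
        # Thread t reads tile[t][col], stored at word index t*row_stride + col.
        banks = Counter((t * row_stride + col) % NUM_BANKS for t in range(TILE))
        worst = max(worst, max(banks.values()))
    return worst


unpadded = max_conflict_degree(TILE)    # models tile[32][32]
padded = max_conflict_degree(TILE + 1)  # models tile[32][33], one padding word per row
print(unpadded, padded)
```

In this model the unpadded tile sends all 32 threads of a column read to a single bank, while the padded tile spreads them over 32 distinct banks, which is why one extra word per row is a popular remedy in tiled transpose kernels.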
A geometrical description of square polycapillary x-ray optics and the basic theory of the transmission of x-rays are presented. A method of numerical calculation is developed based on ray-tracing theory. The method simulates the intensity distribution of x-rays propagating through slice square polycapillary x-ray optics. The simulation results are compared with the experimental results.
The geometrical description of capillary systems adjusted for the controlled guiding of X-rays and the basic theory of the transmission of X-rays are presented. A method of numerical calculation, based on ray-tracing theory, is developed to simulate the transmission efficiency of an X-ray parallel lens and the shape and size of the light spot obtained from it. The simulation results for two half lenses are in good agreement with the experimental results. (C) 2015 Elsevier B.V. All rights reserved.
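Neither ray-tracing abstract above gives its algorithm, so the following is only a toy sketch of the underlying idea: X-rays are guided through a capillary by total external reflection, which occurs only at grazing angles below a critical angle. The model traces one ray through a single straight 2D channel. The function name `trace_straight_channel`, the constant per-bounce reflectivity, and all numerical values are illustrative assumptions, not parameters from either paper.

```python
import math


def trace_straight_channel(y0, theta, width, length, theta_c, reflectivity):
    """Return the transmitted weight of one ray in a straight 2D channel.

    y0: entry height, 0 <= y0 < width
    theta: angle between ray and channel axis in radians, theta >= 0
    theta_c: assumed critical angle for total external reflection
    reflectivity: assumed constant reflectivity per wall reflection
    """
    # Unfold the reflections: the ray travels in a straight line and crosses
    # one wall each time its unfolded height passes a multiple of the width.
    hits = math.floor((y0 + length * math.tan(theta)) / width)
    if hits == 0:
        return 1.0  # the ray passes straight through without touching a wall
    if theta > theta_c:
        return 0.0  # grazing angle too large: treated as absorbed at the wall
    return reflectivity ** hits


# Illustrative numbers in micrometres: a 10 um wide, 10 mm long channel.
guided = trace_straight_channel(5.0, 2e-3, 10.0, 10000.0, 3e-3, 0.95)
lost = trace_straight_channel(5.0, 5e-3, 10.0, 10000.0, 3e-3, 0.95)
direct = trace_straight_channel(5.0, 0.0, 10.0, 10000.0, 3e-3, 0.95)
print(guided, lost, direct)
```

A full simulation of a polycapillary lens would repeat this over many rays and many curved channels and accumulate the weights into an intensity distribution; this sketch covers only the per-ray reflection bookkeeping.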