A multi-GPU implementation of the multilevel fast multipole algorithm (MLFMA) based on the hybrid OpenMP-CUDA parallel programming model (OpenMP-CUDA-MLFMA) is presented for computing electromagnetic scattering from a three-dimensional conducting object. The proposed hierarchical parallelization strategy ensures a high computational throughput for the GPU calculation. The resulting OpenMP-based multi-GPU implementation is capable of solving real-life problems with over one million unknowns with a remarkable speed-up. The radar cross sections of a few benchmark objects are calculated to demonstrate the accuracy of the solution. The results are compared with those from the CPU-based MLFMA and with measurements. The capability and efficiency of the presented method are analyzed through the examples of a sphere, an aircraft, and a missile-like object. Compared with the 8-threaded CPU-based MLFMA, the OpenMP-CUDA-MLFMA method achieves total speed-up ratios of 5 to 20.
Lagrange interpolation of the translation operator in the three-dimensional multilevel fast multipole algorithm (MLFMA) is revisited. Parameters of the interpolation, namely, the number of interpolation points and the oversampling factor, are optimized for controllable error. Via optimization, it becomes possible to obtain the desired level of accuracy with the minimum processing time.
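To make the two interpolation parameters concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common form of the translation operator, T_L(kD, θ) = Σ_{l=0}^{L} i^l (2l+1) h_l^(1)(kD) P_l(cos θ), with the e^{-iωt} time convention. The function names, the uniform sampling in θ, and the parameter names `s` (oversampling factor) and `p` (number of interpolation points) are illustrative assumptions.

```python
import cmath
import math


def spherical_hankel1(order, x):
    """Spherical Hankel functions h_l^(1)(x), l = 0..order, by upward recurrence."""
    h = [cmath.exp(1j * x) / (1j * x)]
    if order >= 1:
        h.append(-cmath.exp(1j * x) * (x + 1j) / (x * x))
    for l in range(1, order):
        h.append((2 * l + 1) / x * h[l] - h[l - 1])
    return h


def legendre(order, u):
    """Legendre polynomials P_l(u), l = 0..order, by the three-term recurrence."""
    p = [1.0]
    if order >= 1:
        p.append(u)
    for l in range(1, order):
        p.append(((2 * l + 1) * u * p[l] - l * p[l - 1]) / (l + 1))
    return p


def translator(order, kd, theta):
    """Direct evaluation of the truncated translation operator (assumed convention)."""
    h = spherical_hankel1(order, kd)
    p = legendre(order, math.cos(theta))
    return sum((1j ** l) * (2 * l + 1) * h[l] * p[l] for l in range(order + 1))


def build_table(order, kd, s, p):
    """Tabulate the translator at s*(order+1) uniform intervals on [0, pi].

    The table is padded by p samples on each side so that every stencil fits.
    """
    n = s * (order + 1)
    delta = math.pi / n
    table = [translator(order, kd, j * delta) for j in range(-p, n + p + 1)]
    return table, delta


def interpolate(theta, table, delta, p):
    """Local Lagrange interpolation with a p-point stencil around theta."""
    first = math.floor(theta / delta) - p // 2 + 1
    nodes = range(first, first + p)
    value = 0.0
    for j in nodes:
        weight = 1.0
        for m in nodes:
            if m != j:
                weight *= (theta - m * delta) / ((j - m) * delta)
        value += weight * table[j + p]  # index shift accounts for the padding
    return value


def max_relative_error(order, kd, s, p, trials=200):
    """Worst interpolation error on [0, pi], relative to the largest table entry."""
    table, delta = build_table(order, kd, s, p)
    scale = max(abs(t) for t in table)
    worst = 0.0
    for i in range(trials):
        theta = math.pi * (i + 0.5) / trials
        err = abs(interpolate(theta, table, delta, p) - translator(order, kd, theta))
        worst = max(worst, err)
    return worst / scale
```

Calling `max_relative_error(10, 15.0, s, p)` for several `(s, p)` pairs shows the trade-off the abstract refers to: a larger `s` costs more direct evaluations when the table is built, while a larger `p` costs more work per interpolated value.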
We consider the efficient solution of electromagnetics problems involving dielectric and composite dielectric-metallic structures, formulated with the electric and magnetic current combined-field integral equation (JMCFIE). Dense matrix equations obtained from the discretization of JMCFIE with Rao-Wilton-Glisson functions are solved iteratively, where the matrix-vector multiplications are performed efficiently with the multilevel fast multipole algorithm. JMCFIE usually provides well conditioned matrix equations that are easy to solve iteratively. However, iteration counts and the efficiency of solutions depend on the contrast, i.e., the relative variation of electromagnetic parameters across dielectric interfaces. Owing to the numerical imbalance of off-diagonal matrix partitions, solutions of JMCFIE become difficult with increasing contrast. We present a four-partition block-diagonal preconditioner (4PBDP), which provides efficient solutions of JMCFIE by reducing the number of iterations significantly. 4PBDP is useful, especially when the contrast increases, and the standard block-diagonal preconditioner fails to provide a rapid convergence.
In the multilevel fast multipole algorithm (MLFMA), the direct evaluation of the Rokhlin translator is computationally expensive, and the cost is usually lowered by using local Lagrange interpolation in the evaluation, which requires oversampling of the translator. In this paper we improve the interpolation procedure by introducing a new, accurate, and fast oversampling technique based on the fast Fourier transform (FFT). In addition to speeding up the oversampling, this also allows the use of fewer points in the interpolation stencils, improving the efficiency of the evaluation of the Rokhlin translator. We have optimized the interpolation parameters, i.e., the number of stencil points and the oversampling factor, by using the accuracy of the translated (incoming) field as the error criterion rather than the commonly used interpolation error. This choice leads to better optimized parameter pairs, which further lower the interpolation cost. We have computed and tabulated the optimized pairs for a wide range of target accuracies and MLFMA division levels. These tables can be used for good error control and maximal speed-up in practical computations.
A parallel scheme that combines OpenMP and vector arithmetic logic unit (VALU) hardware acceleration is presented to speed up the multilevel fast multipole algorithm (MLFMA) on shared-memory computers. The performance of the hybrid parallel OpenMP-VALU MLFMA is investigated, and several strategies are employed to improve the overall speedup and parallel efficiency. The effectiveness of the hybrid parallel scheme is verified by numerical results for electromagnetic (EM) scattering examples, and the related numerical stability issue is discussed as well.
We consider fast and efficient optimizations of arrays involving three-dimensional antennas with arbitrary shapes and geometries. Heuristic algorithms, particularly genetic algorithms, are used for optimizations, while the required solutions are carried out accurately and efficiently via the multilevel fast multipole algorithm (MLFMA). The superposition principle is employed to reduce the number of MLFMA solutions to the number of array elements per frequency. The developed mechanism is used to optimize arrays for multifrequency and/or multidirection operations, i.e., to find the most suitable set of antenna excitations for desired radiation characteristics simultaneously at different frequencies and/or directions. The capabilities of the optimization environment are demonstrated on arrays of bowtie and Vivaldi antennas.
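The role of superposition can be illustrated with a short sketch, which is an assumption-laden illustration rather than the authors' code. It presumes that the far field of each element driven alone with unit excitation (in the presence of the full array) has already been computed once, for example by MLFMA, and stored in `element_fields`. The pattern for any trial excitation vector is then a linear combination, so an optimizer can evaluate many candidates without further full-wave solutions. The function names, the fitness definition, and the numbers in the usage note are hypothetical placeholders.

```python
def array_pattern(element_fields, excitations):
    """Total far field per direction as a weighted sum of per-element fields.

    element_fields[n][d] is the (precomputed) complex far field in sampled
    direction d when only element n is driven with unit excitation.
    excitations[n] is the complex excitation of element n.
    """
    n_dirs = len(element_fields[0])
    return [
        sum(a * fields[d] for a, fields in zip(excitations, element_fields))
        for d in range(n_dirs)
    ]


def fitness(element_fields, excitations, target_dir):
    """Hypothetical fitness: fraction of sampled power radiated toward target_dir."""
    pattern = array_pattern(element_fields, excitations)
    total = sum(abs(v) ** 2 for v in pattern)
    return abs(pattern[target_dir]) ** 2 / total if total > 0 else 0.0
```

As a usage example with made-up placeholder data, take two elements and two directions, `element_fields = [[1 + 0j, 1j], [1 + 0j, -1j]]`. In-phase excitations `[1, 1]` add in direction 0 and cancel in direction 1, while `[1, -1]` does the opposite, and both results come from the same two stored solutions.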
We present a novel hierarchical partitioning strategy for the efficient parallelization of the multilevel fast multipole algorithm (MLFMA) on distributed-memory architectures to solve large-scale problems in electromagnetics. Unlike previous parallelization techniques, the tree structure of MLFMA is distributed among processors by partitioning both clusters and samples of fields at each level. Due to the improved load-balancing, the hierarchical strategy offers a higher parallelization efficiency than previous approaches, especially when the number of processors is large. We demonstrate the improved efficiency on scattering problems discretized with millions of unknowns. In addition, we present the effectiveness of our algorithm by solving very large scattering problems involving a conducting sphere of radius 210 wavelengths and a complicated real-life target with a maximum dimension of 880 wavelengths. Both of the objects are discretized with more than 200 million unknowns.
We consider accurate full-wave solutions of plasmonic problems using the multilevel fast multipole algorithm (MLFMA). Metallic structures at optical frequencies are modeled by using the Lorentz-Drude model, formulated with surface integral equations, and analyzed iteratively via MLFMA. Among alternative choices, the electric and magnetic current combined-field integral equation (JMCFIE) and the combined tangential formulation (CTF), which are popular integral-equation formulations for penetrable objects, are discretized with the conventional Rao-Wilton-Glisson functions and used to model plasmonic structures. We discuss electromagnetic interactions in plasmonic media and show how far-field interactions may be omitted for improving the efficiency without sacrificing the accuracy of results. (c) 2016 Wiley Periodicals, Inc. Int J RF and Microwave CAE 26:335-341, 2016.
The computational error of the multilevel fast multipole algorithm is studied. The error convergence rate, achievable minimum error, and error bound are investigated for various element distributions. We will discuss the boundary between the large and small buffer cases in terms of machine precision. The needed buffer size to reach double precision accuracy will be clarified.
Authors: Zhao, JS; Chew, WC
Univ Illinois, Dept Elect & Comp Engn, Ctr Computat Electromagnet, Electromagnet Lab, Urbana, IL 61801 USA
A normalized three-dimensional (3-D) multilevel fast multipole algorithm (MLFMA) with a computational complexity of O(N) for very low-frequency (LF) problems is developed. This 3-D LF-MLFMA can be used not only independently for very low-frequency cases or very small structures compared to the wavelength, but also to solve large-scale structures with rapidly varying areas when merged with a general dynamic algorithm. From the LF-MLFMA, a more explicit and succinct representation of the static MLFMA is also derived. (C) 2000 John Wiley & Sons, Inc.