Authors: Zhao, JS; Chew, WC
Univ Illinois, Dept Elect & Comp Engn, Electromagnet Lab, Ctr Computat Electromagnet, Urbana, IL 61801 USA
The matrix rotation technique is applied to the three-dimensional (3-D) multilevel fast multipole algorithm (MLFMA) at very low frequencies (LF) to save storage without increasing the order of the floating-point operations. Some symmetry properties of the translation matrices along the z-direction for the 3-D LF-MLFMA and the static MLFMA are also derived to further reduce the memory requirement. Numerical results show that the order of the errors introduced by the matrix rotation is much smaller than the order of the errors from the plain LF-MLFMA or the plain static MLFMA. The implementation of the matrix rotation technique does not change the precision order of the LF-MLFMA and the static MLFMA for a fixed truncation of the multipole expansions. (C) 2000 John Wiley & Sons, Inc.
Lagrange interpolation of the translation operator in the three-dimensional multilevel fast multipole algorithm (MLFMA) is revisited. Parameters of the interpolation, namely, the number of interpolation points and the oversampling factor, are optimized for controllable error. Via optimization, it becomes possible to obtain the desired level of accuracy with the minimum processing time.
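The trade-off between the number of interpolation points and the oversampling factor can be illustrated with a minimal one-dimensional sketch. It is not the method of the paper: the test function cos(L t), the helper names `lagrange_weights`, `interpolate`, and `max_error`, and the choice of 2L+1 base samples are all assumptions made for illustration. The sketch interpolates a 2π-periodic band-limited function from uniform samples using the p nearest points (p even).

```python
import math

def lagrange_weights(nodes, x):
    """Lagrange basis weights at x for a list of distinct nodes."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (x - xj) / (xi - xj)
        weights.append(w)
    return weights

def interpolate(samples, p, theta):
    """Local p-point Lagrange interpolation of uniform periodic samples at theta."""
    n = len(samples)
    h = 2.0 * math.pi / n
    # Leftmost stencil index, chosen so theta lies in the central interval.
    start = math.floor(theta / h) - (p // 2 - 1)
    idx = [start + k for k in range(p)]
    nodes = [i * h for i in idx]  # unwrapped node angles
    w = lagrange_weights(nodes, theta)
    # Periodic wrap-around when reading the stored samples.
    return sum(wk * samples[i % n] for wk, i in zip(w, idx))

def max_error(L, s, p, n_test=200):
    """Maximum error for cos(L t) sampled at s * (2L + 1) points (s = oversampling)."""
    n = s * (2 * L + 1)
    f = lambda t: math.cos(L * t)
    samples = [f(2.0 * math.pi * i / n) for i in range(n)]
    pts = [0.01 + 2.0 * math.pi * k / n_test for k in range(n_test)]
    return max(abs(interpolate(samples, p, t) - f(t)) for t in pts)
```

Calling `max_error(10, 1, 2)` and `max_error(10, 4, 6)` shows how a larger stencil combined with oversampling reduces the error; scanning over (s, p) pairs for a target error is one simple way to look for the cheapest acceptable pair.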
We consider the efficient solution of electromagnetics problems involving dielectric and composite dielectric-metallic structures, formulated with the electric and magnetic current combined-field integral equation (JMCFIE). Dense matrix equations obtained from the discretization of JMCFIE with Rao-Wilton-Glisson functions are solved iteratively, where the matrix-vector multiplications are performed efficiently with the multilevel fast multipole algorithm. JMCFIE usually provides well conditioned matrix equations that are easy to solve iteratively. However, iteration counts and the efficiency of solutions depend on the contrast, i.e., the relative variation of electromagnetic parameters across dielectric interfaces. Owing to the numerical imbalance of off-diagonal matrix partitions, solutions of JMCFIE become difficult with increasing contrast. We present a four-partition block-diagonal preconditioner (4PBDP), which provides efficient solutions of JMCFIE by reducing the number of iterations significantly. 4PBDP is useful, especially when the contrast increases, and the standard block-diagonal preconditioner fails to provide a rapid convergence.
In the multilevel fast multipole algorithm (MLFMA), the direct evaluation of the Rokhlin translator is computationally expensive, and the cost is usually lowered by using local Lagrange interpolation in the evaluation, which requires oversampling of the translator. In this paper we improve the interpolation procedure by introducing a new, accurate, and fast oversampling technique based on the fast Fourier transform (FFT). In addition to speeding up the oversampling, this also allows the use of fewer points in the interpolation stencils, improving the efficiency of the evaluation of the Rokhlin translator. We have optimized the interpolation parameters, i.e., the number of stencil points and the oversampling factor, by using as the error criterion the accuracy in the translated (incoming) field rather than the commonly used interpolation error. This choice leads to better optimized parameter pairs, which further lowers the interpolation cost. We have computed and tabulated the optimized pairs for a wide range of target accuracies and MLFMA division levels. These tables can be used for good error control and maximal speed-up in practical computations.
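The general idea of Fourier-domain oversampling can be sketched as follows. This is an illustrative assumption, not the procedure of the paper: a band-limited 2π-periodic function sampled at an odd number of uniform points is resampled on a finer grid by zero-padding its discrete spectrum. To stay self-contained, the sketch uses a naive O(N^2) DFT in place of an FFT; the function names `dft` and `oversample` are hypothetical.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive discrete Fourier transform (stand-in for an FFT); sign=+1 is unnormalized inverse."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(sign * 2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]

def oversample(samples, factor):
    """Resample uniform samples of a band-limited periodic function onto factor * n points."""
    n = len(samples)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of samples")
    half = n // 2
    spec = dft(samples)
    m = factor * n
    padded = [0j] * m
    # Copy non-negative frequencies to the start of the padded spectrum.
    for k in range(half + 1):
        padded[k] = spec[k]
    # Copy negative frequencies to the end of the padded spectrum.
    for k in range(1, half + 1):
        padded[m - k] = spec[n - k]
    fine = dft(padded, sign=+1)
    # Dividing by n (not m) keeps the amplitudes of the original function.
    return [v / n for v in fine]

def demo_error(factor=4):
    """Maximum resampling error for a test function with harmonics up to 5."""
    f = lambda t: 1.0 + math.cos(3.0 * t) + 0.5 * math.sin(5.0 * t)
    n = 11  # 2 * 5 + 1 samples capture all harmonics exactly
    coarse = [f(2.0 * math.pi * i / n) for i in range(n)]
    fine = oversample(coarse, factor)
    m = factor * n
    return max(abs(fine[j] - f(2.0 * math.pi * j / m)) for j in range(m))
```

For a function that is exactly band-limited, the resampled values agree with the true function up to rounding error, which is why the local interpolation stencil applied afterwards can be kept short.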
A parallel scheme that combines OpenMP and vector arithmetic logic unit (VALU) hardware acceleration is presented to speed up the multilevel fast multipole algorithm (MLFMA) on shared-memory computers. The performance of the hybrid parallel OpenMP-VALU MLFMA is investigated, and several strategies are employed to improve the overall speedup and parallel efficiency. The effectiveness of the hybrid parallel scheme is verified by numerical results for electromagnetic (EM) scattering examples, and the related numerical stability issue is discussed as well.
We consider fast and efficient optimizations of arrays involving three-dimensional antennas with arbitrary shapes and geometries. Heuristic algorithms, particularly genetic algorithms, are used for optimizations, while the required solutions are carried out accurately and efficiently via the multilevel fast multipole algorithm (MLFMA). The superposition principle is employed to reduce the number of MLFMA solutions to the number of array elements per frequency. The developed mechanism is used to optimize arrays for multifrequency and/or multidirection operations, i.e., to find the most suitable set of antenna excitations for desired radiation characteristics simultaneously at different frequencies and/or directions. The capabilities of the optimization environment are demonstrated on arrays of bowtie and Vivaldi antennas.
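The role of superposition can be sketched in a few lines. The sketch assumes that the far-field pattern of each element, excited alone with unit amplitude, has already been computed once; the pattern of the whole array for any trial excitation vector is then a weighted sum, so an optimizer can evaluate candidates without further solves. The function `element_patterns` below is a hypothetical analytic stand-in (isotropic elements on a line) for the per-element full-wave solutions, and `array_pattern` is likewise an illustrative name.

```python
import cmath
import math

def element_patterns(n_elements, spacing_wl, angles):
    """Hypothetical stand-in for one solve per element: isotropic elements on a line.

    spacing_wl is the element spacing in wavelengths; angles are measured
    from the array axis, in radians.
    """
    return [
        [cmath.exp(2j * math.pi * spacing_wl * n * math.cos(t)) for t in angles]
        for n in range(n_elements)
    ]

def array_pattern(patterns, excitations):
    """Superpose precomputed element patterns with complex excitation weights."""
    n_dirs = len(patterns[0])
    return [
        sum(a * p[d] for a, p in zip(excitations, patterns))
        for d in range(n_dirs)
    ]
```

In an optimization loop, `element_patterns` would be evaluated once per frequency and `array_pattern` once per candidate excitation vector, which is cheap compared with a new full-wave solution.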
We present a novel hierarchical partitioning strategy for the efficient parallelization of the multilevel fast multipole algorithm (MLFMA) on distributed-memory architectures to solve large-scale problems in electromagnetics. Unlike previous parallelization techniques, the tree structure of MLFMA is distributed among processors by partitioning both clusters and samples of fields at each level. Due to the improved load-balancing, the hierarchical strategy offers a higher parallelization efficiency than previous approaches, especially when the number of processors is large. We demonstrate the improved efficiency on scattering problems discretized with millions of unknowns. In addition, we present the effectiveness of our algorithm by solving very large scattering problems involving a conducting sphere of radius 210 wavelengths and a complicated real-life target with a maximum dimension of 880 wavelengths. Both of the objects are discretized with more than 200 million unknowns.
The computational error of the multilevel fast multipole algorithm is studied. The error convergence rate, achievable minimum error, and error bound are investigated for various element distributions. We will discuss the boundary between the large and small buffer cases in terms of machine precision. The needed buffer size to reach double precision accuracy will be clarified.
We consider accurate full-wave solutions of plasmonic problems using the multilevel fast multipole algorithm (MLFMA). Metallic structures at optical frequencies are modeled by using the Lorentz-Drude model, formulated with surface integral equations, and analyzed iteratively via MLFMA. Among alternative choices, the electric and magnetic current combined-field integral equation (JMCFIE) and the combined tangential formulation (CTF), which are popular integral-equation formulations for penetrable objects, are discretized with the conventional Rao-Wilton-Glisson functions and used to model plasmonic structures. We discuss electromagnetic interactions in plasmonic media and show how far-field interactions may be omitted for improving the efficiency without sacrificing the accuracy of results. (c) 2016 Wiley Periodicals, Inc. Int J RF and Microwave CAE 26:335-341, 2016.
A many-core parallel implementation of the multilevel fast multipole algorithm (MLFMA) based on the Athread parallel programming model is presented for the Chinese homegrown many-core SW26010 CPU. In the proposed many-core implementation of MLFMA, the data access efficiency is improved by using structure-of-arrays data layouts. Adaptive workload distribution strategies are adopted on different MLFMA tree levels to ensure full utilization of the computing capability and the scratchpad memory. A double buffering scheme is specially designed to overlap communication with computation. The resulting Athread-based many-core implementation of the MLFMA is capable of solving real-life problems with over one million unknowns with a remarkable speedup. The capability and efficiency of the proposed method are analyzed through examples of computing scattering by spheres and a practical aircraft. Numerical results show that, with the proposed parallel scheme, total speedup ratios from 6.4 to 8.0 can be achieved compared with the CPU master core.