Collision detection for a large number N of particles can be challenging. Directly testing N particles for collisions with each other leads to N^2 queries. Especially in scenarios where fast, densely packed particles interact, challenges arise for classical methods like Particle-in-Cell or Monte Carlo. Modern collision detection methods utilising bounding volume hierarchies are suitable to overcome these challenges and allow a detailed analysis of the interaction of a large number of particles. This approach is applied to the analysis of the collision of two photon beams leading to the creation of electron-positron pairs. (c) 2017 Elsevier Inc. All rights reserved.
The highly scalable parallel tree code PEPC for rapid computation of long-range (1/r) Coulomb forces is presented. It can be used as a library for applications involving electrostatics or Newtonian gravity in 3D. The code is based on the hashed oct-tree algorithm, in which particle coordinates are projected onto a space-filling curve prior to sorting and construction of multipole moments. However, standard particle sorting techniques can ultimately limit the scalability of such algorithms for thousands of cores, a bottleneck which can be alleviated by a recursive sort scheme specially adapted to the Morton curve. More serious limitations of the original locally essential tree concept of Salmon and Warren, which ultimately lead to a failure in memory scaling, are identified and analyzed rigorously. Benchmarks for the code on the IBM Blue Gene/P Jugene are presented which demonstrate scaling for multi-million particle systems on up to 8192 cores. (C) 2011 Elsevier B.V. All rights reserved.
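The projection of particle coordinates onto a space-filling curve mentioned above can be illustrated with a minimal sketch of Morton (Z-order) keys. This is not PEPC's implementation: the function names, the assumption that coordinates lie in the unit cube [0, 1), and the 10 bits per dimension are all chosen here for illustration only.

```python
def quantize(coord: float, bits: int = 10) -> int:
    """Map a coordinate in [0, 1) to an integer cell index with `bits` bits."""
    cells = 1 << bits
    return min(int(coord * cells), cells - 1)


def morton_key(ix: int, iy: int, iz: int, bits: int = 10) -> int:
    """Interleave the bits of three cell indices into one Morton (Z-order) key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)      # x occupies bit 0 of each triple
        key |= ((iy >> b) & 1) << (3 * b + 1)  # y occupies bit 1 of each triple
        key |= ((iz >> b) & 1) << (3 * b + 2)  # z occupies bit 2 of each triple
    return key


def sort_by_morton(points, bits: int = 10):
    """Return the points ordered along the Morton curve (hypothetical helper)."""
    def key(p):
        ix, iy, iz = (quantize(c, bits) for c in p)
        return morton_key(ix, iy, iz, bits)
    return sorted(points, key=key)
```

Sorting by such keys places particles that share high-order key bits, and therefore the same coarse octree cell, next to each other. A hashed oct-tree can then be built over contiguous key ranges.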
The Matérn family of functions is a widely used covariance kernel in spatial statistics for Gaussian process modeling, which in many instances requires calculations with a covariance matrix. In this paper, we design a fast summation algorithm for the Matérn kernel in order to efficiently perform matrix-vector multiplications. This algorithm is based on the Barnes-Hut tree code framework and addresses several practical issues: the anisotropy of the kernel, the nonuniform distribution of the point set, and a tight error estimate of the approximation. Even though the algorithmic details differ from the standard tree code in several aspects, empirically the computational cost of our algorithm scales as O(n log n) for n points. Comprehensive numerical experiments are shown to demonstrate the practicality of the design.
The mathematical formulation of gravitational lensing, the lens equation, is a very simple mapping R^2 → R^2 between the lens (or sky) plane and the source plane. This approximation assumes that all the deflecting matter is in one plane. In this case the deflection angle α is just the sum over all mass elements in the lens plane. For certain problems, like the determination of the magnification of sources over a large number of source positions (up to 10^10) for very many lenses (up to 10^6), straightforward techniques for the determination of the deflection angle are far too slow. We implemented an algorithm that includes a two-dimensional tree-code plus a multipole expansion in order to make such microlensing simulations "inexpensive". Subsequently we modified this algorithm such that it could be applied to a three-dimensional mass distribution that fills the universe (approximated by many lens planes), in order to determine the imaging properties of cosmological lens simulations. Here we describe the techniques and the numerical methods, and we mention a few astrophysical results obtained with these methods. (C) 1999 Elsevier Science B.V. All rights reserved.
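The "straightforward technique" that the abstract calls too slow can be sketched as a direct sum over point lenses. The sketch assumes point-mass lenses in normalized units, so that each lens of mass m at position x_i contributes m (x - x_i) / |x - x_i|^2 to the deflection angle, and the lens equation reads y = x - α(x). The function names and the data layout are illustrative, not taken from the paper.

```python
def deflection_angle(x, lenses):
    """Direct sum of point-lens deflections at image-plane position x.

    `lenses` is a list of (lens_x, lens_y, mass) tuples in normalized units
    (an assumption of this sketch, not the paper's data format).
    """
    ax = ay = 0.0
    for lens_x, lens_y, mass in lenses:
        dx, dy = x[0] - lens_x, x[1] - lens_y
        r2 = dx * dx + dy * dy
        ax += mass * dx / r2
        ay += mass * dy / r2
    return ax, ay


def source_position(x, lenses):
    """Lens equation y = x - alpha(x): map a lens-plane point to the source plane."""
    ax, ay = deflection_angle(x, lenses)
    return x[0] - ax, x[1] - ay
```

Each call costs one pass over all lenses, so evaluating 10^10 positions against 10^6 lenses needs on the order of 10^16 pair evaluations. That cost is what motivates replacing the inner loop with a tree code and multipole expansion.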
The computation time for a particle simulation where long-range forces, such as electrostatic forces for charged particles, have to be taken into account rises asymptotically with O(N^2). A significant reduction of computation time can be achieved by applying a tree code algorithm. The algorithm exploits the idea of grouping suitable particles into macroparticles, thus reducing the number of necessary calculations within a reasonably limited error bound. Although this algorithm was developed for astronomical galaxy simulations, it can also be used for simulations of microscopic particles with finite radii. The tree code can be used for an efficient collision routine if some additional boundary conditions are taken into account. This results in an asymptotic computation time dependence of O(N log N).
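The grouping of particles into macroparticles can be illustrated with a small sketch that replaces a distant cluster by its total mass at the centre of mass and compares the resulting 1/r potential with the exact sum. The opening-angle test `accept` is the usual Barnes-Hut-style criterion and is an assumption here, as are all names and the default `theta`. The abstract does not state which criterion the paper uses.

```python
import math


def exact_potential(target, bodies):
    """Sum of m / r over all bodies; `bodies` is a list of (position, mass)."""
    return sum(mass / math.dist(target, pos) for pos, mass in bodies)


def macroparticle(bodies):
    """Collapse a group of bodies into (centre of mass, total mass)."""
    total = sum(mass for _, mass in bodies)
    dim = len(bodies[0][0])
    center = tuple(
        sum(mass * pos[k] for pos, mass in bodies) / total for k in range(dim)
    )
    return center, total


def accept(cell_size, distance, theta=0.5):
    """Opening-angle test: use the macroparticle if the cell looks small enough."""
    return cell_size / distance < theta
```

For a cluster of size 0.1 seen from a distance of 10, `accept` returns true, and the single macroparticle evaluation replaces one evaluation per cluster member at the price of a small, controllable error.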
The tree method is a widely implemented algorithm for collisionless N-body simulations in astrophysics and is well suited for GPUs. Adopting hierarchical time stepping can accelerate N-body simulations; however, it is infrequently implemented and its potential remains untested in GPU implementations. We have developed a Gravitational Oct-tree code accelerated by Hierarchical time step Controlling named GOTHIC, which adopts both the tree method and the hierarchical time step. The code adopts some adaptive optimizations by monitoring the execution time of each function on-the-fly and minimizes the time-to-solution by balancing the measured time of multiple functions. Results of performance measurements with realistic particle distributions performed on NVIDIA Tesla M2090, K20X, and GeForce GTX TITAN X, which are representative GPUs of the Fermi, Kepler, and Maxwell generations of GPUs, show that the hierarchical time step achieves a speedup by a factor of around 3-5 compared to the shared time step. The measured elapsed time per step of GOTHIC is 0.30 s or 0.44 s on GTX TITAN X when the particle distribution represents the Andromeda galaxy or the NFW sphere, respectively, with 2^24 = 16,777,216 particles. The averaged performance of the code corresponds to 10-30% of the theoretical single precision peak performance of the GPU. (C) 2016 The Authors. Published by Elsevier B.V.
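Why a hierarchical time step can beat a shared one is easy to see in a toy count of particle updates. The sketch below assumes the common block time step scheme, in which each particle's step is rounded down to dt_max / 2^k; this is an illustrative assumption, not a description of GOTHIC's actual scheme, and the function names are hypothetical.

```python
def block_level(dt_desired: float, dt_max: float) -> int:
    """Smallest k such that dt_max / 2**k does not exceed the desired step."""
    if dt_desired <= 0.0:
        raise ValueError("dt_desired must be positive")
    level = 0
    dt = dt_max
    while dt > dt_desired:
        dt /= 2.0  # halving keeps all steps commensurate with dt_max
        level += 1
    return level


def updates_per_dt_max(desired_steps, dt_max: float):
    """Particle updates needed to advance by dt_max: (hierarchical, shared)."""
    levels = [block_level(dt, dt_max) for dt in desired_steps]
    hierarchical = sum(2 ** k for k in levels)    # each particle at its own rate
    shared = len(levels) * 2 ** max(levels)       # everyone at the smallest step
    return hierarchical, shared
```

With desired steps of 1.0, 0.5, 0.3 and 0.1 and `dt_max = 1.0`, the particles land on levels 0, 1, 2 and 4, which gives 23 updates in the hierarchical scheme against 64 with a shared step. Real speedups depend on the distribution of time steps and on per-step overheads such as tree rebuilds.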
We reduce the problem of constructing asymptotically good tree codes to the construction of triangular totally nonsingular matrices over fields with polynomially many elements. We show a connection of this problem to Birkhoff interpolation in finite fields. (C) 2015 Elsevier Inc. All rights reserved.
Motivated by a concept studied in [1], we consider a property of matrices over finite fields that generalizes triangular totally nonsingular matrices to block matrices. We show that (1) matrices with this property suffice to construct asymptotically good tree codes and (2) a random block-triangular matrix over a field of quadratic size satisfies this property. We will also show that a generalization of this randomized construction yields codes over quadratic size fields for which the sum of the rate and minimum relative distance gets arbitrarily close to 1. (C) 2021 Elsevier B.V. All rights reserved.
A novel code for the approximate computation of long-range forces between N mutually interacting bodies is presented. The code is based on a hierarchical tree of cubic cells and features mutual cell-cell interactions which are calculated via a Cartesian Taylor expansion in a symmetric way, such that total momentum is conserved. The code benefits from an improved and simple multipole acceptance criterion that reduces the force error and the computational effort. For N ≥ 10^4 the computational costs are found empirically to rise sublinearly with N. For applications in stellar dynamics, this is the first competitive code with complexity O(N): it is faster than the standard tree code by a factor of 10 or more. (C) 2002 Elsevier Science (USA).
Let the input to a computation problem be split between two processors connected by a communication link; and let an interactive protocol π be known by which, on any input, the processors can solve the problem using no more than T transmissions of bits between them, provided the channel is noiseless in each direction. We study the following question: If in fact the channel is noisy, what is the effect upon the number of transmissions needed in order to solve the computation problem reliably? Technologically this concern is motivated by the increasing importance of communication as a resource in computing, and by the tradeoff in communications equipment between bandwidth, reliability, and expense. We treat a model with random channel noise. We describe a deterministic method for simulating noiseless-channel protocols on noisy channels, with only a constant slowdown. This is an analog for general interactive protocols of Shannon's coding theorem, which deals only with data transmission, i.e., one-way protocols. We cannot use Shannon's block coding method because the bits exchanged in the protocol are determined only one at a time, dynamically, in the course of the interaction. Instead, we describe a simulation protocol using a new kind of code, explicit tree codes.