Given the permutation routing problem on mesh-connected arrays with a known maximum distance, d, between any source-destination pair, we show how sorting and the greedy algorithm can be combined to yield a deterministic, asymptotically optimal algorithm for solving the problem. This simple algorithm runs in d + O(d/f(d)) time and requires an O(f(d)) buffer size (or O(d) time and constant buffer size if we choose f(d) to be a constant). It also gives efficient solutions to the k-k routing problem with locality.
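As a rough illustration of the greedy component only, the sketch below routes a single packet on a 2D mesh by first correcting its column and then its row, and computes the maximum source-destination distance d for a permutation. The function names `greedy_path` and `max_distance` are hypothetical, and the sorting phase, the buffering, and the d + O(d/f(d)) schedule described in the abstract are not modeled here.

```python
def greedy_path(src, dst):
    """Return the cells visited by a packet routed row-first, then column-wise.

    src and dst are (row, col) pairs on a 2D mesh. This is the usual
    dimension-order greedy path; it ignores contention between packets.
    """
    (r, c), (r1, c1) = src, dst
    path = [(r, c)]
    while c != c1:            # move along the row until the column matches
        c += 1 if c1 > c else -1
        path.append((r, c))
    while r != r1:            # then move along the column to the target row
        r += 1 if r1 > r else -1
        path.append((r, c))
    return path


def max_distance(permutation):
    """Maximum Manhattan distance d over all source -> destination pairs.

    permutation maps each source (row, col) to its destination (row, col).
    """
    return max(abs(s[0] - t[0]) + abs(s[1] - t[1])
               for s, t in permutation.items())


if __name__ == "__main__":
    perm = {(0, 0): (1, 1), (0, 1): (0, 0), (1, 0): (0, 1), (1, 1): (1, 0)}
    print(max_distance(perm))
    print(greedy_path((0, 0), (2, 3)))
```

Each greedy path has exactly as many hops as the Manhattan distance between its endpoints, which is why d is a lower bound on the routing time.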
Fractals are now widely used, so the time needed to generate them has become an important research topic for complex iteration functions: the escape-time algorithm (ETA), the most widely used algorithm for fractal generation, performs poorly in this setting. To address this problem, we adapt ETA to a parallel environment and achieve good performance. First, we provide a decomposition method for ETA that maps it onto a SIMC-MC2 grid. Second, we prove its correctness and compute the complexity of this novel parallel algorithm. We also decompose an improved ETA, which we presented previously, for the same parallel environment and compute its complexity. In addition, theoretical and experimental results show the characteristics of this novel algorithm. Finally, the computational results show that a new environment is needed to reduce the large manual allocation strategies that limit the benefit of the improvement.
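For readers unfamiliar with the escape-time algorithm, a minimal serial sketch is given below, assuming the quadratic map z -> z*z + c (the Mandelbrot set) as the iteration function; the abstract does not say which functions the authors use. The names `escape_time` and `render` and all default parameters are illustrative. Each pixel is computed independently of the others, which is the property a parallel decomposition can exploit.

```python
def escape_time(c, max_iter=100, radius=2.0):
    """Number of iterations of z -> z*z + c before |z| exceeds radius.

    Returns max_iter if the orbit stays bounded for max_iter steps.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > radius:
            return n
        z = z * z + c
    return max_iter


def render(width, height, x_range=(-2.0, 1.0), y_range=(-1.0, 1.0),
           max_iter=100):
    """Escape-time counts on a width x height grid of pixel centres."""
    x0, x1 = x_range
    y0, y1 = y_range
    grid = []
    for j in range(height):
        y = y0 + (j + 0.5) * (y1 - y0) / height
        row = []
        for i in range(width):
            x = x0 + (i + 0.5) * (x1 - x0) / width
            row.append(escape_time(complex(x, y), max_iter))
        grid.append(row)
    return grid


if __name__ == "__main__":
    for row in render(40, 16, max_iter=30):
        print("".join("#" if n == 30 else "." for n in row))
```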
We consider an unusual shear layer occurring between two parallel Couette flows. Contrary to the classical free shear layer, the width of the shear zone does not vary in the streamwise direction but rather exhibits a lateral variation. Based on some simplifying assumptions, an analytic solution is derived for this shear layer. These assumptions are justified by a comparison with numerical solutions of the full Navier-Stokes equations, which agree with the analytical solution to better than 1% in the entire domain. An explicit formula is found for the width of the shear zone as a function of the wall-normal coordinate. This width is independent of the wall velocities in the laminar regime. Preliminary results for a cocurrent laminar-turbulent shear layer in the same geometry are also presented. In that case, shear-layer instabilities develop and result in an unsteady mixing zone at the interface between the two cocurrent streams.
In this paper, we present a new parallel accurate algorithm called PAccSumK for computing the sum of floating-point numbers. It is based on the AccSumK algorithm. In our experiments on summation problems with large condition numbers, our algorithm outperforms the PSumK algorithm in terms of both accuracy and computing time. The reason is that AccSumK, on which our algorithm is based, is more accurate than the SumL algorithm used in PSumK. The proposed parallel algorithm is designed to compute a result as if computed internally in K-fold the working precision. Numerical results are presented showing the performance and the accuracy of our new parallel algorithm for summation. (c) 2021 Elsevier B.V. All rights reserved.
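To make "as if computed in K-fold the working precision" concrete, here is a minimal serial sketch of the classical SumK scheme of Ogita, Rump and Oishi, built on Knuth's error-free TwoSum transformation. This is a stand-in for illustration, not the AccSumK or PAccSumK algorithm of the abstract, whose details are not given here. The function names are hypothetical, and the sketch assumes IEEE 754 double precision with round-to-nearest and no overflow.

```python
def two_sum(a, b):
    """Error-free transformation: returns (s, e) with s = fl(a + b), s + e = a + b."""
    s = a + b
    bv = s - a
    e = (a - (s - bv)) + (b - bv)
    return s, e


def naive_sum(values):
    """Plain left-to-right floating-point summation, for comparison."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_k(values, k=2):
    """Sum with k - 1 error-free vector passes, then plain summation."""
    p = [float(v) for v in values]
    if not p:
        return 0.0
    for _ in range(k - 1):
        # One pass moves the running sum into p[i] and leaves the
        # rounding error of each addition in p[i - 1].
        for i in range(1, len(p)):
            p[i], p[i - 1] = two_sum(p[i], p[i - 1])
    return naive_sum(p[:-1]) + p[-1]


if __name__ == "__main__":
    data = [1e16, 1.0, -1e16]     # ill-conditioned: exact sum is 1
    print(naive_sum(data))        # loses the 1.0 to rounding
    print(sum_k(data, k=2))       # recovers it
```

The explicit `naive_sum` loop is used instead of Python's built-in `sum`, because recent Python versions already apply compensated summation to floats, which would hide the effect being shown.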
The advancement of the powertrain control increases the amount of computation. Mass production ECU (Electronic Control Unit), which is made of single-core architecture, cannot have a higher clock speed. Using multi/ma...
We present Talbot Suite, a parallel software collection written in C for the numerical inversion of Laplace transforms, based on Talbot's method. It is designed to fit both single and multiple Laplace inversion problems, which arise in several application and research fields. Our software achieves high accuracy and efficiency by making full use of modern architectures and introducing two levels of parallelism: coarse-grained and fine-grained. They offer a reasonable tradeoff between accuracy, the main concern when only a few inversions are needed, and efficiency, the main concern for multiple inversions. To take into account modern high-performance computing architectures, Talbot Suite provides different software versions: an OpenMP-based version for shared-memory machines and an MPI-based version for distributed-memory machines. For hybrid architectures, a combined MPI/OpenMP-based implementation is also provided. We describe our parallel algorithms and the software organization, and we report some performance results. Our software includes sample programs that call the Talbot Suite functions from C and from MATLAB.
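The idea behind Talbot-type inversion can be sketched in a few lines. The example below uses the fixed-Talbot variant of Abate and Valkó, which applies the trapezoidal rule on the contour s(θ) = rθ(cot θ + i) with r = 2M/(5t). It is an assumption-laden illustration in double precision, not Talbot Suite's algorithm or API; the name `talbot_invert` and the default M = 24 are hypothetical choices.

```python
import cmath
import math


def talbot_invert(F, t, M=24):
    """Approximate f(t) from its Laplace transform F(s) by fixed-Talbot quadrature.

    F must accept complex arguments; t must be positive. M is the number
    of trapezoidal nodes on the deformed Bromwich contour.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    r = 2.0 * M / (5.0 * t)
    # Node at theta = 0, where s = r and the weight is 1/2.
    total = 0.5 * complex(F(complex(r, 0.0))).real * math.exp(r * t)
    for k in range(1, M):
        theta = k * math.pi / M
        cot = math.cos(theta) / math.sin(theta)
        s = r * theta * complex(cot, 1.0)            # contour point s(theta)
        sigma = theta + (theta * cot - 1.0) * cot    # from the derivative s'(theta)
        total += (cmath.exp(t * s) * F(s) * complex(1.0, sigma)).real
    return (r / M) * total


if __name__ == "__main__":
    # F(s) = 1 / (s + 1) is the transform of f(t) = exp(-t).
    approx = talbot_invert(lambda s: 1.0 / (s + 1.0), 1.0)
    print(approx, math.exp(-1.0))
```

The terms of the sum are independent of one another, and inversions at different values of t are independent as well, which is where fine-grained and coarse-grained parallelism can respectively be applied.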
Motivated by challenges in the Earth's mantle convection, we present a massively parallel implementation of an Eulerian-Lagrangian method for the advection-diffusion equation in the advection-dominated regime. The advection term is treated by a particle-based characteristics method coupled to a block-structured finite element framework. Its numerical and computational performance is evaluated in multiple two- and three-dimensional benchmarks, including curved geometries, discontinuous solutions, and pure advection, and it is applied to a coupled nonlinear system modeling buoyancy-driven convection in Stokes flow. We demonstrate the parallel performance in a strong and weak scaling experiment, with scalability to up to 147,456 parallel processes, solving for more than 5.2 × 10^10 (52 billion) degrees of freedom per time-step.
Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named the parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial location. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. Our work greatly extends the usability of distance fields for demanding applications.
A force field formulator for organic molecules (FF-FOM) was developed to assign bond, angle, and dihedral parameters to arbitrary organic molecules, including proteins and nucleic acids, in a unified manner. With the unified force field parametrization we performed massively parallel computations of absolute binding free energies for pharmaceutical target proteins and ligands. Compared with the previous calculation with the ff99 force field in the Amber simulation package (Amber99) and the ligand charges produced by the Austin Model 1 bond charge correction (AM1-BCC), the unified parametrization gave better absolute binding energies for the FK506 binding protein (FKBP) and ligand system. Our method calculates the free energy difference from extensive work measurements between thermodynamic states, and it is equivalent to traditional free energy perturbation. There are three important requirements for accurate calculations. The first is a well-equilibrated bound structure including the conformational change of the protein induced by the binding of the ligand. The second is the convergence of the work distribution with a sufficient number of trajectories and dense spacing of the coupling constant between the ligand and the rest of the system. Finally, the most important requirement is the force field parametrization.