The coalition formation problem (CFP) is a crucial component of multi-agent systems (MAS) and arises in many real-world domains in different variants. This study proposes a parallel metaheuristic algorithm for CFP. Our hybrid method combines two metaheuristic algorithms: Scatter Search and Beam Search. While the former ensures that the algorithm thoroughly explores the search space, the latter exploits the visited regions. We redesign Scatter Search's original implementation to perform the time-consuming, independent parts of the task in parallel. We employ a perturbation mechanism inside Beam Search that performs a big jump in the search space when it cannot find any improvement. Moreover, we design a problem-specific representation that stores meta-information to save significant computational time. The proposed method is examined in parallel and sequential configurations and compared with an exact solver, recent metaheuristic algorithms, and the standard implementation of Scatter Search. The experimental results show that our solution achieves considerable improvements in both configurations.
Temporal graphs change with time and have a lifespan associated with each vertex and edge. These graphs are suited to time-respecting algorithms, in which the traversed edges must have monotonic timestamps. The Interval-centric Computing Model (ICM) is a distributed programming abstraction for designing such temporal algorithms. There has been little work on supporting time-respecting algorithms at large scales for streaming graphs, which are updated continuously at high rates (millions of updates per second), such as in financial and social networks. In this article, we extend the windowed variant of ICM for incremental computing over streaming graph updates. We formalize the properties of temporal graph algorithms and prove that our model of incremental computing over streaming updates is equivalent to batch execution of ICM. We design TARIS, a novel distributed graph platform that implements these incremental computing features. We use efficient data structures to reduce memory access and enhance locality during graph updates. We also propose scheduling strategies to interleave updates with computing, and streaming strategies to adapt the execution window for incremental computing to variable input rates. Our detailed and rigorous evaluation of temporal algorithms on large-scale graphs with up to 2B edges shows that TARIS outperforms contemporary baselines, Tink and Gradoop, by 3-4 orders of magnitude, and handles a high input rate of 83 k to 587 M mutations/s with latencies on the order of seconds to minutes.
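To make the notion of a time-respecting traversal concrete, the following is a minimal sketch, not taken from the article and unrelated to the ICM or TARIS APIs. It assumes each edge carries a single integer timestamp and that "monotonic" means non-decreasing; the `Edge` alias and the function name `is_time_respecting` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical edge encoding for illustration: (source, target, timestamp).
Edge = Tuple[str, str, int]


def is_time_respecting(path: List[Edge]) -> bool:
    """Return True if consecutive edges are connected and their
    timestamps never decrease along the path (assumed definition)."""
    for (_, v1, t1), (u2, _, t2) in zip(path, path[1:]):
        # Each edge must start where the previous one ended,
        # and must not be traversed earlier than the previous one.
        if v1 != u2 or t2 < t1:
            return False
    return True
```

For example, a path a→b at time 1 followed by b→c at time 3 passes this check, while the same path with timestamps 5 and then 3 does not.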
In this study, we present a novel interpolation scheme for chimera simulations in CFD that treats flows with discontinuities. This scheme is suitable for interpolation over polyhedral meshes using only scattered data and without the use of stencils. During the interpolation data transfer among overset meshes, both solution variables and their gradients are communicated and then used for the interpolation process. First, the overset topology problem in partitioned polyhedral meshes is addressed, and then a new interpolation algorithm for generally discontinuous fields is introduced. Then, we describe how the gradient approximation of the Finite Volume (FV) method is used to construct bounded approximations of values in the direction of discontinuities and to enhance the accuracy of the low-order Nearest Neighbor Value (NNV) algorithm. The performance of the proposed algorithm is quantified and validated in various test cases, together with a comparison with NNV. Finally, scalability tests are presented to demonstrate computational efficiency. The method is shown to be highly accurate in propagation cases and performs well in unsteady two-phase problems executed on parallel architectures.
The article presents mathematical research that develops and examines the properties of high-performance smoothed particle hydrodynamics (SPH)-based algorithms for solving continuum mechanics problems on central processing unit (CPU) and hybrid architectures. Details include the advantages of using SPH, error estimates for SPH-type approximations, and the selection of parameters. Also discussed are parallel algorithms for SPH, including computation time and acceleration.
ISBN: (print) 9798400708435
Finding cohesive subgraphs is a crucial graph analysis kernel widely used for social and biological networks (graphs). There exist various approaches for discovering insightful substructures in a network, such as finding cliques, community discovery, and truss decomposition. Finding cliques is a computationally intractable problem, making it difficult to identify cohesive subgraphs in large graphs. One possible solution is k-truss decomposition, which is a relaxed form of finding cliques that can be solved in polynomial time. Further, unlike global community detection, which focuses on breaking down the entire graph into disjoint communities, a local or goal-oriented community search aims at finding the community of an entity of interest. In this work, we identify a k-truss-induced community discovery technique that can detect local communities in polynomial time. However, most previous studies have explored k-truss-induced local community formation in a serial setting, making them unsuitable for large graphs. In this paper, we design a parallel k-truss-induced local community construction method using multi-core parallelism. To the best of our knowledge, this is the first attempt to parallelize this algorithmic approach with extensive performance analysis. Our experiments demonstrate a significant performance improvement, with speedups from 19x to 55x for graphs with hundreds of millions to billions of edges, using NERSC Perlmutter compute nodes.
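As background for the k-truss notion used above, the sketch below computes the k-truss of a small undirected graph by repeated edge peeling: an edge survives only if it lies in at least k - 2 triangles of the remaining graph. This is the standard serial definition written naively for clarity. It is not the paper's parallel local-community method, and the function name `k_truss` and the edge-list input format are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Set, Tuple


def k_truss(edges: Iterable[Tuple[Hashable, Hashable]], k: int) -> Set[FrozenSet]:
    """Return the edges of the k-truss as a set of 2-element frozensets."""
    # Build an undirected adjacency map, ignoring self-loops.
    adj: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    # Peel edges until every remaining edge has enough triangle support.
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Support of (u, v) = number of common neighbours,
                # i.e. the number of triangles containing the edge.
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True

    return {frozenset((u, v)) for u in adj for v in adj[u]}
```

On a 4-clique with one extra pendant edge, the 4-truss keeps the six clique edges and drops the pendant edge, since that edge belongs to no triangle. Removing an edge can lower the support of its neighbours, which is why the loop repeats until nothing changes.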
The paper proposes a new testing technique for concurrent programs. The technique is a form of specification-based testing. For a formal specification S and a concurrent program P, state sequences are generated from P and checked for acceptance by S. We suppose that S is specified in Maude and P is implemented in Java. Java Pathfinder (JPF) and Maude are then used, respectively, to generate state sequences from P and to check whether those state sequences are accepted by S. Even when it is not checking for property violations, JPF often encounters the notorious state space explosion while merely generating state sequences. Thus, we propose a technique that generates state sequences from P and checks whether they are accepted by S in a stratified way. A tool is developed to support the proposed technique, which lends itself naturally to parallel processing. Experiments demonstrate that the proposed technique mitigates the state space explosion, which cannot be achieved with the straightforward use of JPF.
In this paper, we present a new parallel accurate algorithm, called PAccSumK, for summing floating-point numbers. It is based on the AccSumK algorithm. In our experiments on summation problems with large condition numbers, the algorithm outperforms the PSumK algorithm in both accuracy and computing time. The reason is that our algorithm is based on AccSumK, which is more accurate than the SumL algorithm used in PSumK. The proposed parallel algorithm is designed to compute a result as if computed internally in K-fold the working precision. Numerical results are presented showing the performance and accuracy of the new parallel summation algorithm. (c) 2021 Elsevier B.V. All rights reserved.
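To illustrate what "as if computed in K-fold the working precision" can mean in practice, here is a minimal serial sketch of the classic cascaded summation built on the error-free TwoSum transformation (the SumK scheme of Ogita, Rump, and Oishi). It is background only: it is neither AccSumK nor the parallel PAccSumK of the paper, and the function names `two_sum`, `vec_sum`, and `sum_k` are chosen here for illustration.

```python
from typing import List, Sequence, Tuple


def two_sum(a: float, b: float) -> Tuple[float, float]:
    """Error-free transformation: returns (s, e) with s = fl(a + b)
    and a + b = s + e exactly (barring overflow)."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e


def vec_sum(p: Sequence[float]) -> List[float]:
    """One distillation pass: the exact total is preserved, the running
    sum ends up in the last entry and rounding errors in the others."""
    q = list(p)
    for i in range(1, len(q)):
        q[i], q[i - 1] = two_sum(q[i], q[i - 1])
    return q


def sum_k(p: Sequence[float], k: int) -> float:
    """Sum p with k - 1 distillation passes followed by a plain sum."""
    q = list(p)
    if not q:
        return 0.0
    for _ in range(k - 1):
        q = vec_sum(q)
    total = 0.0
    for x in q[:-1]:  # plain left-to-right sum of the error terms
        total += x
    return total + q[-1]
```

With the ill-conditioned input `[1e16, 1.0, -1e16]`, plain left-to-right double-precision addition loses the `1.0`, whereas `sum_k(..., 2)` recovers it because the rounding error of the first addition is carried along explicitly.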
We present BiqBin, an exact solver for linearly constrained binary quadratic problems. Our approach first uses an exact penalty method to efficiently transform the original problem into an instance of Max-Cut, and then solves the Max-Cut problem with a branch-and-bound algorithm. All the main ingredients are carefully developed: new semidefinite programming relaxations obtained by strengthening the existing relaxations with a set of hypermetric inequalities, the bundle method as the bounding routine, and new strategies for exploring the branch-and-bound tree. Furthermore, an efficient C implementation of a sequential and a parallel branch-and-bound algorithm is presented. The latter is based on a load coordinator-worker scheme using MPI for multi-node parallelization and is evaluated on a high-performance computer. The new solver is benchmarked against BiqCrunch, GUROBI, and SCIP on four families of (linearly constrained) binary quadratic problems. Numerical results demonstrate that BiqBin is a highly competitive solver. The serial version outperforms the other three solvers on the majority of the benchmark instances. We also evaluate the parallel solver and show that it has good scaling properties. The solver is available to a general audience as an online service at http://***.
Motivated by challenges in modeling Earth's mantle convection, we present a massively parallel implementation of an Eulerian-Lagrangian method for the advection-diffusion equation in the advection-dominated regime. The advection term is treated by a particle-based characteristics method coupled to a block-structured finite element framework. Its numerical and computational performance is evaluated in multiple two- and three-dimensional benchmarks, including curved geometries, discontinuous solutions, and pure advection, and it is applied to a coupled nonlinear system modeling buoyancy-driven convection in Stokes flow. We demonstrate the parallel performance in strong and weak scaling experiments, with scalability up to 147,456 parallel processes, solving for more than 5.2 × 10^10 (52 billion) degrees of freedom per time step.
We present a stochastic method for efficiently computing the solution of time-fractional partial differential equations (fPDEs) that model anomalous diffusion problems of the subdiffusive type. After discretizing the fPDE in space, the ensuing system of fractional linear equations is solved by a Monte Carlo evaluation of the corresponding Mittag-Leffler matrix function. This is accomplished by approximating the expected value of a suitable multiplicative functional of a stochastic process, which consists of a Markov chain whose sojourn times in every state are Mittag-Leffler distributed. The resulting algorithm can calculate the solution at conveniently chosen points in the domain with high efficiency. In addition, we show how to generalize this algorithm to compute the complete solution. For several large-scale numerical problems, our method showed remarkable performance on both shared-memory and distributed-memory systems, achieving nearly perfect scalability up to 16,384 CPU cores.