The paper presents parallel algorithms for calculating the exact value of all-terminal reliability of a network with unreliable edges and absolutely reliable nodes. A random graph is used as a model of such a network. T...
Back-Projection is the major algorithm in Computed Tomography to reconstruct images from a set of recorded projections. It is used for both fast analytical methods and high-quality iterative techniques. X-ray imaging facilities rely on Back-Projection to reconstruct internal structures in material samples and living organisms with high spatial and temporal resolution. Fast image reconstruction is also essential to track and control processes under study in real time. In this article, we present efficient implementations of the Back-Projection algorithm for parallel hardware. We survey a range of parallel architectures presented by the major hardware vendors during the last 10 years. Similarities and differences between these architectures are analyzed, and we highlight how specific features can be used to enhance the reconstruction performance. In particular, we build a performance model to find hardware hotspots and propose several optimizations to balance the load between the texture engine, computational and special function units, as well as different types of memory, maximizing the utilization of all GPU subsystems in parallel. We further show that targeting architecture-specific features allows one to boost the performance 2-7 times compared to the current state-of-the-art algorithms used in standard reconstruction codes. The suggested load-balancing approach is not limited to back-projection but can be used as a general optimization strategy for implementing parallel algorithms.
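As background for the abstract above, the core of unfiltered back-projection can be sketched in a few lines of NumPy: each recorded projection is smeared back across the image along its acquisition angle. This is only an illustrative CPU sketch (the function name, detector layout, and nearest-neighbor interpolation are assumptions); the article's GPU implementations are far more elaborate.

```python
import numpy as np

def back_project(sinogram, angles, size):
    """Naive unfiltered back-projection.

    sinogram : (n_angles, n_detectors) array of projections
    angles   : acquisition angles in radians, one per projection
    size     : side length of the square output image
    """
    recon = np.zeros((size, size))
    center = size // 2
    ys, xs = np.mgrid[0:size, 0:size] - center
    n_det = sinogram.shape[1]
    for proj, theta in zip(sinogram, angles):
        # Detector coordinate hit by each pixel for this angle.
        t = xs * np.cos(theta) + ys * np.sin(theta) + n_det // 2
        idx = np.clip(np.round(t).astype(int), 0, n_det - 1)
        recon += proj[idx]          # smear the projection over the image
    return recon / len(angles)
```

The per-angle loop is embarrassingly parallel, which is exactly where GPU texture units (for the interpolated `proj[idx]` lookup) come into play in the optimized versions the article describes.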
In this paper, the Laplace transform combined with the local discontinuous Galerkin method is used for the distributed-order time-fractional diffusion-wave equation. In this method, we first convert the equation into a set of time-independent problems via the Laplace transform. Then, we solve these stationary equations with the local discontinuous Galerkin method to discretize the diffusion operators at the same time. Next, using a numerical inversion of the Laplace transform, we recover the solution of the original equation. One advantage of this procedure is its capability to be implemented in a parallel environment. Another advantage is that the number of stationary problems to be solved is much smaller than that required by time-marching methods. Finally, some numerical experiments are provided to show the accuracy and efficiency of the method.
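The numerical Laplace inversion step the abstract mentions can be illustrated with the classic Gaver-Stehfest formula, which evaluates the transform F(s) at a handful of independent real points — each evaluation (a stationary problem in the paper's setting) can be solved in parallel. This is a generic sketch of one standard inversion scheme, not necessarily the particular inversion the paper uses.

```python
import math

def stehfest_coeffs(N):
    """Gaver-Stehfest weights V_k for an even number of terms N."""
    V = []
    for k in range(1, N + 1):
        s = 0.0
        for j in range((k + 1) // 2, min(k, N // 2) + 1):
            s += (j ** (N // 2) * math.factorial(2 * j)
                  / (math.factorial(N // 2 - j) * math.factorial(j)
                     * math.factorial(j - 1) * math.factorial(k - j)
                     * math.factorial(2 * j - k)))
        V.append((-1) ** (k + N // 2) * s)
    return V

def stehfest_invert(F, t, N=12):
    """Approximate f(t) from its Laplace transform F(s).

    Each F(k*ln2/t) evaluation is independent of the others,
    so all N transform evaluations can run in parallel.
    """
    ln2t = math.log(2.0) / t
    V = stehfest_coeffs(N)
    return ln2t * sum(Vk * F(k * ln2t) for k, Vk in enumerate(V, 1))
```

For instance, with F(s) = 1/(s+1) the scheme recovers f(t) = e^(-t) to several digits; the number of transform evaluations (here N = 12) is indeed much smaller than the number of time steps a marching scheme would need.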
This paper proposes an event-triggered framework to solve network congestion caused by microgrids (MGs) in regional distributed networks. Two processes are included in this framework: congestion validation process an...
A parallel algorithm for solving the 2D shallow water equations coupled with the convection-diffusion equation has been developed, in order to demonstrate the capability and performance of our parallel approach while ...
We present a randomized O(m log^2 n) work, O(polylog n) depth parallel algorithm for minimum cut. This algorithm matches the work bounds of a recent sequential algorithm by Gawrychowski, Mozes, and Weimann [ICALP'...
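The parallel algorithm in this abstract is far beyond a simple sketch, but the randomized-contraction idea underlying minimum cut can be shown with the classic Karger approach: repeatedly contract edges in a random order until two super-nodes remain, and keep the smallest crossing-edge count seen. This background sketch (sequential, unweighted, with assumed function names) is not the paper's method.

```python
import random

def karger_min_cut(edges, n, trials=200, seed=0):
    """Estimate the min cut of an undirected graph on nodes 0..n-1.

    edges  : list of (u, v) pairs
    trials : independent contraction rounds; more rounds raise the
             probability that the true minimum cut is found
    """
    rng = random.Random(seed)
    best = len(edges)
    for _ in range(trials):
        parent = list(range(n))

        def find(x):
            # union-find with path halving
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        components = n
        pool = edges[:]
        rng.shuffle(pool)              # random contraction order
        for u, v in pool:
            if components == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:               # contract edge (u, v)
                parent[ru] = rv
                components -= 1
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best
```

A single round succeeds with probability at least 2/(n(n-1)), so O(n^2 log n) independent rounds — all trivially parallelizable — find the minimum cut with high probability.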
Triangle listing is an important topic in many practical applications. We have observed that this problem has not yet been studied systematically in the context of batch-dynamic graphs. In this paper, we aim to fill t...
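For context on the problem itself: static triangle listing enumerates every set of three mutually adjacent vertices. A minimal sketch using the standard intersect-neighborhoods approach is below (the ordering trick `u < v < w` reports each triangle exactly once); the batch-dynamic setting the paper studies, where edges arrive and depart in batches, is considerably harder.

```python
def list_triangles(adj):
    """List each triangle (u, v, w) with u < v < w exactly once.

    adj : dict mapping each node to the set of its neighbors
    """
    tris = []
    for u in adj:
        for v in adj[u]:
            if v > u:
                # common neighbors of u and v close a triangle
                for w in adj[u] & adj[v]:
                    if w > v:
                        tris.append((u, v, w))
    return tris
```

On the complete graph K4 this yields the four expected triangles; a batch-dynamic algorithm would instead maintain such listings incrementally as edge batches are inserted or deleted.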
The aim of this article is to show that solvers for tridiagonal Toeplitz systems of linear equations can be efficiently implemented for a variety of modern GPU-accelerated and multicore architectures using OpenACC. We...
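As a sequential baseline for the systems this abstract targets: a tridiagonal Toeplitz system has constant sub-, main-, and super-diagonals, so the standard Thomas algorithm needs only the three scalars and the right-hand side. This is the textbook O(n) sequential solver, not the OpenACC-parallel solvers the article develops (function name and argument order are assumptions).

```python
def solve_tridiag_toeplitz(a, b, c, d):
    """Solve a tridiagonal Toeplitz system by the Thomas algorithm.

    a : constant sub-diagonal, b : constant diagonal,
    c : constant super-diagonal, d : right-hand side vector
    """
    n = len(d)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = c / b
    dp[0] = d[0] / b
    for i in range(1, n):              # forward elimination
        m = b - a * cp[i - 1]
        cp[i] = c / m
        dp[i] = (d[i] - a * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):     # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

Both sweeps carry a loop dependence, which is why GPU-oriented solvers such as those the article implements with OpenACC replace them with parallel schemes like cyclic reduction.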
Making full use of a sequential Delaunay-AFT mesher, a parallel method for the generation of large-scale tetrahedral meshes on distributed-memory machines is developed. To generate meshes with the required properties preserved, a Delaunay-AFT based domain decomposition (DD) technique is employed. Starting from the Delaunay triangulation (DT) covering the problem domain, this technique creates a layer of elements dividing the domain into several zones. The initially coarsely meshed domain is partitioned into DTs of subdomains which can be meshed in parallel. When the size of a subdomain is smaller than a user-specified threshold, it is meshed with the standard Delaunay-AFT mesher. A two-level DD strategy is designed to improve the parallel efficiency of this algorithm. A dynamic load balancing scheme is also implemented using the Message Passing Interface (MPI). Out-of-core meshing is introduced to accommodate excessively large meshes that cannot fit in the available memory (RAM) of the computer. Numerical tests are performed for various complex geometries with thousands of surface patches. Ultra-large-scale meshes with more than ten billion tetrahedral elements have been created. Moreover, the meshes generated with different numbers of DD operations are nearly identical in quality, demonstrating the consistency and the stability of the automatic decomposition algorithm. (C) 2019 Elsevier Ltd. All rights reserved.
The paper introduces a novel model of parallel metaheuristic optimization algorithms. The hierarchical graph model of a parallel optimization algorithm is proposed. It consists of the model for a parallel optimization algorithm at the top level of the hierarchy and the model for a sequential optimization algorithm at the bottom level. The unified representation of a metaheuristic optimization algorithm, which allows representing a class of metaheuristic algorithms, is used. The extension of the proposed model to the parametric hierarchical model is proposed. Graph model transformations for a parallel algorithm analysis and synthesis are introduced. The representation of several metaheuristic algorithms with the proposed model is discussed. (C) 2019 The Authors. Published by Elsevier B.V.