In this paper, the Laplace transform combined with the local discontinuous Galerkin method is applied to the distributed-order time-fractional diffusion-wave equation. First, we convert the equation into a set of time-independent problems by the Laplace transform. Then, we solve these stationary equations with the local discontinuous Galerkin method, which discretizes the diffusion operators. Next, we recover the solution of the original equation by numerical inversion of the Laplace transform. One advantage of this procedure is that it can be implemented in a parallel environment. Another advantage is that the number of stationary problems to be solved is much smaller than that needed in time-marching methods. Finally, numerical experiments are provided to show the accuracy and efficiency of the method.
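The transform, solve, invert pipeline described above can be illustrated on a scalar toy problem. The sketch below is not the paper's scheme: it assumes the Gaver-Stehfest rule as a simple real-valued inversion formula (the abstract does not say which inversion is used), and `solve_stationary` is a hypothetical stand-in for one local discontinuous Galerkin solve, here the closed-form solution of the transformed test equation u'(t) = -lam u(t).

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction
from math import exp, factorial, log


def stehfest_weights(n):
    """Gaver-Stehfest weights V_1..V_n (n even), built in exact arithmetic."""
    half = n // 2
    weights = []
    for k in range(1, n + 1):
        total = Fraction(0)
        for j in range((k + 1) // 2, min(k, half) + 1):
            total += Fraction(
                j ** half * factorial(2 * j),
                factorial(half - j) * factorial(j) * factorial(j - 1)
                * factorial(k - j) * factorial(2 * j - k),
            )
        weights.append(float((-1) ** (k + half) * total))
    return weights


def solve_stationary(s, lam=1.0, u0=1.0):
    """Hypothetical stand-in for one stationary solve.

    For u'(t) = -lam * u(t), u(0) = u0, the Laplace-transformed
    problem is (s + lam) * U(s) = u0, which has no time variable.
    """
    return u0 / (s + lam)


def invert(t, n=12):
    """Approximate u(t) from n independent stationary solves."""
    nodes = [k * log(2.0) / t for k in range(1, n + 1)]
    # The solves do not depend on each other, so they can run concurrently.
    with ThreadPoolExecutor() as pool:
        values = list(pool.map(solve_stationary, nodes))
    weights = stehfest_weights(n)
    return log(2.0) / t * sum(w * v for w, v in zip(weights, values))


# Compare against the exact solution exp(-t) at t = 1.
error = abs(invert(1.0) - exp(-1.0))
```

Only `n` stationary problems are solved per output time, regardless of how many steps a time-marching scheme would need to reach `t`.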
Constructing efficient algorithms for graph problems is a fundamental problem in computer science. This dissertation studies algorithms for large scale graphs and focuses on directed graphs. We consider directed graph problems in several models of computation with the goal of improving performance for large scale graphs. The models of computation include the external memory model, the work-span model for parallel algorithms, and the asymmetric RAM model. The hypothesis is that there exist provably efficient algorithms for directed graph problems on large scale graphs. The following results are presented. First, an I/O-efficient algorithm for topologically sorting a directed acyclic graph, as well as an algorithm for identifying and topologically sorting the strongly connected components of a directed graph. Both algorithms cost $O\left(\frac{E}{B} \log_{M/B} \frac{V}{B} \cdot \log^{4} V\right)$ I/Os and are the first I/O-efficient algorithms for these problems for sparse graphs. Second, this work shows an algorithm for constructing a $(1+\epsilon)$-approximate directed hopset of size $\tilde{O}(n)$ in $\tilde{O}(m)$ work and with hopbound $n^{1/2+o(1)}$. Our parallel version of the algorithm can be used to solve parallel approximate single-source shortest paths in $\tilde{O}(m)$ work and $n^{1/2+o(1)}$ span. Third, we show a parallel algorithm for distance-limited shortest paths on directed acyclic graphs with $0$ and $-1$ edge weights. The algorithm computes single-source shortest paths for nodes with distance at least $-L$, and runs in $\tilde{O}(m)$ work and $O(L^{1/2} n^{1/2+o(1)})$ span. Finally, we give write-efficient algorithms for breadth-first search, depth-first search and strongly connected components. The standard RAM algorithms for these problems write the solution for each node. Instead, we write only a subset of the nodes to save writes, at the expense of more reads to answer a query. Our result is sublinear-size data structures that can answer queries for each of the three graph problems.
Point density is an important property that dictates the usability of a point cloud data set. This paper introduces an efficient, scalable, parallel algorithm for computing the local point density index, a sophisticated point cloud density metric. Computing the local point density index is non-trivial, because it involves a neighbour search for each individual point in the potentially large input point cloud. Most existing algorithms and software are incapable of computing point density at scale. Therefore, the algorithm introduced in this paper aims to provide both the computational efficiency and the scalability needed to consider this factor in large, modern point clouds such as those collected in national or regional scans. The proposed algorithm is composed of two stages. In stage 1, a point-level, parallel processing step is performed to partition an unstructured input point cloud into partially overlapping, buffered tiles. A buffer is provided around each tile so that the data partitioning does not introduce spatial discontinuity into the final results. In stage 2, the buffered tiles are distributed to different processors for computing the local point density index in parallel. That tile-level parallel processing step is performed using a conventional algorithm with an R-tree data structure. While straightforward, the proposed algorithm is efficient and particularly suitable for processing large point clouds. Experiments conducted using a 1.4 billion point data set acquired over part of Dublin, Ireland demonstrated an efficiency factor of up to 14.8/16. More specifically, the computational time was reduced by a factor of 14.8 when the number of processes (i.e. executors) increased by a factor of 16. Computing the local point density index for the 1.4 billion point data set took just over 5 minutes with 16 executors and 8 cores per executor. The reduction in computational time was nearly 70 times compared to the 6 hours required without
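The two-stage idea in this abstract (buffered tiling, then independent per-tile neighbour searches) can be sketched in a few lines. The sketch makes several assumptions that are not in the abstract: the density measure is taken to be the neighbour count within a fixed `radius` divided by the disc area, the points are 2-D, and a brute-force search replaces the R-tree. The names `buffered_tiles`, `neighbour_counts` and `local_density` are hypothetical.

```python
import random
from collections import defaultdict
from math import floor, pi


def neighbour_counts(core, candidates, radius):
    """Count, for each core point, the other points within `radius`.

    Brute force is used for brevity; the abstract mentions an R-tree.
    """
    r2 = radius * radius
    return {
        i: sum(1 for j, (qx, qy) in candidates
               if j != i and (px - qx) ** 2 + (py - qy) ** 2 <= r2)
        for i, (px, py) in core
    }


def buffered_tiles(points, tile, radius):
    """Stage 1: put each point in its own tile (core) and in the buffer
    of every tile that its radius-sized bounding square touches."""
    core, buffered = defaultdict(list), defaultdict(list)
    for i, (x, y) in enumerate(points):
        core[(floor(x / tile), floor(y / tile))].append((i, (x, y)))
        for tx in range(floor((x - radius) / tile),
                        floor((x + radius) / tile) + 1):
            for ty in range(floor((y - radius) / tile),
                            floor((y + radius) / tile) + 1):
                buffered[(tx, ty)].append((i, (x, y)))
    return core, buffered


def local_density(points, tile, radius):
    """Stage 2: process every tile on its own and merge the results."""
    core, buffered = buffered_tiles(points, tile, radius)
    area = pi * radius * radius
    density = {}
    for key, members in core.items():  # each tile is an independent task
        counts = neighbour_counts(members, buffered[key], radius)
        density.update({i: c / area for i, c in counts.items()})
    return [density[i] for i in range(len(points))]


# Small demonstration on a random cloud with a fixed seed.
rng = random.Random(0)
points = [(rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)) for _ in range(200)]
densities = local_density(points, tile=2.5, radius=0.7)
```

Because every neighbour of a core point lies within `radius` of it, the buffer guarantees that the tiled result matches a search over the whole cloud, which is the property the abstract attributes to the buffer.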
The multiwindow discrete Gabor transform (M-DGT) is an effective time-frequency analysis tool for analysing time-varying signals containing components with multiple frequencies. In this study, fast block time-recursive methods for computing the M-DGT coefficients of a signal and for reconstructing the signal from the transform coefficients are presented, with steps listed in algorithms 1 and 2, respectively, and their implementations using unified parallel lattice structures are also given. The proposed algorithms (algorithm 1 for the forward transform and algorithm 2 for the inverse transform) are compared with (i) the existing serial algorithms in terms of computational complexity and time; and (ii) the existing parallel algorithms in terms of hardware complexity. The results indicate that the proposed algorithms are fast in computing the M-DGT coefficients of a signal and in reconstructing the signal, with reduced hardware complexity.
Networks are very important in the world. In signal processing, the towers are modeled as nodes (vertices) and if two towers communicate, then they have an arc (edge) between them or precisely, they are adjacent. The ...
High Performance Computing (HPC) systems are employed to solve hard problems and rely on parallel algorithms that can have very long execution times, up to several days. These systems are expensive in terms of the comp...
When analyzing power systems, it is often desirable to visualize the network of buses and branches. Here, a new algorithm for producing 2-D network layouts is proposed. The method consists of two steps: first, a matrix of desired distances between all bus pairs is computed based on base voltages and branch reactances and, second, coordinates that minimize the errors between desired and actual distances are found. The parallelization used in the latter step is particularly beneficial for interpreted languages; it is shown that layouts for relatively large systems (a few thousand buses) can be produced within seconds on a standard laptop computer using Python or Matlab. Predefined coordinates for selected buses can optionally be given as input. This can be useful, e.g., when one wants to retain some geographical aspects of the system or wishes to compare a full and a reduced network model. Although the focus here is on power systems, the algorithm can also be used for other types of networks.
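The two steps can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: desired distances are taken to be shortest-path lengths over hypothetical branch lengths (the abstract derives them from base voltages and reactances in a way it does not specify), and the distance errors are reduced with the standard SMACOF (Guttman transform) iteration with unit weights, which is swapped in here because it never increases the stress. The network must be connected.

```python
import random
from math import hypot, inf


def desired_distances(n, branches):
    """Step 1: all-pairs shortest-path lengths (Floyd-Warshall)."""
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i, j, length in branches:
        d[i][j] = d[j][i] = min(d[i][j], length)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def stress(xy, d):
    """Sum of squared errors between actual and desired distances."""
    n = len(xy)
    return sum(
        (hypot(xy[i][0] - xy[j][0], xy[i][1] - xy[j][1]) - d[i][j]) ** 2
        for i in range(n) for j in range(i + 1, n)
    )


def smacof_step(xy, d):
    """Step 2, one iteration: Guttman transform with unit weights.

    Each new coordinate depends only on the previous layout, so all
    buses can be updated at once (vectorised or in parallel).
    """
    n = len(xy)
    new_xy = []
    for i in range(n):
        sx = sy = 0.0
        for j in range(n):
            if j == i:
                continue
            dx, dy = xy[i][0] - xy[j][0], xy[i][1] - xy[j][1]
            dist = hypot(dx, dy)
            if dist > 0.0:
                sx += d[i][j] * dx / dist
                sy += d[i][j] * dy / dist
        new_xy.append((sx / n, sy / n))
    return new_xy


def layout(d, steps=200, seed=0):
    """Random start, then repeated SMACOF steps; returns the layout and
    the stress before and after."""
    rng = random.Random(seed)
    xy = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in d]
    initial = stress(xy, d)
    for _ in range(steps):
        xy = smacof_step(xy, d)
    return xy, initial, stress(xy, d)


# Hypothetical 4-bus radial network with unit branch lengths.
dist = desired_distances(4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)])
coords, initial_stress, final_stress = layout(dist)
```

Fixing the coordinates of selected buses, as the abstract allows, would amount to skipping their update inside `smacof_step`; that variant is not shown here.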
We propose a decomposition framework for the parallel optimization of the sum of a differentiable (possibly nonconvex) function and a nonsmooth (separable), convex one. The latter term is usually employed to enforce structure in the solution, typically sparsity. The main contribution of this work is a novel parallel, hybrid random/deterministic decomposition scheme wherein, at each iteration, a subset of (block) variables is updated at the same time by minimizing local convex approximations of the original nonconvex function. To tackle huge-scale problems, the (block) variables to be updated are chosen according to a mixed random and deterministic procedure, which captures the advantages of both pure deterministic and random update-based schemes. Almost sure convergence of the proposed scheme is established. Numerical results on huge-scale problems show that the proposed algorithm outperforms current schemes.
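A minimal sketch of a hybrid random/deterministic selection rule, under assumptions that go beyond the abstract: the smooth term is a least-squares loss (convex, chosen only to keep the example short), the nonsmooth term is an l1 penalty, each block is a single coordinate, and the local convex approximation is the usual proximal-linear model with a global Lipschitz bound `L`. The parameters `sample` and `keep` and all function names are hypothetical.

```python
import random
from math import copysign


def soft_threshold(v, tau):
    """Proximal operator of tau * |.|."""
    return copysign(max(abs(v) - tau, 0.0), v)


def residual(A, b, x):
    return [sum(a * xi for a, xi in zip(row, x)) - bi for row, bi in zip(A, b)]


def objective(A, b, x, lam):
    r = residual(A, b, x)
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xi) for xi in x)


def hybrid_step(A, b, x, lam, L, rng, sample=0.7, keep=0.5):
    """One iteration: random sampling, then greedy filtering, then a
    simultaneous update of the selected coordinates."""
    n = len(x)
    r = residual(A, b, x)
    grad = [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n)]
    # Minimiser of the proximal-linear model, coordinate by coordinate.
    cand = [soft_threshold(x[j] - grad[j] / L, lam / L) for j in range(n)]
    # Random phase: draw a subset of coordinates.
    pool = rng.sample(range(n), max(1, int(sample * n)))
    # Deterministic phase: keep the sampled coordinates that move most.
    pool.sort(key=lambda j: abs(cand[j] - x[j]), reverse=True)
    chosen = pool[:max(1, int(keep * len(pool)))]
    new_x = list(x)
    for j in chosen:  # independent updates, could run in parallel
        new_x[j] = cand[j]
    return new_x


def minimise(A, b, lam, iters=300, seed=0):
    rng = random.Random(seed)
    # Squared Frobenius norm: an upper bound on the Lipschitz constant.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    history = [objective(A, b, x, lam)]
    for _ in range(iters):
        x = hybrid_step(A, b, x, lam, L, rng)
        history.append(objective(A, b, x, lam))
    return x, history


# Tiny demonstration problem: identity rows plus one coupling row.
A = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)] + [[1.0] * 6]
b = [3.0, -2.0, 1.0, 0.5, -1.0, 2.0, 0.0]
x_hat, history = minimise(A, b, lam=0.1)
```

With `L` chosen as an upper bound on the Lipschitz constant of the gradient, updating any subset of coordinates this way cannot increase the objective, so the selection rule only affects how quickly it decreases.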
In order to improve the computational efficiency of DPM (Decomposition Projective Method) based on orthogonal complement space, this paper reformulates the computing process of the method as a parallel algorithm....
Speedup and efficiency of two parallel algorithms for calculating the dynamics of the current distribution when the surface of a tungsten sample is heated by an electron beam pulse are presented. The algorithms are im...