ISBN (print): 9781479913725
Polygon overlay is one of the more complex operations in Geographic Information Systems (GIS). A typical polygon in GIS tends to be large, often consisting of thousands of vertices. Sequential algorithms for this problem abound in the literature, and most parallel algorithms concentrate on parallelizing the edge-intersection phase only. Our research aims to develop parallel algorithms that find the overlay of two input polygons, can be extended to handle multiple polygons, and can be implemented on General Purpose Graphics Processing Units (GPGPUs), which offer massive parallelism at relatively low cost. Moreover, spatial data files tend to be large (in GBs), and the underlying overlay computation is highly irregular and compute intensive. The MapReduce paradigm is now standard in industry and academia for processing large-scale data. Motivated by the MapReduce programming model, we propose to develop and implement scalable distributed algorithms to solve large-scale overlay processing in this dissertation.
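As a rough illustration of the kind of map/reduce decomposition the abstract describes, the sketch below partitions candidate polygon pairs by spatial grid cell in the map phase and intersects them in the reduce phase. It is not the dissertation's algorithm: the helper names, the grid-cell partitioning, and the use of axis-aligned bounding boxes in place of real polygons are all simplifying assumptions for illustration.

```python
from collections import defaultdict

# Minimal MapReduce-style sketch of pairwise "overlay" (hypothetical, not the
# dissertation's algorithm). Polygons are stood in for by axis-aligned boxes
# (xmin, ymin, xmax, ymax); "overlay" here is just box intersection.

def map_phase(layer_id, boxes, cell=10.0):
    """Emit ((grid cell), (layer_id, box)) pairs for every cell a box touches."""
    for box in boxes:
        xmin, ymin, xmax, ymax = box
        for cx in range(int(xmin // cell), int(xmax // cell) + 1):
            for cy in range(int(ymin // cell), int(ymax // cell) + 1):
                yield (cx, cy), (layer_id, box)

def reduce_phase(grouped):
    """For each cell, intersect every layer-A box with every layer-B box."""
    for _, items in grouped.items():
        layer_a = [b for lid, b in items if lid == "A"]
        layer_b = [b for lid, b in items if lid == "B"]
        for (ax0, ay0, ax1, ay1) in layer_a:
            for (bx0, by0, bx1, by1) in layer_b:
                ix0, iy0 = max(ax0, bx0), max(ay0, by0)
                ix1, iy1 = min(ax1, bx1), min(ay1, by1)
                if ix0 < ix1 and iy0 < iy1:   # keep non-empty intersections
                    yield (ix0, iy0, ix1, iy1)

layer_a = [(0, 0, 12, 12)]
layer_b = [(8, 8, 20, 20)]
grouped = defaultdict(list)
for key, value in list(map_phase("A", layer_a)) + list(map_phase("B", layer_b)):
    grouped[key].append(value)
print(sorted(set(reduce_phase(grouped))))     # -> [(8, 8, 12, 12)]
```

In a real MapReduce or GPGPU setting each grid cell (or batch of cells) would be handled by a separate reducer or thread block, and the intersection test would operate on full polygon geometries rather than bounding boxes.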
ISBN (print): 9781538683873; 9781538683866
Computing the single-source shortest path (SSSP) is one of the fundamental graph algorithms and is used in many applications. Here, we focus on computing SSSP on large dynamic graphs, i.e., graphs whose structure evolves with time. We posit that instead of recomputing the SSSP for each set of changes to a dynamic graph, it is more efficient to update the results based only on the region of change. To this end, we present a novel two-step shared-memory algorithm for updating SSSP on weighted large-scale graphs. The key idea of our algorithm is to identify changes, such as vertex/edge additions and deletions, that affect the shortest-path computation, and to update only the parts of the graph affected by the change. We provide a proof of correctness of our proposed algorithm. Our experiments on real and synthetic networks demonstrate that our algorithm is as much as 4X faster than computing SSSP with Galois, a state-of-the-art parallel graph analysis software for shared-memory architectures. We also demonstrate how increasing the asynchrony can lead to even faster updates. To the best of our knowledge, this is one of the first practical parallel algorithms for updating networks on shared-memory systems that is also scalable to large networks.
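The two-step idea (first identify the vertices affected by a change, then relax outward only from that affected region) can be sketched for the edge-insertion case as below. This is a sequential, simplified sketch under assumed data structures (`graph` as a dict of dicts, every vertex present as a key); the paper's algorithm is parallel, shared-memory, and also handles deletions.

```python
import heapq

def sssp(graph, src):
    """Plain Dijkstra: graph is {u: {v: weight}}, all vertices appear as keys."""
    dist = {v: float("inf") for v in graph}
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def update_after_insertions(graph, dist, inserted_edges):
    """Step 1: find endpoints whose distance improves (the affected set).
       Step 2: relax outward only from those vertices."""
    pq = []
    for u, v, w in inserted_edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})
        dist.setdefault(u, float("inf"))
        dist.setdefault(v, float("inf"))
        if dist[u] + w < dist[v]:          # inserted edge shortens a path
            dist[v] = dist[u] + w
            heapq.heappush(pq, (dist[v], v))
    while pq:                              # propagate from the affected region
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

g = {"s": {"a": 4.0}, "a": {"b": 3.0}, "b": {}}
d = sssp(g, "s")                           # {'s': 0.0, 'a': 4.0, 'b': 7.0}
update_after_insertions(g, d, [("s", "b", 2.0)])
print(d)                                   # b drops from 7.0 to 2.0
```

Only the vertices reachable through the improved edge are touched, which is the source of the speedup over recomputing SSSP from scratch after every batch of changes.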
Computationally efficient evaluation of penalized estimators of multivariate exponential family distributions is sought. These distributions encompass, amongst others, Markov random fields with variates of mixed type (e.g...
ISBN (print): 9781538614839
Frequent itemset mining (FIM) plays an important role in many data mining areas. With the explosion of data scale, a number of parallel FIM algorithms have been proposed. Although existing solutions have outstanding scalability, they suffer from high CPU and memory consumption because they recursively mine frequent itemsets over a tree structure. In this paper, we propose a novel parallel algorithm, named PNPFI, which employs three key optimizations. First, itemsets are stored in the N-list structure, which is more compact than existing tree-based structures. Second, a new structure, called P-Subsume, is used to generate some frequent itemsets without performing N-list intersections. Third, PNPFI introduces a new load-balancing strategy, which intelligently divides a large-scale FIM problem into a set of tasks based on the profiled load of each item. Experimental results show that, compared with state-of-the-art algorithms, PNPFI achieves a performance improvement of 39% on average (up to 79%) and reduces memory usage by 58% on average (up to 90%).
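The load-balancing idea of dividing the work by the profiled cost of each item can be illustrated with a simple greedy assignment: items are sorted by estimated cost and each is given to the currently least-loaded worker. This is a hypothetical sketch of the general technique only; PNPFI's actual load model and task granularity differ.

```python
import heapq

def divide_tasks(item_loads, num_workers):
    """item_loads: {item: estimated mining cost}; returns {worker_id: [items]}.
    Greedy longest-processing-time assignment onto the least-loaded worker."""
    workers = [(0.0, wid, []) for wid in range(num_workers)]   # (load, id, items)
    heapq.heapify(workers)
    for item, cost in sorted(item_loads.items(), key=lambda kv: -kv[1]):
        load, wid, items = heapq.heappop(workers)              # least-loaded worker
        items.append(item)
        heapq.heappush(workers, (load + cost, wid, items))
    return {wid: items for _, wid, items in workers}

loads = {"a": 9.0, "b": 7.0, "c": 4.0, "d": 3.0, "e": 1.0}
print(divide_tasks(loads, 2))   # -> {0: ['a', 'd'], 1: ['b', 'c', 'e']}
```

Balancing by estimated per-item cost (rather than splitting items evenly) is what keeps the workers' running times close to each other when the mining cost per item is highly skewed.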
Interprocessor communication often dominates the runtime of large matrix computations. We present a parallel algorithm for computing QR decompositions whose bandwidth cost (communication volume) can be decreased at th...
We present original time-parallel algorithms for the solution of the implicit Euler discretization of general linear parabolic evolution equations with time-dependent self-adjoint spatial operators. Motivated by the i...
The Ray-Casting algorithm is an important method for fast real-time surface display from 3D medical images. Based on the Ray-Casting algorithm, a novel parallel Ray-Casting algorithm is proposed in this paper. A novel ope...
This paper presents an efficient technique for matrix-vector and vector-transpose-matrix multiplication in distributed-memory parallel computing environments, where the matrices are unstructured, sparse, and have a ...
A game-theoretical approach to the analysis of parallel algorithms is proposed. The approach is based on presenting parallel computing as a congestion game. In this game, processes compete for resources such as c...
Private Information Retrieval (PIR) enables data owners to share and/or retrieve data on remote repositories without leaking any information as to which data item is requested. Although it is always possible to download the entire dataset, this is clearly a waste of bandwidth. A fundamental approach in the literature for PIR is to exploit homomorphic cryptosystems. In these approaches, not one but many modular exponentiations need to be computed and multiplied to obtain the desired result. This multi-exponentiation can be implemented by exponentiating the bases to their corresponding exponents one by one. However, when the operation is considered as a whole, it can be performed more efficiently. Although individual exponentiations are pleasingly parallelizable, the combined multi-exponentiation requires a careful parallel implementation. In this work, we propose a generic tensor-based PIR scheme and efficient, novel techniques to parallelize multi-exponentiations on multicore processors with perfect load balance. Experimental results show that our load-balancing techniques make a parallel multi-exponentiation up to 27% faster when the size of the bases and exponents is 4096 bits and the number of threads is 16.
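The core operation being parallelized is the multi-exponentiation prod_i b_i^{e_i} mod n. A minimal sketch of splitting it across workers is shown below; the naive even chunking here is an assumption for illustration and is not the paper's load-balancing scheme, and the toy modulus is far smaller than the 4096-bit operands the abstract measures.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import reduce

# Hypothetical sketch of a parallel multi-exponentiation
#   prod_i  base_i ** exp_i  (mod modulus),
# the workhorse of homomorphic-cryptosystem-based PIR. Each worker computes
# the partial product of one chunk; partial results are multiplied together.

def partial_product(args):
    bases, exps, modulus = args
    out = 1
    for b, e in zip(bases, exps):
        out = (out * pow(b, e, modulus)) % modulus
    return out

def multi_exp(bases, exps, modulus, workers=4):
    chunk = max(1, (len(bases) + workers - 1) // workers)   # naive even split
    jobs = [(bases[i:i + chunk], exps[i:i + chunk], modulus)
            for i in range(0, len(bases), chunk)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_product, jobs))
    return reduce(lambda x, y: (x * y) % modulus, parts, 1)

if __name__ == "__main__":
    n = 2 ** 61 - 1                       # toy modulus; real PIR uses 2048+ bits
    bases = list(range(2, 34))
    exps = [3, 5, 7, 11] * 8
    assert multi_exp(bases, exps, n) == partial_product((bases, exps, n))
    print("ok")
```

An even split is only well balanced when all exponents have the same bit length; when exponent sizes vary, chunks must be formed from an estimate of each exponentiation's cost, which is the kind of load balancing the paper targets.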