Search Results - Inner Mongolia University Library

International Forum on Computer and Information Technology (IFCIT)

Author: Li JianFu. Affiliation: Civil Aviat Univ China, Sch Comp Sci & Technol, Tianjin 300300, Peoples R China

ISBN: (print) 9783038350194

With the arrival of the intermodality era, designing fast and efficient K shortest paths (KSP) algorithms is gradually becoming one of the core technologies in traveler information systems. Yen's algorithm is a classical algorithm for KSP, but it is time-consuming. Owing to their powerful general-purpose computing capabilities, GPUs (Graphics Processing Units) have received increasing attention in various fields. Based on the CUDA software development environment and the structure of the Yen algorithm itself, this paper proposes two parallel algorithms for Yen. The first computes candidate shortest paths for every possible deviation node in parallel. The second computes candidate shortest paths serially, but computes every candidate path in parallel. Finally, the efficiency of the two parallel algorithms is tested through experiments.

Keywords: K Shortest Paths; Yen; parallel algorithms; GPU (Graphics Processing Units)
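
The first scheme in the abstract above (one independent task per deviation node) can be illustrated with a minimal Python sketch. This is not the paper's CUDA implementation: the thread pool only shows how the work decomposes, the function names (`dijkstra`, `spur_candidate`, `yen_ksp`) and the adjacency-dict graph format are assumptions made for this example, and CPython threads will not give GPU-like speedups.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def dijkstra(graph, source, target, banned_nodes=frozenset(), banned_edges=frozenset()):
    """Return (cost, path) of a shortest path, or None if target is unreachable."""
    heap = [(0, source, [source])]
    settled = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt in banned_nodes or nxt in settled or (node, nxt) in banned_edges:
                continue
            heapq.heappush(heap, (cost + weight, nxt, path + [nxt]))
    return None


def path_cost(graph, path):
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


def spur_candidate(graph, accepted, last_path, i, target):
    """Candidate path that deviates from last_path at its i-th node."""
    root = last_path[: i + 1]
    # Ban the edges that already-accepted paths take out of the same root.
    banned_edges = {(p[i], p[i + 1]) for p in accepted if p[: i + 1] == root}
    banned_nodes = frozenset(root[:-1])  # keeps candidates loop-free
    found = dijkstra(graph, root[-1], target, banned_nodes, banned_edges)
    if found is None:
        return None
    full = root[:-1] + found[1]
    return path_cost(graph, full), full


def yen_ksp(graph, source, target, k, workers=4):
    """Return up to k loop-free shortest paths, cheapest first."""
    first = dijkstra(graph, source, target)
    if first is None:
        return []
    accepted = [first[1]]
    candidates = []  # heap of (cost, path)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while len(accepted) < k:
            last = accepted[-1]
            # One independent task per deviation node of the last accepted path.
            tasks = [pool.submit(spur_candidate, graph, accepted, last, i, target)
                     for i in range(len(last) - 1)]
            for task in tasks:
                cand = task.result()
                if cand is not None and cand not in candidates and cand[1] not in accepted:
                    heapq.heappush(candidates, cand)
            if not candidates:
                break
            accepted.append(heapq.heappop(candidates)[1])
    return accepted
```

The `accepted` list is only read while the tasks run, so the workers need no locking; the second scheme in the abstract would instead keep this loop serial and parallelize the inside of each shortest-path search.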

Theoretically-Efficient and Practical Parallel DBSCAN

ACM SIGMOD International Conference on Management of Data (SIGMOD)

Authors: Wang, Yiqiu; Gu, Yan; Shun, Julian. Affiliations: MIT CSAIL, Cambridge MA 02139 USA; UC Riverside, Riverside CA USA

ISBN: (print) 9781450367356

The DBSCAN method for spatial clustering has received significant attention due to its applicability in a variety of data analysis tasks. There are fast sequential algorithms for DBSCAN in Euclidean space that take O(n log n) work for two dimensions, sub-quadratic work for three or more dimensions, and can be computed approximately in linear work for any constant number of dimensions. However, existing parallel DBSCAN algorithms require quadratic work in the worst case. This paper bridges the gap between theory and practice of parallel DBSCAN by presenting new parallel algorithms for Euclidean exact DBSCAN and approximate DBSCAN that match the work bounds of their sequential counterparts, and are highly parallel (polylogarithmic depth). We present implementations of our algorithms along with optimizations that improve their practical performance. We perform a comprehensive experimental evaluation of our algorithms on a variety of datasets and parameter settings. Our experiments on a 36-core machine with two-way hyper-threading show that our implementations outperform existing parallel implementations by up to several orders of magnitude, and achieve speedups of up to 33x over the best sequential algorithms.

Keywords: spatial clustering; parallel algorithms; DBSCAN
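
For readers unfamiliar with the method, here is a minimal sequential DBSCAN sketch in Python. It uses a brute-force range query, so it does the quadratic work that the abstract above sets out to avoid, and it is not the authors' algorithm. The names `dbscan`, `eps`, `min_pts` and `NOISE` are choices made for this example.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    def neighbors(i):
        # Brute-force range query: O(n) per call, O(n^2) overall.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels
```

A point counts itself among its neighbors here, which is one common convention; implementations differ on this detail.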

MULTILEVEL SECOND-MOMENT METHODS WITH GROUP DECOMPOSITION FOR MULTIGROUP TRANSPORT PROBLEMS

2021 International Conference on Mathematics and Computational Methods Applied to Nuclear Science and Engineering, M and C 2021

Authors: Anistratov, Dmitriy Y.; Coale, Joseph M.; Warsa, James S.; Chang, Jae H. Affiliations: Department of Nuclear Engineering, North Carolina State University, Raleigh NC 27695-7909, United States; Los Alamos National Laboratory, Los Alamos NM 87545, United States

ISBN: (print) 9781713886310

This paper presents multilevel iterative schemes for solving the multigroup Boltzmann transport equations (BTEs) with parallel calculation of group equations. They are formulated with multigroup and grey low-order equations of the Second-Moment (SM) method. The group high-order BTEs and low-order SM (LOSM) equations are solved in parallel. To further improve convergence and increase the computational efficiency of the algorithms, Anderson acceleration is applied to inner iterations for solving the system of multigroup LOSM equations. Numerical results are presented to demonstrate performance of the multilevel iterative methods. Copyright © 2021 AMERICAN NUCLEAR SOCIETY, INCORPORATED, LA GRANGE PARK, ILLINOIS *** rights reserved.

Keywords: Anderson acceleration; Boltzmann equation; iterative methods; multigroup problems; parallel algorithms; particle transport
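
Anderson acceleration, named in the abstract above, is a general technique for speeding up fixed-point iterations x = g(x). The following Python sketch shows the depth-1 variant on a generic fixed-point map. It is only an illustration of the idea: it says nothing about how the authors apply it to the LOSM equations, and the function name `anderson_m1` and its parameters are assumptions.

```python
def anderson_m1(g, x0, tol=1e-12, max_iter=100):
    """Solve x = g(x) for a list-valued x with Anderson acceleration of depth 1.

    Returns (x, number_of_evaluations_of_g).
    """
    x_prev, gx_prev = None, None
    x = list(x0)
    for k in range(1, max_iter + 1):
        gx = g(x)
        f = [a - b for a, b in zip(gx, x)]  # residual g(x) - x
        if max(abs(v) for v in f) < tol:
            return x, k
        if x_prev is None:
            x_next = gx  # first step: plain fixed-point update
        else:
            f_prev = [a - b for a, b in zip(gx_prev, x_prev)]
            df = [a - b for a, b in zip(f, f_prev)]
            denom = sum(d * d for d in df)
            # Least-squares mixing weight; plain step if the residuals coincide.
            gamma = sum(a * d for a, d in zip(f, df)) / denom if denom > 0 else 0.0
            x_next = [a - gamma * (a - b) for a, b in zip(gx, gx_prev)]
        x_prev, gx_prev, x = x, gx, x_next
    return x, max_iter
```

For a scalar problem this update reduces to the secant method on g(x) - x, which is why it needs far fewer evaluations of g than the unaccelerated iteration on a test map such as x = cos(x).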

Depth-First Search in Directed Planar Graphs, Revisited

46th International Symposium on Mathematical Foundations of Computer Science, MFCS 2021

Authors: Allender, Eric; Chauhan, Archit; Datta, Samir. Affiliations: Rutgers University, Piscataway NJ, United States; Chennai Mathematical Institute, India

ISBN: (print) 9783959772013

We present an algorithm for constructing a depth-first search tree in planar digraphs; the algorithm can be implemented in the complexity class AC^1(UL ∩ co-UL), which is contained in AC^2. Prior to this (for more than a quarter-century), the fastest uniform deterministic parallel algorithm for this problem was O(log^10 n) (corresponding to the complexity class AC^10 ⊆ NC^11). We also consider the problem of computing depth-first search trees in other classes of graphs, and obtain additional new upper bounds. © Eric Allender, Archit Chauhan, and Samir Datta; licensed under Creative Commons License CC-BY 4.0. 46th International Symposium on Mathematical Foundations of Computer Science (MFCS 2021).

Keywords: parallel algorithms

Parallel Data Partitioning Algorithms for Optimization of Data-Parallel Applications on Modern Extreme-Scale Multicore Platforms for Performance and Energy

IEEE ACCESS, 2018, Vol. 6, pp. 69075-69106

Authors: Manumachu, Ravi Reddy; Lastovetsky, Alexey. Affiliation: Univ Coll Dublin, Sch Comp Sci, Dublin D04 V1W8 4, Ireland

Data partitioning algorithms aiming to minimize the execution time and the energy of computations in self-adaptable data-parallel applications on modern extreme-scale multicore platforms must address two critical challenges. First, they must take into account the new complexities inherent in these platforms such as severe resource contention and non-uniform memory access. Second, they must have low practical runtime and memory costs. The sequential data partitioning algorithms addressing the first challenge have a theoretical time complexity of O(m*m*p*p) where m is the number of points in the discrete speed/energy function and p is the number of available processors. They, however, exhibit high practical runtime cost and excessive memory footprint, rendering them impracticable for employment in self-adaptable applications executing on extreme-scale multicore platforms. We present, in this paper, the parallel data partitioning algorithms that address both challenges. They take as input the functional models of performance and energy consumption against problem size and output workload distributions, which are globally optimal solutions. They have a low time complexity of O(m*m*p) thereby providing a linear speedup of O(p) and low memory complexity of O(n) where n is the workload size expressed as a multiple of granularity. They employ a dynamic programming approach, which also facilitates the easier integration of performance and energy models of communications. We experimentally study the practical cost of application of our algorithms in two data-parallel applications, matrix multiplication and fast Fourier transform, on a cluster in the Grid'5000 platform. We demonstrate that their practical runtime and memory costs are low making them ideal for employment in self-adaptable applications. We also show that the parallel algorithms exhibit tremendous speedups over the sequential algorithms. Finally, using theoretical analysis for a forecast exascale platfor

Keywords: Data parallelism; data partitioning; energy; energy optimization; homogeneous multicore CPU clusters; load balancing; parallel algorithms; performance; performance optimization
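
To make the idea of partitioning a workload from discrete per-processor time functions concrete, here is a textbook dynamic-programming sketch in Python. It is a generic formulation that minimises the slowest processor's time and runs in O(p * n^2); it is not the algorithm from the paper and does not reproduce its complexity bounds or its energy objective. The function name `partition_min_time` and the table layout `times[i][x]` are assumptions for this example.

```python
def partition_min_time(times, n):
    """times[i][x] = time for processor i to process x units (x = 0..n).

    Returns (makespan, distribution) minimising the slowest processor's time.
    """
    p = len(times)
    inf = float("inf")
    # best[w] = minimal makespan when the processors seen so far share w units.
    best = [times[0][w] for w in range(n + 1)]
    choice = [list(range(n + 1))]  # choice[i][w] = units given to processor i
    for i in range(1, p):
        new_best, pick = [inf] * (n + 1), [0] * (n + 1)
        for w in range(n + 1):
            for x in range(w + 1):  # x units go to processor i
                cost = max(best[w - x], times[i][x])
                if cost < new_best[w]:
                    new_best[w], pick[w] = cost, x
        best = new_best
        choice.append(pick)
    # Walk the recorded choices backwards to recover the distribution.
    shares, w = [0] * p, n
    for i in range(p - 1, -1, -1):
        shares[i] = choice[i][w]
        w -= shares[i]
    return best[n], shares
```

Because the time functions are plain lookup tables, they may be arbitrarily non-linear, which is the point of working from measured functional models rather than constant speeds.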

Parallel Algorithm for Non-Monotone DR-Submodular Maximization

International Conference on Machine Learning (ICML)

Authors: Ene, Alina; Nguyen, Huy L. Affiliations: Boston Univ, Dept Comp Sci, 111 Cummington St, Boston MA 02215 USA; Northeastern Univ, Khoury Coll Comp & Informat Sci, Boston MA 02115 USA

ISBN: (print) 9781713821120

In this work, we give a new parallel algorithm for the problem of maximizing a non-monotone diminishing returns submodular function subject to a cardinality constraint. For any desired accuracy epsilon, our algorithm achieves a 1/e - epsilon approximation using O(log n log(1/epsilon)/epsilon^3) parallel rounds of function evaluations. The approximation guarantee nearly matches the best approximation guarantee known for the problem in the sequential setting and the number of parallel rounds is nearly-optimal for any constant epsilon. Previous algorithms achieve worse approximation guarantees using Omega(log^2 n) parallel rounds. Our experimental evaluation suggests that our algorithm obtains solutions whose objective value nearly matches the value obtained by the state of the art sequential algorithms, and it outperforms previous parallel algorithms in number of parallel rounds, iterations, and solution quality.

Keywords: parallel algorithms

Wake up and join me! An energy-efficient algorithm for maximal matching in radio networks

35th International Symposium on Distributed Computing, DISC 2021

Authors: Dani, Varsha; Gupta, Aayush; Hayes, Thomas P.; Pettie, Seth. Affiliations: Department of Computer Science, Rochester Institute of Technology, NY, United States; Department of Computer Science, University of New Mexico, Albuquerque NM, United States; Computer Science and Engineering, University of Michigan, Ann Arbor MI, United States

ISBN: (print) 9783959772105

We consider networks of small, autonomous devices that communicate with each other wirelessly. Minimizing energy usage is an important consideration in designing algorithms for such networks, as battery life is a crucial and limited resource. Working in a model where both sending and listening for messages deplete energy, we consider the problem of finding a maximal matching of the nodes in a radio network of arbitrary and unknown topology. We present a distributed randomized algorithm that produces, with high probability, a maximal matching. The maximum energy cost per node is O(log^2 n), and the time complexity is O(∆ log n). Here n is any upper bound on the number of nodes, and ∆ is any upper bound on the maximum degree; n and ∆ are parameters of our algorithm that we assume are known a priori to all the processors. We note that there exist families of graphs for which our bounds on energy cost and time complexity are simultaneously optimal up to polylog factors, so any significant improvement would need additional assumptions about the network topology. We also consider the related problem of assigning, for each node in the network, a neighbor to back up its data in case of eventual node failure. Here, a key goal is to minimize the maximum load, defined as the number of nodes assigned to a single node. We present an efficient decentralized low-energy algorithm that finds a neighbor assignment whose maximum load is at most a polylog(n) factor larger than the optimum. © Varsha Dani, Aayush Gupta, Thomas P. Hayes, and Seth Pettie; licensed under Creative Commons License CC-BY 4.0

Keywords: parallel algorithms

Distributed-Memory Parallel Symmetric Nonnegative Matrix Factorization

International Conference on High Performance Computing, Networking, Storage and Analysis (SC)

Authors: Eswar, Srinivas; Hayashi, Koby; Ballard, Grey; Kannan, Ramakrishnan; Vuduc, Richard; Park, Haesun. Affiliations: Georgia Inst Technol, Dept Computat Sci & Engn, Atlanta GA 30332 USA; Wake Forest Univ, Dept Comp Sci, Winston Salem NC 27101 USA; Oak Ridge Natl Lab, Computat Data Analyt Grp, Oak Ridge TN USA

ISBN: (print) 9781728199986

We develop the first distributed-memory parallel implementation of Symmetric Nonnegative Matrix Factorization (SymNMF), a key data analytics kernel for clustering and dimensionality reduction. Our implementation includes two different algorithms for SymNMF, which give comparable results in terms of time and accuracy. The first algorithm is a parallelization of an existing sequential approach that uses solvers for nonsymmetric NMF. The second algorithm is a novel approach based on the Gauss-Newton method. It exploits second-order information without incurring large computational and memory costs. We evaluate the scalability of our algorithms on the Summit system at Oak Ridge National Laboratory, scaling up to 128 nodes (4,096 cores) with 70% efficiency. Additionally, we demonstrate our software on an image segmentation task.

Keywords: High performance computing; Newton method; parallel algorithms; Symmetric Matrices

Parallel two-stage algorithms for solving the PageRank problem

ADVANCES IN ENGINEERING SOFTWARE, 2018, Vol. 125, pp. 188-199

Authors: Migallon, Hector; Migallon, Violeta; Penades, Jose. Affiliations: Univ Miguel Hernandez, Dept Phys & Comp Architectures, E-03202 Elche, Alicante, Spain; Univ Alicante, Dept Comp Sci & Artificial Intelligence, E-03071 Alicante, Spain

In this work we present parallel algorithms based on the use of two-stage methods for solving the PageRank problem as a linear system. Different parallel versions of these methods are explored and their convergence properties are analyzed. The parallel implementation has been developed using a mixed MPI/OpenMP model to exploit parallelism beyond a single level. In order to investigate and analyze the proposed parallel algorithms, we have used several realistic large datasets. The numerical results show that the proposed algorithms can speed up the time to converge with respect to the parallel Power algorithm and behave better than other well-known techniques.

Keywords: PageRank; parallel algorithms; Two-stage methods; Shared memory; Distributed memory
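
The abstract above compares against the Power algorithm. As a point of reference, a minimal sequential power iteration for PageRank might look like the Python sketch below. It shows the baseline only, not the two-stage methods or the MPI/OpenMP implementation; the function name `pagerank_power`, the damping value 0.85 and the uniform treatment of dangling pages are assumptions commonly made in introductory presentations.

```python
def pagerank_power(links, damping=0.85, tol=1e-10, max_iter=1000):
    """links maps each page to the list of pages it links to."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {page: base for page in pages}
        for page, outs in links.items():
            for q in outs:
                new_rank[q] += damping * rank[page] / len(outs)
        change = sum(abs(new_rank[page] - rank[page]) for page in pages)
        rank = new_rank
        if change < tol:  # L1 change between successive iterates
            break
    return rank
```

Each sweep is a sparse matrix-vector product, which is the part that parallel formulations distribute across processes.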

A Lower Bound for Parallel Submodular Minimization

52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC)

Authors: Balkanski, Eric; Singer, Yaron. Affiliation: Harvard Univ, Cambridge MA 02138 USA

ISBN: (print) 9781450369794

In this paper, we study submodular function minimization in the adaptive complexity model. Seminal work by Grötschel, Lovász, and Schrijver shows that with oracle access to a function f, the problem of submodular minimization can be solved exactly with poly(n) queries to f. A long line of work has since then been dedicated to the acceleration of submodular minimization. In particular, recent work obtains a (strongly) polynomial time algorithm with Õ(n^3) query complexity. A natural way to accelerate computation is via parallelization, though very little is known about the extent to which submodular minimization can be parallelized. A natural measure for the parallel runtime of a black-box optimization algorithm is its adaptivity, as recently introduced in the context of submodular maximization. Informally, the adaptivity of an algorithm is the number of sequential rounds it makes when each round can execute polynomially-many function evaluations in parallel. In the past two years there have been breakthroughs in the study of adaptivity for both submodular maximization and convex minimization; in particular, an exponential improvement in the parallel running time of submodular maximization was obtained with an O(log n)-adaptive algorithm. Whether submodular minimization can enjoy, thanks to parallelization, the same dramatic speedups as submodular maximization is unknown. To date, we do not know of any polynomial time algorithm for solving submodular minimization whose adaptivity is subquadratic in n. We initiate the study of the adaptivity of submodular function minimization by giving the first non-trivial lower bound for the parallel runtime of submodular minimization. We show that there is no o(log n/log log n)-adaptive algorithm with poly(n) queries which solves the problem of submodular minimization. This is the first adaptivity lower bound for unconstrained submodular optimization (whether for maximization or minimization) and the analysis relies on

Keywords: Adaptivity; submodular minimization; parallel algorithms
