Search Results - Inner Mongolia University Library

52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC)

Authors: Andoni, Alexandr; Stein, Clifford; Zhong, Peilin (Columbia Univ, New York, NY 10027, USA)

ISBN: (print) 9781450369794

We present a $(1+\epsilon)$-approximate parallel algorithm for computing shortest paths in undirected graphs, achieving $\mathrm{poly}(\log n)$ depth and $m \cdot \mathrm{poly}(\log n)$ work for graphs with $n$ nodes and $m$ edges. Although sequential algorithms with (nearly) optimal running time have been known for several decades, near-optimal parallel algorithms have turned out to be a much tougher challenge. For $(1+\epsilon)$-approximation, all prior algorithms with $\mathrm{poly}(\log n)$ depth perform at least $\Omega(mn^c)$ work for some constant $c > 0$. Improving this long-standing upper bound obtained by Cohen (STOC'94) has been open for 25 years. We develop several new tools of independent interest. One of them is a new notion beyond hopsets, the low hop emulator: a $\mathrm{poly}(\log n)$-approximate emulator graph in which every shortest path has at most $O(\log\log n)$ hops (edges). Direct applications of low hop emulators are parallel algorithms for $\mathrm{poly}(\log n)$-approximate single source shortest path (SSSP), Bourgain's embedding, metric tree embedding, and low diameter decomposition, all with $\mathrm{poly}(\log n)$ depth and $m \cdot \mathrm{poly}(\log n)$ work. To boost the approximation ratio to $(1+\epsilon)$, we introduce compressible preconditioners and apply them inside Sherman's framework (SODA'17) to solve the more general problem of uncapacitated minimum cost flow (a.k.a. the transshipment problem). Our algorithm computes a $(1+\epsilon)$-approximate uncapacitated minimum cost flow in $\mathrm{poly}(\log n)$ depth using $m \cdot \mathrm{poly}(\log n)$ work. As a consequence, it also improves the state-of-the-art sequential running time from $m \cdot 2^{O(\sqrt{\log n})}$ to $m \cdot \mathrm{poly}(\log n)$.

Keywords: parallel algorithms; shortest paths; minimum cost flow; low hop emulators
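To see why the hop count in the abstract above matters, here is a minimal sketch of hop-limited Bellman-Ford, a standard technique and not the paper's algorithm. Each round relaxes every edge against the previous round's distances, so a round is a natural parallel step and the number of rounds equals the hop bound. The function name `hop_limited_distances` and the edge-list format are assumptions made for illustration.

```python
import math

def hop_limited_distances(n, edges, source, max_hops):
    """Shortest distances from `source` over paths with at most `max_hops` edges.

    `edges` is a list of (u, v, weight) triples of an undirected graph with
    non-negative weights. Every round reads only the previous round's
    distances, so all relaxations inside a round are independent.
    """
    dist = [math.inf] * n
    dist[source] = 0.0
    for _ in range(max_hops):
        new_dist = dist[:]
        for u, v, w in edges:
            # Relax the edge in both directions (undirected graph).
            if dist[u] + w < new_dist[v]:
                new_dist[v] = dist[u] + w
            if dist[v] + w < new_dist[u]:
                new_dist[u] = dist[v] + w
        if new_dist == dist:  # nothing changed, further rounds are no-ops
            break
        dist = new_dist
    return dist

# Path 0-1-2-3 with unit weights plus a heavy direct edge 0-3.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 5.0)]
print(hop_limited_distances(4, edges, 0, 1))  # [0.0, 1.0, inf, 5.0]
print(hop_limited_distances(4, edges, 0, 3))  # [0.0, 1.0, 2.0, 3.0]
```

With one hop, vertex 3 is reachable only through the heavy edge; three rounds are needed before the true distance appears. A graph in which every shortest path uses few hops therefore needs few such rounds.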

Parallel Peeling Algorithms

26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)

Authors: Jiang, Jiayang; Mitzenmacher, Michael; Thaler, Justin (Harvard Univ, Sch Engn & Appl Sci, Cambridge, MA 02138, USA; Univ Calif Berkeley, Simons Inst Theory Comp, Berkeley, CA, USA)

ISBN: (print) 9781450328210

The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than $k$ are removed until there are no vertices of degree less than $k$ left. The remaining hypergraph is known as the $k$-core. In this paper, we analyze parallel peeling processes, where in each round, all vertices of degree less than $k$ are removed. It is known that, below a specific edge density threshold, the $k$-core is empty with high probability. We show that, with high probability, below this threshold, only $\frac{1}{\log((k-1)(r-1))}\log\log n + O(1)$ rounds of peeling are needed to obtain the empty $k$-core for $r$-uniform hypergraphs. Interestingly, we show that above this threshold, $\Omega(\log n)$ rounds of peeling are required to find the non-empty $k$-core. Since most algorithms and data structures aim to peel to an empty $k$-core, this asymmetry appears fortunate. We verify the theoretical results both with simulation and with a parallel implementation using graphics processing units (GPUs). Our implementation provides insights into how to structure parallel peeling algorithms for efficiency in practice.

Keywords: parallel algorithms; peeling algorithms; GPU implementations; invertible Bloom lookup tables; random hypergraphs
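The round-synchronous peeling process in the abstract above can be sketched as follows. This is a minimal serial simulation, assuming the hypergraph is given as a list of vertex tuples; the names `parallel_peel` and `random_hypergraph` are illustrative and do not come from the paper, and nothing here reproduces its GPU implementation.

```python
import random

def parallel_peel(num_vertices, hyperedges, k):
    """Simulate round-synchronous peeling.

    In each round, every vertex whose current degree is below k is removed
    together with all hyperedges containing it. Returns the number of rounds
    that removed something and the hyperedges of the remaining k-core.
    """
    incident = [[] for _ in range(num_vertices)]
    for e, verts in enumerate(hyperedges):
        for v in verts:
            incident[v].append(e)
    alive_vertices = set(range(num_vertices))
    alive_edges = set(range(len(hyperedges)))
    rounds = 0
    while True:
        # Degrees are taken from the state at the start of the round, so the
        # removals within one round do not influence each other.
        to_remove = [v for v in alive_vertices
                     if sum(1 for e in incident[v] if e in alive_edges) < k]
        if not to_remove:
            break
        rounds += 1
        for v in to_remove:
            alive_vertices.discard(v)
            for e in incident[v]:
                alive_edges.discard(e)
    return rounds, [hyperedges[e] for e in sorted(alive_edges)]

def random_hypergraph(num_vertices, num_edges, r, seed=0):
    """Hypothetical helper: r-uniform hypergraph with uniformly random edges."""
    rng = random.Random(seed)
    return [tuple(rng.sample(range(num_vertices), r)) for _ in range(num_edges)]

# A path 0-1-2-3 (r = 2) peels to an empty 2-core in two rounds.
print(parallel_peel(4, [(0, 1), (1, 2), (2, 3)], 2))  # (2, [])
# A triangle with a pendant vertex keeps the triangle as its 2-core.
print(parallel_peel(4, [(0, 1), (1, 2), (0, 2), (0, 3)], 2))
```

Calling `parallel_peel` on the output of `random_hypergraph` at different edge densities is one way to observe the round counts empirically.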

An Efficient and Parallel Electromagnetic Solver for Complex Interconnects in Layered Media

IEEE 29th Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS)

Authors: Marek, Damian; Sharma, Shashwat; Triverio, Piero (Univ Toronto, Edward S Rogers Sr Dept Elect & Comp Engn, Toronto, ON, Canada)

ISBN: (print) 9781728161617

A novel parallel solver based on the adaptive integral method (AIM) is proposed for the electromagnetic analysis of electrical interconnects in layered media. We show that graph partitioning techniques can be used to optimally distribute, across thousands of processes, the computations related to both matrix filling and system solution. The proposed workload distribution strategy is compared to existing techniques through a scalability study on a large realistic interposer model in layered media.

Keywords: surface integral equation method; adaptive integral method; parallel algorithms; skin effect modeling

Parallel Planar Subgraph Isomorphism and Vertex Connectivity

32nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)

Authors: Gianinazzi, Lukas; Hoefler, Torsten (Swiss Fed Inst Technol, Dept Comp Sci, Zurich, Switzerland)

ISBN: (print) 9781450369350

We present the first parallel fixed-parameter algorithm for subgraph isomorphism in planar graphs, bounded-genus graphs, and, more generally, all minor-closed graphs of locally bounded treewidth. Our randomized low depth algorithm has a near-linear work dependency on the size of the target graph. Existing low depth algorithms do not guarantee that the work remains asymptotically the same for any constant-sized pattern. By using a connection to certain separating cycles, our subgraph isomorphism algorithm can decide the vertex connectivity of a planar graph (with high probability) in asymptotically near-linear work and poly-logarithmic depth. Previously, no sub-quadratic work and poly-logarithmic depth bound was known in planar graphs (in particular for distinguishing between four-connected and five-connected planar graphs).

Keywords: graph algorithms; parallel algorithms; subgraph isomorphism; planar graphs; vertex connectivity; parameterized complexity

An Algorithm for the Sequence Alignment with Gap Penalty Problem using Multiway Divide-and-Conquer and Matrix Transposition

INFORMATION PROCESSING LETTERS, 2022, vol. 173

Authors: Shubham; Prakash, Surya; Ganapathi, Pramod (Indian Inst Technol Indore, Discipline Comp Sci & Engn, Indore, India; SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794, USA)

We present a cache-efficient parallel algorithm for the sequence alignment with gap penalty problem for shared-memory machines using multiway divide-and-conquer and not-in-place matrix transposition. Our $r$-way divide-and-conquer algorithm, for a fixed natural number $r \ge 2$, performs $\Theta(n^3)$ work, achieves $\Theta(n^{\log_r(2r-1)})$ span, and incurs $O(n^3/(BM) + (n^2/B)\log\sqrt{M})$ serial cache misses for $n > \gamma M$, and $O((n^2/B)\log(n/\sqrt{M}))$ serial cache misses for $\alpha\sqrt{M} < n \le \gamma M$, where $M$ is the cache size, $B$ is the cache line size, and $\alpha$ and $\gamma$ are constants. Published by Elsevier B.V.

Keywords: Sequence alignment; Parallel algorithms; Multiway divide-and-conquer; Dynamic programming; Cache-efficient
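For context on the $\Theta(n^3)$ work bound in the abstract above, here is a minimal sketch of the textbook serial dynamic program for alignment with a general gap penalty. It is not the paper's cache-efficient divide-and-conquer algorithm. The function name and the `sub_cost` and `gap_cost` parameters are assumptions made for illustration.

```python
def align_with_gap_penalty(x, y, sub_cost, gap_cost):
    """Minimum alignment cost of sequences x and y.

    sub_cost(a, b) is the cost of aligning symbol a with symbol b, and
    gap_cost(length) is the cost of one contiguous gap of that length.
    Each cell scans its whole row and column prefix, which gives cubic
    work when both sequences have length n.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0 and j > 0:
                # Align x[i-1] with y[j-1].
                best = D[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1])
            for q in range(j):
                # A gap covering y[q:j].
                best = min(best, D[i][q] + gap_cost(j - q))
            for p in range(i):
                # A gap covering x[p:i].
                best = min(best, D[p][j] + gap_cost(i - p))
            D[i][j] = best
    return D[n][m]

mismatch = lambda a, b: 0 if a == b else 1
affine = lambda length: 2 + length  # example gap penalty: opening 2, extension 1
print(align_with_gap_penalty("ACGT", "AGT", mismatch, affine))  # 3
```

In the example, the cheapest alignment deletes the single symbol `C` with one gap of length 1, which costs 2 + 1 = 3.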

Parallel Numerical Algorithms for Simulation of Rectangular Waveguides by Using GPU

10th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Ciegis, Raimondas; Bugajev, Andrej; Kancleris, Zilvinas; Slekas, Gediminas (Vilnius Gediminas Tech Univ, LT-10223 Vilnius, Lithuania)

ISBN: (digital) 9783642551956

ISBN: (print) 9783642551956

In this article we consider parallel numerical algorithms for solving a 3D mathematical model that describes wave propagation in a rectangular waveguide. The main goal is to formulate and analyze a minimal algorithmic template for solving this problem on the CUDA platform. This template is based on explicit finite difference schemes obtained by approximating systems of differential equations on a staggered grid. The parallelization of the discrete algorithm is based on the domain decomposition method. A theoretical complexity model is derived and the scalability of the parallel algorithm is investigated. Results of numerical simulations are presented.

Keywords: Parallel algorithms; Numerical simulation; Wave propagation; GPU; CUDA; Scalability analysis

Design of Longitudinal Anti-Disturbance Control System for Aircraft Based on Distributed Parallel Algorithm

5th IEEE International Conference on Advanced Robotics and Mechatronics (ICARM)

Authors: Lang, Pengfei; Liu, Zun; Ge, Meng (China Acad Launch Vehicle Technol, Beijing, Peoples R China; Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China; Beijing Aerosp Inst Metrol & Measurement Technol, Beijing, Peoples R China)

ISBN: (digital) 9781728164793

ISBN: (print) 9781728164793

To address the poor control stability of traditional aircraft control systems, a longitudinal anti-disturbance control system based on a distributed parallel algorithm was designed. The anti-disturbance control system was built on the hardware of the original control system, and its software part was designed as well. A longitudinal model of the aircraft was established, and an active disturbance rejection controller (ADRC) was designed according to this model. Distributed parallel algorithms were used to set the parameters of the ADRC, completing the design of the vertical ADRC system. A comparison experiment with a traditional PD-based aircraft control system shows that the control system based on the distributed parallel algorithm has less overshoot, good stability, and broad application prospects.

Keywords: Control systems; Aerospace control; Aircraft; Parallel algorithms; Atmospheric modeling; Stability analysis; Hardware

Parallel and Scalable Precise Clustering

ACM International Conference on Parallel Architectures and Compilation Techniques (PACT)

Authors: Byma, Stuart; Dhasade, Akash; Altenhoff, Adrian; Dessimoz, Christophe; Larus, James R. (Ecole Polytech Fed Lausanne, Lausanne, Switzerland; IIT Tirupati, Tirupati, Andhra Pradesh, India; Swiss Fed Inst Technol, Zurich, Switzerland; Univ Lausanne, Lausanne, Switzerland)

ISBN: (print) 9781450380751

This paper describes a new technique for parallelizing protein clustering, an important bioinformatics computation for the analysis of protein sequences. Protein clustering identifies groups of proteins that are similar because they share long sequences of similar amino acids. Given a collection of protein sequences, clustering can significantly reduce the computational effort required to identify all similar sequences by avoiding many negative comparisons. The challenge, however, is to build a clustering that misses as few similar sequences (or elements, more generally) as possible. In this paper, we introduce precise clustering, a property that requires each pair of similar elements to appear together in at least one cluster. We show that transitivity in the data can be leveraged to merge clusters while maintaining a precise clustering, providing a basis for independently forming clusters. This allows us to reformulate clustering as a bottom-up merge of independent clusters in a new algorithm called ClusterMerge. ClusterMerge exposes parallelism, enabling fast and scalable implementations. We apply ClusterMerge to find similar amino acid sequences in a collection of proteins. ClusterMerge identifies 99.8% of similar pairs found by a full $O(n^2)$ comparison, with only half as many comparisons. More importantly, ClusterMerge is highly amenable to parallel and distributed computation. Our implementation achieves a speedup of 604 times on 768 cores (1400 times faster than a comparable single-threaded clustering implementation), a strong scaling efficiency of 90%, and a weak scaling efficiency of nearly 100%.

Keywords: bioinformatics; protein clustering; parallel algorithms
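The precise clustering property defined in the abstract above (every pair of similar elements appears together in at least one cluster) can be checked with a short brute-force sketch. This only illustrates the definition and is not the ClusterMerge algorithm; the function name, the `similar` predicate, and the integer toy data are assumptions.

```python
from itertools import combinations

def is_precise_clustering(elements, clusters, similar):
    """Return True if every similar pair shares at least one cluster.

    `clusters` is a list of sets and may overlap. `similar(a, b)` is a
    symmetric predicate. Elements must be sortable so that each pair has
    a single canonical (smaller, larger) form.
    """
    covered = set()
    for cluster in clusters:
        for a, b in combinations(sorted(cluster), 2):
            covered.add((a, b))
    for a, b in combinations(sorted(elements), 2):
        if similar(a, b) and (a, b) not in covered:
            return False
    return True

# Toy data: integers are "similar" when they differ by at most 1.
elements = [1, 2, 3, 10, 11]
close = lambda a, b: abs(a - b) <= 1
print(is_precise_clustering(elements, [{1, 2}, {2, 3}, {10, 11}], close))  # True
print(is_precise_clustering(elements, [{1, 2}, {3}, {10, 11}], close))     # False
```

The second clustering fails because the similar pair (2, 3) never appears in a common cluster. The check itself performs the full pairwise comparison, so it is only suitable for validating small examples.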

Faster Parallel Algorithm for Approximate Shortest Path (2020)

52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC)

Authors: Li, Jason (Carnegie Mellon Univ, Pittsburgh, PA 15213, USA)

ISBN: (print) 9781450369794

We present the first $m \cdot \mathrm{polylog}(n)$ work, $\mathrm{polylog}(n)$ time algorithm in the PRAM model that computes $(1+\epsilon)$-approximate single-source shortest paths on weighted, undirected graphs. This improves upon the breakthrough result of Cohen [JACM'00] that achieves $O(m^{1+\epsilon_0})$ work and $\mathrm{polylog}(n)$ time. While most previous approaches, including Cohen's, leveraged the power of hopsets, our algorithm builds upon the recent developments in continuous optimization, studying the shortest path problem through the lens of the closely related minimum transshipment problem. To obtain our algorithm, we demonstrate a series of near-linear-work, polylogarithmic-time reductions between the problems of approximate shortest path, approximate transshipment, and $\ell_1$-embeddings, and establish a recursive algorithm that cycles through the three problems and reduces the graph size on each cycle. As a consequence, we also obtain faster parallel algorithms for approximate transshipment and $\ell_1$-embeddings with polylogarithmic distortion. The minimum transshipment algorithm in particular improves upon the previous best $m^{1+o(1)}$ work sequential algorithm of Sherman [SODA'17]. To improve readability, the paper is almost entirely self-contained, save for several staple theorems in algorithms and combinatorics.

Keywords: Parallel algorithms; Shortest Path; Minimum Transshipment

GPU Parallel Computation of Morse-Smale Complexes

IEEE Visualization Conference (VIS)

Authors: Subhash, Varshini; Pandey, Karran; Natarajan, Vijay (Indian Inst Sci, Dept Comp Sci & Automat, Bangalore, Karnataka, India)

ISBN: (print) 9781728180144

The Morse-Smale complex is a well-studied topological structure that represents the gradient flow behavior of a scalar function. It supports multi-scale topological analysis and visualization of large scientific data. Its computation poses significant algorithmic challenges when considering large-scale data and increased feature complexity. Several parallel algorithms have been proposed for the fast computation of the 3D Morse-Smale complex. The non-trivial structure of the saddle-saddle connections is not amenable to parallel computation. This paper describes a fine-grained parallel method for computing the Morse-Smale complex that is implemented on a GPU. The saddle-saddle reachability is first determined via a transformation into a sequence of vector operations, followed by the path traversal, which is achieved via a sequence of matrix operations. Computational experiments show that the method achieves up to 7x speedup over current shared memory implementations.

Keywords: Human-centered computing; Visualization; Visualization techniques; Computing methodologies; Parallel computing methodologies; Parallel algorithms; Shared memory algorithms
