Search Results - Inner Mongolia University Library

Coupling parallel adaptive mesh refinement with a nonoverlapping domain decomposition solver

ADVANCES IN ENGINEERING SOFTWARE, 2017, vol. 110, pp. 34-54

Authors: Kus, Pavel; Sistek, Jakub. Affiliations: Czech Acad Sci Inst Math Zitna 25 Prague 11567 Czech Republic Max Planck Inst Max Planck Comp & Data Facil Giessenbachstr 2 D-85748 Garching Germany Univ Manchester Sch Math Manchester M13 9PL Lancs England

We study the effect of adaptive mesh refinement on a parallel domain decomposition solver of a linear system of algebraic equations. These concepts need to be combined within a parallel adaptive finite element software. A prototype implementation is presented for this purpose. It uses adaptive mesh refinement with one level of hanging nodes. Two- and three-level versions of the Balancing Domain Decomposition based on Constraints (BDDC) method are used to solve the arising system of algebraic equations. The basic concepts are recalled and components necessary for the combination are studied in detail. Of particular interest is the effect of disconnected subdomains, a typical output of the employed mesh partitioning based on space-filling curves, on the convergence and solution time of the BDDC method. It is demonstrated using a large set of experiments that while both refined meshes and disconnected subdomains have a negative effect on the convergence of BDDC, the number of iterations remains acceptable. In addition, scalability of the three-level BDDC solver remains good on up to a few thousand processor cores. The largest presented problem using adaptive mesh refinement has over 10^9 unknowns and is solved on 2048 cores. (C) 2017 Elsevier Ltd. All rights reserved.

Keywords: Adaptive mesh refinement; Parallel algorithms; Domain decomposition; BDDC; AMR
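
The abstract above attributes disconnected subdomains to mesh partitioning along a space-filling curve. The following is a minimal sketch of why that can happen, assuming a Z-order (Morton) curve on a uniform 4 x 4 grid of cells. The curve type, the grid, and every function name are illustrative assumptions and are not taken from the paper.

```python
def morton_index(x, y, bits):
    """Interleave the bits of (x, y) to get the cell's position on a Z-order curve."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)
        code |= ((y >> b) & 1) << (2 * b + 1)
    return code


def partition_by_curve(n, num_parts):
    """Cut an n x n grid (n a power of two) into contiguous chunks along the curve."""
    bits = n.bit_length() - 1
    cells = sorted(((x, y) for x in range(n) for y in range(n)),
                   key=lambda c: morton_index(c[0], c[1], bits))
    chunk = -(-len(cells) // num_parts)  # ceiling division
    return [cells[i:i + chunk] for i in range(0, len(cells), chunk)]


def count_pieces(part):
    """Count the face-connected pieces in a set of grid cells."""
    remaining = set(part)
    pieces = 0
    while remaining:
        stack = [remaining.pop()]
        pieces += 1
        while stack:
            x, y = stack.pop()
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    stack.append(nb)
    return pieces


parts = partition_by_curve(4, 3)
pieces_per_part = [count_pieces(p) for p in parts]
```

Under these assumptions `pieces_per_part` is `[1, 2, 1]`: the middle chunk is contiguous along the curve but consists of two cell groups that share no face, which is the kind of disconnected subdomain the abstract refers to.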

Efficient parallel optimization of volume meshes on heterogeneous computing systems

ENGINEERING WITH COMPUTERS, 2017, vol. 33, no. 4, pp. 717-726

Authors: Cheng, Zuofu; Shaffer, Eric; Yeh, Raine; Zagaris, George; Olson, Luke. Affiliations: Univ Illinois Urbana IL 61801 USA Purdue Univ W Lafayette IN 47907 USA Univ Illinois Kitware Inc Urbana IL 61801 USA

We describe a parallel algorithmic framework for optimizing the shape of elements in a simplicial volume mesh. Using fine-grained parallelism and asymmetric multiprocessing on multi-core CPU and modern graphics processing unit hardware simultaneously, we achieve speedups of more than tenfold over current state-of-the-art serial methods. In addition, improved mesh quality is obtained by optimizing both the surface and the interior vertex positions in a single pass, using feature preservation to maintain fidelity to the original mesh geometry. The framework is flexible in terms of the core numerical optimization method employed, and we provide performance results for both gradient-based and derivative-free optimization methods.

Keywords: Mesh optimization; Parallel algorithms; GPU applications

A parallel algorithm for the parametric synthesis of a system for the angular stabilization of a rotating elastic beam under the action of longitudinal acceleration

JOURNAL OF COMPUTER AND SYSTEMS SCIENCES INTERNATIONAL, 2017, vol. 56, no. 2, pp. 192-207

Authors: Andreichenko, D. K.; Andreichenko, K. P.; Kononov, V. V. Affiliations: Saratov NG Chernyshevskii State Univ Saratov 410012 Russia Saratov State Tech Univ Saratov 410054 Russia

A parallel algorithm for the parametric synthesis of a family of controlled linearized combined dynamic systems in which the configuration of the stability regions in the feedback parameter space depends on some slowly varying design parameters is proposed. The parametric synthesis of a system for the angular stabilization of an elastic beam rotating about a longitudinal axis under the action of a longitudinal acceleration is implemented. This kind of synthesis can substantially reduce the errors of the stabilization system and the typical regulation time in the entire range of the slow increase in the rotation velocity of the beam.

Keywords: parallel algorithms

Distributed memory parallel approaches for HEVC encoder

JOURNAL OF SUPERCOMPUTING, 2017, vol. 73, no. 1, pp. 164-175

Authors: Migallon, H.; Galiano, V.; Pinol, P.; Lopez-Granado, O.; Malumbres, M. P. Affiliations: Miguel Hernandez Univ Phys & Comp Architecture Dept Elche 03202 Spain

The HEVC video coding standard, launched in 2013, can on average halve the bit stream size produced by the H.264/AVC encoder at the same video quality, but it requires nearly 70% more time than H.264/AVC to encode a video sequence. In this paper we propose several parallelization approaches to the HEVC encoder. Our proposals, for distributed memory platforms, work at a coarse-grain level of parallelization, with one group of pictures (GOP) as the basic structure. These approaches encode several GOPs simultaneously. To obtain good parallel performance, a suitable GOP conformation and distribution should be applied.

Keywords: Parallel algorithms; Video coding; HEVC; Performance; Distributed memory; Message passing
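
The GOP-level parallelism described in the abstract above can be pictured as cutting the frame sequence into GOPs and handing whole GOPs to processes. The sketch below assumes fixed-size GOPs and a static round-robin assignment; both choices, and the function names, are illustrative assumptions rather than the distribution schemes evaluated in the paper.

```python
def split_into_gops(num_frames, gop_size):
    """Cut frame indices 0..num_frames-1 into consecutive GOPs of at most gop_size frames."""
    return [range(start, min(start + gop_size, num_frames))
            for start in range(0, num_frames, gop_size)]


def assign_round_robin(gops, num_procs):
    """Give GOP i to process i mod num_procs (hypothetical static schedule)."""
    schedule = [[] for _ in range(num_procs)]
    for i, gop in enumerate(gops):
        schedule[i % num_procs].append(gop)
    return schedule


gops = split_into_gops(num_frames=20, gop_size=8)
schedule = assign_round_robin(gops, num_procs=2)
```

With 20 frames, a GOP size of 8 and 2 processes, process 0 receives frames 0-7 and 16-19 and process 1 receives frames 8-15, so the shorter final GOP leaves the load uneven. That imbalance is one reason the choice of GOP size and distribution matters.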

An Adaptive Parallel Algorithm for Computing Connected Components

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, vol. 28, no. 9, pp. 2428-2439

Authors: Jain, Chirag; Flick, Patrick; Pan, Tony; Green, Oded; Aluru, Srinivas. Affiliations: Georgia Inst Technol Atlanta GA 30332 USA

We present an efficient distributed memory parallel algorithm for computing connected components in undirected graphs based on Shiloach-Vishkin's PRAM approach. We discuss multiple optimization techniques that reduce communication volume as well as load-balance the algorithm. We also note that the efficiency of the parallel graph connectivity algorithm depends on the underlying graph topology. Particularly for short diameter graph components, we observe that the parallel Breadth First Search (BFS) method offers better performance. However, running parallel BFS is not efficient for computing large diameter components or a large number of small components. To address this challenge, we employ a heuristic that allows the algorithm to quickly predict the type of the network by computing the degree distribution and follow the optimal hybrid route. Using large graphs with diverse topologies from domains including metagenomics, web crawl, social graph and road networks, we show that our hybrid implementation is efficient and scalable for each of the graph types. Our approach achieves a runtime of 215 seconds using 32 K cores of Cray XC30 for a metagenomic graph with over 50 billion edges. When compared against the previous state-of-the-art method, we see performance improvements up to 24x.

Keywords: parallel algorithms; distributed memory; breadth first search; undirected graphs
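
The abstract above builds on the Shiloach-Vishkin idea of alternating "hooking" (attaching one tree root to another across an edge) with pointer jumping (shortening parent chains). The following is a serial sketch of that idea only. It is not the distributed algorithm from the paper, and the function name and the smaller-id-wins hooking rule are assumptions made for illustration.

```python
def connected_components(num_vertices, edges):
    """Label each vertex with the smallest vertex id in its component."""
    parent = list(range(num_vertices))
    changed = True
    while changed:
        changed = False
        # Hooking: across each edge, attach the larger root to the smaller label.
        for u, v in edges:
            ru, rv = parent[u], parent[v]
            if ru == rv:
                continue
            hi, lo = max(ru, rv), min(ru, rv)
            if parent[hi] == hi:  # only hook vertices that are still roots
                parent[hi] = lo
                changed = True
        # Pointer jumping: make every vertex point directly at its root.
        for v in range(num_vertices):
            while parent[v] != parent[parent[v]]:
                parent[v] = parent[parent[v]]
    return parent


labels = connected_components(6, [(0, 1), (1, 2), (3, 4)])
```

Here `labels` is `[0, 0, 0, 3, 3, 5]`: vertices 0-2 form one component, 3-4 another, and vertex 5 is isolated. Parent pointers always lead to smaller ids, so no cycles can form and the loop stops once every edge joins two vertices with the same label.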

Performance analysis of frame partitioning in parallel HEVC encoders

JOURNAL OF SUPERCOMPUTING, 2017, vol. 73, no. 1, pp. 543-556

Authors: Migallon, H.; Pinol, P.; Lopez-Granado, O.; Galiano, V.; Malumbres, M. P. Affiliations: Miguel Hernandez Univ Phys & Comp Architecture Dept Elche 03202 Spain

The new video coding standard HEVC includes two concepts that allow a frame to be partitioned into regions that can be independently encoded and decoded. These two concepts are named "Tiles" and "Slices". In this paper, we present and analyze optimized parallel versions of the HEVC encoder based on tile and slice partitioning. We have evaluated the benefits and drawbacks of both approaches in terms of computational times and rate distortion performance. The results show that both approaches obtain good speed-ups, with the tile-based parallel version obtaining the best trade-off between speed-up achieved (up to 9.3) and rate distortion performance loss (1.6% BD rate for AI mode and 2.2% for LB mode on average).

Keywords: Tiles; HEVC; Video coding; Parallel algorithms; Multicore; Performance

Domain decomposition approach for parallel improvement of tetrahedral meshes

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2017, vol. 107, pp. 101-113

Authors: Chen, Jianjun; Zhao, Dawei; Zheng, Yao; Xu, Yan; Li, Chenfeng; Zheng, Jianjing. Affiliations: Zhejiang Univ Ctr Engn & Sci Computat Hangzhou 310027 Zhejiang Peoples R China Zhejiang Univ Sch Aeronaut & Astronaut Hangzhou 310027 Zhejiang Peoples R China Swansea Univ Zienkiewicz Ctr Computat Engn Swansea SA2 8PP W Glam Wales

Presently, a tetrahedral mesher based on the Delaunay triangulation approach may outperform a tetrahedral improver based on local smoothing and flip operations by nearly one order of magnitude in terms of computing time. Parallelization is a feasible way to speed up the improver and enable it to handle large-scale meshes. In this study, a novel domain decomposition approach is proposed for parallel mesh improvement. It analyses the dual graph of the input mesh to build an inter-domain boundary that avoids small dihedral angles and poorly shaped faces. Consequently, the parallel improver can fit this boundary without compromising the mesh quality. Meanwhile, the new method does not involve any inter-processor communications and therefore runs very efficiently. A parallel pre-processing pipeline that combines the proposed improver and existing parallel surface and volume meshers can prepare a quality mesh containing hundreds of millions of elements in minutes. Experiments are presented to show that the developed system is robust and applicable to models of the level of complexity encountered in industry. (C) 2017 Elsevier Inc. All rights reserved.

Keywords: Parallel algorithms; Mesh generation; Quality improvement; Domain decomposition; Dual graph
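
The abstract above relies on the dual graph of a tetrahedral mesh, in which each tetrahedron is a node and two nodes are joined when their tetrahedra share a triangular face. The following sketch shows one way such a graph could be built, assuming tetrahedra are given as 4-tuples of vertex ids. The function name and data layout are hypothetical, and the paper's boundary-construction step is not reproduced.

```python
from collections import defaultdict
from itertools import combinations


def dual_graph(tets):
    """Map each tetrahedron index to the set of tetrahedra sharing a face with it."""
    face_to_tets = defaultdict(list)
    for tet_id, tet in enumerate(tets):
        # A tetrahedron has four triangular faces: every 3-subset of its vertices.
        for face in combinations(sorted(tet), 3):
            face_to_tets[face].append(tet_id)

    adjacency = {tet_id: set() for tet_id in range(len(tets))}
    for owners in face_to_tets.values():
        if len(owners) == 2:  # interior face shared by exactly two tetrahedra
            a, b = owners
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


graph = dual_graph([(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5)])
```

In this three-element example, tetrahedra 0 and 1 share face (1, 2, 3) and tetrahedra 1 and 2 share face (2, 3, 4), while 0 and 2 share only an edge, so `graph` is `{0: {1}, 1: {0, 2}, 2: {1}}`.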

Computing Maximum Cardinality Matchings in Parallel on Bipartite Graphs via Tree-Grafting

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, vol. 28, no. 1, pp. 44-59

Authors: Azad, Ariful; Buluc, Aydin; Pothen, Alex. Affiliations: Lawrence Berkeley Natl Lab Computat Res Div Berkeley CA 94720 USA Purdue Univ Dept Comp Sci W Lafayette IN 47907 USA

It is difficult to obtain high performance when computing matchings on parallel processors because matching algorithms explicitly or implicitly search for paths in the graph, and when these paths become long, there is little concurrency. In spite of this limitation, we present a new algorithm and its shared-memory parallelization that achieves good performance and scalability in computing maximum cardinality matchings in bipartite graphs. Our algorithm searches for augmenting paths via specialized breadth-first searches (BFS) from multiple source vertices, hence creating more parallelism than single source algorithms. Algorithms that employ multiple-source searches cannot discard a search tree once no augmenting path is discovered from the tree, unlike algorithms that rely on single-source searches. We describe a novel tree-grafting method that eliminates most of the redundant edge traversals resulting from this property of multiple-source searches. We also employ the recent direction-optimizing BFS algorithm as a subroutine to discover augmenting paths faster. Our algorithm compares favorably with the current best algorithms in terms of the number of edges traversed, the average augmenting path length, and the number of iterations. We provide a proof of correctness for our algorithm. Our NUMA-aware implementation is scalable to 80 threads of an Intel multiprocessor and to 240 threads on an Intel Knights Corner coprocessor. On average, our parallel algorithm runs an order of magnitude faster than the fastest algorithms available. The performance improvement is more significant on graphs with small matching number.

Keywords: Cardinality matching; bipartite graph; tree grafting; parallel algorithms
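
For orientation, the single-source approach that the abstract above contrasts itself with can be sketched as follows: run one BFS over alternating paths from each unmatched left vertex and flip the path when a free right vertex is reached. This is a plain serial baseline written for illustration. It is not the multi-source tree-grafting algorithm of the paper, and the function name and the adjacency-list input format are assumptions.

```python
from collections import deque


def max_bipartite_matching(adj, num_right):
    """adj[u] lists the right-side neighbours of left vertex u."""
    match_left = [-1] * len(adj)
    match_right = [-1] * num_right
    for root in range(len(adj)):
        # parent[w] = (u, v): left vertex w was reached from u through its mate v.
        parent = {root: None}
        visited_right = set()
        queue = deque([root])
        end = None
        while queue and end is None:
            u = queue.popleft()
            for v in adj[u]:
                if v in visited_right:
                    continue
                visited_right.add(v)
                w = match_right[v]
                if w == -1:  # free right vertex: an augmenting path ends here
                    end = (u, v)
                    break
                parent[w] = (u, v)
                queue.append(w)
        if end is None:
            continue
        # Flip matched and unmatched edges along the path back to the root.
        u, v = end
        while True:
            step = parent[u]
            match_left[u] = v
            match_right[v] = u
            if step is None:
                break
            u, v = step
    return match_left


matching = max_bipartite_matching([[0, 1], [0], [1, 2]], num_right=3)
```

For this input `matching` is `[1, 0, 2]`: the search from left vertex 1 finds right vertex 0 already taken, follows the alternating path through left vertex 0, and rematches it to right vertex 1.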

Concurrent computation of topological watershed on shared memory parallel machines

PARALLEL COMPUTING, 2017, vol. 69, pp. 78-97

Authors: Mahmoudi, Ramzi; Akil, Mohamed; Bedoui, Mohamed Hedi. Affiliations: Univ Paris Est Lab Informat Gaspard Monge Equipe A3SI ESIEE Paris Cite Descartes BP99 F-93162 Noisy Le Grand France Univ Monastir Lab Technol & Imagerie Med Fac Med Monastir Rue Ibn Sina Monastir 5019 Tunisia

The watershed transform is considered the most appropriate method for image segmentation in the field of mathematical morphology. In this paper, we present an adapted topological watershed algorithm suited for a rapid and effective implementation on a Shared Memory Parallel Machine (SMPM). The introduced algorithm allows parallel watershed computation while preserving the given topology. Neither prior minima extraction nor any sorting step or hierarchical queue is needed. The strategy that guides the parallel watershed computation, labeled SDM-Strategy (equivalent to Split-Distributes and Merge), is also presented. Experimental analyses such as execution time, performance enhancement, cache consumption, efficiency and scalability are also presented and discussed. (C) 2017 Elsevier B.V. All rights reserved.

Keywords: Watershed transform; Parallel computing; Image processing; Parallelization strategy; Computing methodologies; Enhancement; Parallel algorithms; Mathematical morphology

Distributed coverage hole detection algorithm based on Čech complex

12th International Conference on Communications and Networking in China, CHINACOM 2017

Authors: Yuchen, Wang; Jialiang, Lu; Martins, Philippe. Affiliations: SJTU-ParisTech Elite Institute of Technology Shanghai China Institut Telecom TELECOM ParisTech LTCI Paris France

ISBN: (print) 9783319781389

The coverage problem is essential to energy-efficient deployment and monitoring in Wireless Sensor Networks (WSNs). In this paper, we propose a distributed Čech complex algorithm for coverage hole detection in WSNs. Based on our algorithm, each node uses only local information to build a Čech sub-complex. Simulations on randomly deployed nodes show that the algorithm achieves a comparable accuracy and a much lower communication cost than a centralized Čech complex construction. Furthermore, it can be combined with a distributed Rips complex algorithm to gain an even better performance. © ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2018.

Keywords: parallel algorithms
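
The difference between the Čech and Rips complexes mentioned in the abstract above can be seen on a single triangle of sensors. Three nodes with sensing radius r span a Čech 2-simplex when their three discs have a common point, which is equivalent to the smallest circle enclosing the three centres having radius at most r. They span a Rips 2-simplex when every pair is within distance 2r. The sketch below tests both conditions in the plane. It illustrates the definitions only, not the distributed algorithm of the paper, and all names are hypothetical.

```python
import math


def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def min_enclosing_radius(a, b, c):
    """Radius of the smallest circle containing three points in the plane."""
    d2 = sorted([_dist(a, b) ** 2, _dist(b, c) ** 2, _dist(a, c) ** 2])
    if d2[2] >= d2[0] + d2[1]:
        # Right, obtuse or degenerate triangle: the longest side is a diameter.
        return math.sqrt(d2[2]) / 2
    # Acute triangle: circumradius = (product of sides) / (4 * area).
    area = abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2
    return math.sqrt(d2[0] * d2[1] * d2[2]) / (4 * area)


def in_cech(a, b, c, r):
    """True if the three discs of radius r share a common point."""
    return min_enclosing_radius(a, b, c) <= r


def in_rips(a, b, c, r):
    """True if every pair of discs of radius r overlaps."""
    return all(_dist(p, q) <= 2 * r for p, q in ((a, b), (b, c), (a, c)))


# Equilateral triangle with side 1.9 and sensing radius 1.
side = 1.9
triangle = [(0.0, 0.0), (side, 0.0), (side / 2, side * math.sqrt(3) / 2)]
rips_filled = in_rips(*triangle, r=1.0)
cech_filled = in_cech(*triangle, r=1.0)
```

For this triangle the Rips test passes (each side is 1.9, below 2r = 2) but the Čech test fails (the circumradius 1.9/√3 is about 1.097, above r = 1). The centre of the triangle is therefore uncovered even though the Rips complex fills it in, which is why a Čech-based test can detect coverage holes that a Rips-based test misses.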
