检索结果-内蒙古大学图书馆

A robust Delaunay-AFT based parallel method for the generation of large-scale fully constrained meshes

COMPUTERS & STRUCTURES 2020年 228卷 1页

作者： Yu, Fei Zeng, Yan Guan, Z. Q. Lo, S. H. Dalian Univ Technol Dept Engn Mech State Key Lab Struct Anal Ind Equipment Dalian 116024 Peoples R China Univ Hong Kong Dept Civil Engn Pokfulam Rd Hong Kong Peoples R China

Making full use of a sequential Delaunay-AFT mesher, a parallel method for the generation of large-scale tetrahedral meshes on distributed-memory machines is developed. To generate meshes with the required and the preserved properties, a Delaunay-AFT based domain decomposition (DD) technique is employed. Starting from the Delaunay triangulation (DT) covering the problem domain, this technique creates a layer of elements dividing the domain into several zones. The initially coarsely meshed domain is partitioned into DTs of subdomains which can be meshed in parallel. When the size of a subdomain is smaller than a user-specified threshold, it will be meshed with the standard Delaunay-AFT mesher. A two-level DD strategy is designed to improve the parallel efficiency of this algorithm. A dynamic load balancing scheme is also implemented using the Message Passing Interface (MPI). Out-of-core meshing is introduced to accommodate excessive large meshes that cannot be handled by the available memory of the computer (RAM). Numerical tests are performed for various complex geometries with thousands of surface patches. Ultra-large-scale meshes with more than ten billion tetrahedral elements have been created. Moreover, the meshes generated with different numbers of DD operations are nearly identical in quality: showing the consistency and the stability of the automatic decomposition algorithm. (C) 2019 Elsevier Ltd. All rights reserved.

关键词： Finite element mesh generation parallel algorithms Domain decomposition Delaunay triangulations Delaunay-AFT Out-of-core

来源：评论

学校读者我要写书评

暂无评论

Solving Black-Scholes Equation Based on Time Domain Decomposition and Meshless Method

Solving Black-Scholes Equation Based on Time Domain Decompos...

引用

2022 International Joint Conference on Information and Communication Engineering, JCICE 2022

作者： Duan, Yong Zhu, Dongyuan School of Mathematical Sciences University of Electronic Science and Technology Chengdu China

ISBN: (数字)9781665460675

ISBN: (纸本)9781665460675

The work of this paper is to solve the Black-Scholes equation under European options based on the time parallel algorithm combined with the kansa method. Firstly, the partial differential equation of the price of derivative products based on stock price is obtained by using efficient market theory, no-arbitrage principle and ITO theorem. Then, the general heat conduction equation is solved by time domain decomposition coupled meshless method. Finally, through numerical example verify that this computational format has high accuracy and validity. © 2022 IEEE.

关键词： parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

A GENERIC FINITE ELEMENT FRAMEWORK ON parallel TREE-BASED ADAPTIVE MESHES

引用

SIAM JOURNAL ON SCIENTIFIC COMPUTING 2020年第6期42卷 C436-C468页

作者： Badia, Santiago Martin, Alberto F. Neiva, Eric Verdugo, Francesc Monash Univ Sch Math Sci Clayton Vic 3800 Australia UPC CIMNE Ctr Int Metodes Numer Engn Castelldefels 08860 Spain Univ Politecn Cataluna Dept Civil & Environm Engn ES-08034 Barcelona Spain

In this work we formally derive and prove the correctness of the algorithms and data structures in a parallel, distributed-memory, generic finite element framework that supports h-adaptivity on computational domains represented as forest-of-trees. The framework is grounded on a rich representation of the adaptive mesh suitable for generic finite elements that is built on top of a low-level, light-weight forest-of-trees data structure handled by a specialized, highly parallel adaptive meshing engine, for which we have identified the requirements it must fulfill to be coupled into our framework. Atop this two-layered mesh representation, we build the rest of the data structures required for the numerical integration and assembly of the discrete system of linear equations. We consider algorithms that are suitable for both subassembled and fully assembled distributed data layouts of linear system matrices. The proposed framework has been implemented within the FEMPAR scientific software library, using p4est as a practical forest-of-octrees demonstrator. A strong scaling study of this implementation when applied to Poisson and Maxwell problems reveals remarkable scalability up to 32.2K CPU cores and 482.2M degrees of freedom. Besides, a comparative performance study of FEMPAR and the state-of-the-art deal. II finite element software shows at least comparative performance, and at most a factor of 2-3 improvement in the h-adaptive approximation of a Poisson problem with first- and second-order Lagrangian finite elements, respectively.

关键词： partial differential equations finite elements adaptive mesh refinement forest of trees parallel algorithms scientific software

来源：评论

学校读者我要写书评

暂无评论

Solving tridiagonal Toeplitz systems of linear equations on GPU-accelerated computers

Solving tridiagonal Toeplitz systems of linear equations on ...

引用

作者： Dmitruk, Beata Stpiczyński, Przemyslaw Institute of Computer Science Maria Curie-Sklodowska University Akademicka 9 Lublin20-033 Poland

The aim of this article is to show that solvers for tridiagonal Toeplitz systems of linear equations can be efficiently implemented for a variety of modern GPU-accelerated and multicore architectures using OpenACC. We consider two parallel algorithms for solving such systems with special assumptions about coefficient matrices. As the first algorithm, we propose a new, faster implementation of the divide and conquer method. The next algorithm is a new, vectorizable algorithm based on a recently introduced sequential method. We consider the use of both column-wise and row-wise storage formats for two-dimensional arrays and show how to efficiently convert between these two formats using cache memory and improve the overall performance of our implementations. We also show how to tune the performance by predicting the best values of the methods' parameters. Numerical experiments performed on Intel CPUs and Nvidia GPUs show that our new implementations achieve relatively good performance and accuracy. © 2021 John Wiley & Sons Ltd.

关键词： parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

MPI-Based Simulation of the Shallow Water Model using the Finite Volume Characteristics Scheme 6

MPI-Based Simulation of the Shallow Water Model using the Fi...

引用

6th International Conference on Computer, Software and Modeling, ICCSM 2022

作者： Moussa, Ziggaf Imad, Kissami ENSAO LMCS Complexe Universitaire B.P. 669 Oujda60000 Morocco MSDA Mohammed VI Polytechnic University Lot 660 Ben Guerir43150 Morocco Universite Sorbonne Paris Nord LAGA CNRS UMR 7539 VilletaneuseF-93430 France

ISBN: (数字)9781665454865

ISBN: (纸本)9781665454865

A parallel algorithm for solving the 2D shallow water equations coupled with the convection-diffusion equation has been developed, in order to demonstrate the capability and performance of our parallel approach while maintaining very good accuracy of the results obtained. The numerical scheme used is written in a non-uniform triangular grid formalism, which allows for the complexity of the geometry of the computational domain used. This approach is based on both predictor and corrector stages. The predictor one uses the method of characteristics to reconstruct the numerical fluxes, whereas the corrector stage recovers the conservation equations. Numerical results are presented for a pollutant transport in a squared cavity. © 2022 IEEE.

关键词： parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

parallel computation of combinatorial symmetries 29

Parallel computation of combinatorial symmetries

引用

29th Annual European Symposium on algorithms, ESA 2021

作者： Anders, Markus Schweitzer, Pascal TU Darmstadt Germany

ISBN: (纸本)9783959772044

In practice symmetries of combinatorial structures are computed by transforming the structure into an annotated graph whose automorphisms correspond exactly to the desired symmetries. An automorphism solver is then employed to compute the automorphism group of the constructed graph. Such solvers have been developed for over 50 years, and highly efficient sequential, single core tools are available. However no competitive parallel tools are available for the task. We introduce a new parallel randomized algorithm that is based on a modification of the individualization-refinement paradigm used by sequential solvers. The use of randomization crucially enables parallelization. We report extensive benchmark results that show that our solver is competitive to state-of-the-art solvers on a single thread, while scaling remarkably well with the use of more threads. This results in order-of-magnitude improvements on many graph classes over state-of-the-art solvers. In fact, our tool is the first parallel graph automorphism tool that outperforms current sequential tools. © Markus Anders and Pascal Schweitzer;licensed under Creative Commons License CC-BY 4.0

关键词： Algorithm engineering Automorphism groups Graph isomorphism parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

Data-parallel Hashing Techniques for GPU Architectures

引用

IEEE TRANSACTIONS ON parallel AND DISTRIBUTED SYSTEMS 2020年第1期31卷 237-250页

作者： Lessley, Brenton Childs, Hank Univ Oregon Dept Comp & Informat Sci Eugene OR 97403 USA

Hash tables are a fundamental data structure for effectively storing and accessing sparse data, with widespread usage in domains ranging from computer graphics to machine learning. This study surveys the state-of-the-art research on data-parallel hashing techniques for emerging massively-parallel, many-core GPU architectures. This survey identifies key factors affecting the performance of different techniques and suggests directions for further research.

关键词： Graphics processors hash tables parallel algorithms search problems

来源：评论

学校读者我要写书评

暂无评论

Anchored coreness: efficient reinforcement of social networks

Anchored coreness: efficient reinforcement of social network...

引用

作者： Linghu, Qingyuan Zhang, Fan Lin, Xuemin Zhang, Wenjie Zhang, Ying University of New South Wales Sydney Australia Guangzhou University Guangzhou China Centre for AI University of Technology Sydney Sydney Australia

The stability of a social network has been widely studied as an important indicator for both the network holders and the participants. Existing works on reinforcing networks focus on a local view, e.g., the anchored k-core problem aims to enlarge the size of the k-core with a fixed input k. Nevertheless, it is more promising to reinforce a social network in a global manner: considering the engagement of every user (vertex) in the network. Since the coreness of a user has been validated as the "best practice" for capturing user engagement, we propose and study the anchored coreness problem in this paper: anchoring a small number of vertices to maximize the coreness gain (the total increment of coreness) of all the vertices in the network. We prove the problem is NP-hard and show it is more challenging than the existing local-view problems. An efficient greedy algorithm is proposed with novel techniques on pruning search space and reusing the intermediate results. The algorithm is also extended to distributed environment with a novel graph partition strategy to ensure the computing independency of each machine. Extensive experiments on real-life data demonstrate that our model is effective for reinforcing social networks and our algorithms are efficient. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.

关键词： parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

NEARLY WORK-EFFICIENT parallel ALGORITHM FOR DIGRAPH REACHABILITY

引用

SIAM JOURNAL ON COMPUTING 2020年第5期49卷 STOC18-500-STOC18-539页

作者： Fineman, Jeremy T. Georgetown Univ Dept Comp Sci Washington DC 20057 USA

One of the simplest problems on directed graphs is that of identifying the set of vertices reachable from a designated source vertex. This problem can be solved easily sequentially by performing a graph search, but efficient parallel algorithms have eluded researchers for decades. For sparse high-diameter graphs in particular, there is no known work-efficient parallel algorithm with nontrivial parallelism. This amounts to one of the most fundamental open questions in parallel graph algorithms: Is there a parallel algorithm for digraph reachability with nearly linear work? This article shows that the answer is yes, presenting a randomized parallel algorithm for digraph reachability and related problems with expected work o(m) and span (O) over tilde (n(2/3)), and hence parallelism (O) over tilde (m/n(2/3)) = (Omega) over tilde (n(1/3)), on any graph with n vertices and m arcs. This is the first parallel algorithm having both nearly linear work and strongly sublinear span, i.e., span (O) over tilde (n(1-is an element of)) for any constant is an element of > 0. The algorithm can be extended to produce a directed spanning tree, determine whether the graph is acyclic, topologically sort the strongly connected components of the graph, or produce a directed ear decomposition, all with work (O) over tilde (m) and span (O) over tilde (n(2/3)). The main technical contribution is an efficient Monte Carlo algorithm that, through the addition of a(n) shortcuts, reduces the diameter of the graph to (O) over tilde (n(2/3)) with high probability. While both sequential and parallel algorithms are known with those combinatorial properties, even the sequential algorithms are not efficient, having sequential runtime Omega(mn(Omega(1))). This article presents a surprisingly simple sequential algorithm that achieves the stated diameter reduction and runs in (O) over tilde (m) time. parallelizing that algorithm yields the main result, but doing so involves overcoming several other challen

关键词： parallel algorithms randomized algorithms graph search reachability shortcuts

来源：评论

学校读者我要写书评

暂无评论

NvPD: novel parallel edit distance algorithm, correctness, and performance evaluation

引用

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS 2020年第2期23卷 879-894页

作者： Sadiq, Muhammad Umair Yousaf, Muhammad Murtaza Aslam, Laeeq Aleem, Muhammad Sarwar, Shahzad Jaffry, Syed Waqar Univ Punjab Punjab Univ Coll Informat Technol Lahore Pakistan Univ Punjab Punjab Univ Coll Informat Technol Comp Sci Lahore Turkey Capital Univ Sci & Technol Dept Comp Sci Islamabad Pakistan

Edit distance has applications in many domains such as bioinformatics, spell checking, plagiarism checking, query optimization, speech recognition, and data mining. Traditionally, edit distance is computed by dynamic programming based sequential solution which becomes infeasible for large problems. In this paper, we introduce NvPD, a novel algorithm for parallel edit distance computation by resolving dependencies in the conventional dynamic programming based solution. We also establish the correctness of modified dependencies. NvPD exhibits certain characteristics such as balanced workload among processors, less synchronization overhead, maximum utilization of resources and it can exploit spatial locality. It requiresmin(m,n)steps to complete as compared to diagonal based approach that completes inmax(m,n) Experimental evaluation using variety of random and real life data sets over shared memory multi-core systems and graphic processing units (GPUs) show that NvPD outperforms state-of-the-art parallel edit distance algorithms.

关键词： Edit distance Dynamic programming parallel algorithms Performance evaluation GPUs OpenMP

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：