The development of the smart grid and the increasing scale of power systems place great demands on the electromagnetic transient simulation of a power system. The graphics processing unit (GPU), which features massive concurrent threads and excellent floating-point performance, offers a new opportunity for power system simulation. This study introduces a parallel lower-triangular and upper-triangular (LU) decomposition algorithm and a calculation strategy for electromagnetic transient simulation based on the GPU. In this scheme, the GPU performs the computationally intensive part of the simulation in parallel on its built-in processing cores, while the CPU handles history-term updates and flow control of the simulation. Comparison with results from CPU-only implementations verifies the validity and efficiency of the proposed method.
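The CPU/GPU split described above can be sketched as follows. This is a minimal pure-Python illustration (not the paper's code): the recurring linear solve of the nodal equations is split into a one-time LU factorisation and per-step triangular solves, which are the part the paper offloads to GPU threads while the CPU updates history currents. All names are hypothetical.

```python
# Illustrative sketch: LU-based nodal solve for an EMT-style time step.
# In the paper's scheme the triangular solves run on GPU cores; here
# everything runs serially to show the data flow.

def lu_decompose(A):
    """Doolittle LU decomposition (no pivoting; assumes A is well-conditioned)."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            f = U[i][k] / U[k][k]
            L[i][k] = f
            for j in range(k, n):
                U[i][j] -= f * U[k][j]
    return L, U

def solve_lu(L, U, b):
    """Forward then backward substitution -- the stage a GPU kernel
    would parallelise across rows/levels of the triangular factors."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

# The nodal conductance matrix is fixed over the run, so factor it once...
G = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
L_f, U_f = lu_decompose(G)
# ...then each time step only rebuilds the history-current vector (CPU side)
# and re-solves the two triangular systems (GPU side in the paper's scheme).
v = solve_lu(L_f, U_f, [1.0, 0.0, 0.0])
```

Factoring once and re-solving every step is what makes the triangular solves the dominant, parallelisable cost in this kind of simulation.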
Abstract: The problem of numerical modeling of the process of initiating seismic activity on the shelf and its destructive effect on composite oil pipelines laid along the seabed is considered. To describe the dynamic...
In this article the authors present a new method to construct low-rank approximations of dense, huge matrices. The method develops the mosaic-skeleton method and belongs to the class of kernel-independent methods. Unlike the mosaic-skeleton method, the new one uses the hierarchical structure of the matrix not only to define the matrix block structure but also to compute the factors of the low-rank matrix representation. The new method was applied to the numerical solution of the boundary integral equations that arise from the 3D problem of scattering of a monochromatic electromagnetic wave by ideally conducting bodies. The solution of a model problem is presented as an example of method evaluation.
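The skeleton (cross) idea underlying mosaic-skeleton methods can be shown on a tiny example: an exactly rank-r block is recovered from r of its columns C, r of its rows R, and the inverse of their r × r intersection, A ≈ C · Â⁻¹ · R. The sketch below is illustrative only, not the article's implementation, and all names are made up.

```python
# Skeleton decomposition demo: rebuild an exactly rank-2 matrix from
# 2 columns, 2 rows, and their 2x2 intersection.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

n = 6
# an exactly rank-2 matrix built from two outer products
u1, v1 = [i + 1.0 for i in range(n)], [1.0 / (j + 1) for j in range(n)]
u2, v2 = [(-1.0) ** i for i in range(n)], [float(j) for j in range(n)]
A = [[u1[i] * v1[j] + u2[i] * v2[j] for j in range(n)] for i in range(n)]

rows, cols = [0, 3], [1, 4]                     # skeleton rows and columns
C = [[A[i][j] for j in cols] for i in range(n)]  # n x 2
R = [[A[i][j] for j in range(n)] for i in rows]  # 2 x n
A_hat = [[A[i][j] for j in cols] for i in rows]  # 2 x 2 intersection
A_approx = matmul(matmul(C, inv2(A_hat)), R)     # equals A up to round-off
```

For a matrix of exact rank r, the reconstruction is exact whenever the r × r intersection is nonsingular; for numerically low-rank blocks it becomes an approximation, and the hierarchical structure decides where such blocks live.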
This thesis studies the properties of vector-based routing protocols whose underlying algebras are strictly increasing. Strict increasingness has previously been shown to be both a sufficient and a necessary condition for the convergence of path-vector protocols. One of the key contributions of this thesis is to link vector-based routing to a much larger family of asynchronous iterative algorithms. This unlocks a significant body of existing theory and allows asynchronous protocols to be proved correct by purely synchronous reasoning. As well as applying it to routing protocols, this thesis advances the asynchronous theory in two ways. First, it shows that the existing conditions required for convergence may be relaxed. Second, it proposes the first model for "dynamic" asynchronous processes in which both the problem being solved and the set of participants change over time. The thesis' attention then turns to models of routing problems, and presents a new algebraic structure that is simpler and more expressive than the state of the art. In particular, this structure is capable of modelling the routing problems that underlie both distance-vector and path-vector protocols. Consequently these two families of vector-based protocols may be unified for the first time. The new structure is also capable of modelling protocols that use path-dependent conditional policy. Next, the work above is used to construct a model of an abstract vector-based protocol. This is then used in the first proof of correctness for strictly increasing distance-vector protocols and a new proof of correctness for strictly increasing path-vector protocols. The latter improves over previous results in that it i) proves that convergence is deterministic, ii) does not assume reliable communication between nodes, and iii) applies to path-vector protocols with path-dependent conditional policy. The long-standing question of the worst-case rate of convergence for a strictly increasing path-vector proto...
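A small concrete instance of a strictly increasing routing algebra is the (min, +) algebra with strictly positive edge weights: extending a route always strictly worsens it, which is the condition the thesis studies for guaranteed convergence. The sketch below is an illustrative synchronous distance-vector iteration on a made-up three-node topology, not the thesis's formalism.

```python
# Distance-vector iteration over the (min, +) algebra.
# Positive weights => route extension is strictly increasing,
# so the iteration reaches a unique fixed point in at most n rounds.
INF = float("inf")
# adjacency: edges[u] = list of (neighbour, weight > 0)
edges = {0: [(1, 1.0), (2, 5.0)],
         1: [(0, 1.0), (2, 2.0)],
         2: [(0, 5.0), (1, 2.0)]}
dest = 0
x = {u: (0.0 if u == dest else INF) for u in edges}
for _ in range(10):
    new = {u: (0.0 if u == dest else
               min(w + x[v] for v, w in edges[u]))
           for u in edges}
    if new == x:      # fixed point: no node can improve its route
        break
    x = new
# x now holds the shortest-path distances to dest
```

The thesis's asynchronous results say this fixed point is still reached when nodes update at different times with stale information, which is what makes purely synchronous reasoning like the loop above sufficient.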
Parallel algorithms on CPU and GPU are implemented for the Unified Gas-Kinetic Scheme, and their performances are investigated and compared using a two-dimensional channel flow case. The parallel CPU algorithm has a one d...
GPU hardware and the CUDA architecture provide a powerful platform for developing parallel algorithms, yet implementations of heuristic and metaheuristic algorithms on GPUs are limited in the literature, and developing such parallel algorithms has become increasingly important. In this paper, the NP-hard Quadratic Assignment Problem (QAP), one of the classic combinatorial optimization problems, is discussed. A Parallel Multistart Simulated Annealing (PMSA) method is developed with the CUDA architecture to solve the QAP. An efficient method is obtained by combining the multistart technique with cooperation between threads; cooperation occurs between threads both in the same block and in different blocks. This paper focuses on both acceleration and solution quality. Computational experiments were conducted on many Quadratic Assignment Problem Library (QAPLIB) instances. The experimental results show that PMSA runs up to 29x faster than a single-core CPU and attains the best known solution in a short time on many benchmark datasets. (C) 2018 Karabuk University. Publishing services by Elsevier B.V.
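The multistart simulated-annealing core can be sketched as follows. This is a single-process illustration: the paper runs many such chains as cooperating CUDA threads, whereas here the starts are simply looped over. The tiny instance data and all names are made up.

```python
# Multistart simulated annealing for a toy 4-facility QAP instance.
# cost(p) = sum_{i,j} F[i][j] * D[p[i]][p[j]], minimised over permutations p.
import math, random

F = [[0, 3, 1, 2], [3, 0, 4, 1], [1, 4, 0, 2], [2, 1, 2, 0]]  # flows
D = [[0, 2, 4, 5], [2, 0, 1, 3], [4, 1, 0, 2], [5, 3, 2, 0]]  # distances

def qap_cost(p):
    n = len(p)
    return sum(F[i][j] * D[p[i]][p[j]] for i in range(n) for j in range(n))

def anneal(p, steps=300, t0=10.0, alpha=0.97, rng=None):
    """One SA chain: random 2-swaps, Metropolis acceptance, geometric cooling."""
    best, best_c = p[:], qap_cost(p)
    cur, cur_c, t = p[:], best_c, t0
    for _ in range(steps):
        i, j = rng.sample(range(len(p)), 2)
        cand = cur[:]
        cand[i], cand[j] = cand[j], cand[i]
        c = qap_cost(cand)
        if c < cur_c or rng.random() < math.exp((cur_c - c) / t):
            cur, cur_c = cand, c
            if c < best_c:
                best, best_c = cand[:], c
        t *= alpha
    return best, best_c

rng = random.Random(42)
best_perm, best_cost = None, float("inf")
for _ in range(8):  # each start would be one GPU thread in PMSA
    start = list(range(4))
    rng.shuffle(start)
    p, c = anneal(start, rng=rng)
    if c < best_cost:
        best_perm, best_cost = p, c
```

The paper's contribution beyond this baseline is having the chains exchange information (within and across thread blocks) rather than run independently.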
We introduce a method for "sparsifying" distributed algorithms and exhibit how it leads to improvements that go past known barriers in two algorithmic settings of large-scale graph processing: Massively Parallel Computation (MPC), and Local Computation Algorithms (LCA). MPC with strongly sublinear memory: Recently, there has been growing interest in obtaining MPC algorithms that are faster than their classic O(log n)-round parallel (PRAM) counterparts for problems such as Maximal Independent Set (MIS), Maximal Matching, 2-approximation of Minimum Vertex Cover, and (1+ϵ)-approximation of Maximum Matching. Currently, all such MPC algorithms require memory of Ω(n) per machine: Czumaj et al. [STOC'18] were the first to handle Ω(n) memory, running in O((log log n)^2) rounds, improving on the n^{1+Ω(1)} memory requirement of the O(1)-round algorithm of Lattanzi et al. [SPAA'11]. We obtain Õ(√(log Δ))-round MPC algorithms for all four of these problems that work even when each machine has strongly sublinear memory, e.g., n^α for any constant α ∈ (0, 1). Here, Δ denotes the maximum degree. These are the first sublogarithmic-time MPC algorithms for (the general case of) these problems that break the linear memory barrier. LCAs with query complexity below the Parnas-Ron paradigm: Currently, the best known LCA for MIS has query complexity Δ^{O(log Δ)} poly(log n), by Ghaffari [SODA'16], which improved over the Δ^{O(log² Δ)} poly(log n) bound of Levi et al. [Algorithmica'17]. As pointed out by Rubinfeld, obtaining a query complexity of poly(Δ log n) remains a central open question. Ghaffari's bound almost reaches a Δ^{Ω(log Δ / log log Δ)} barrier common to all known MIS LCAs, which simulate a distributed algorithm by learning the full local topology, à la Parnas-Ron [TCS'07]. This barrier exists because the distributed complexity of MIS has a lower bound of Ω(log Δ / log log Δ), by results of Kuhn et al. [JACM'16], which means this methodology cannot go below query complexity Δ^{Ω(log Δ / log log Δ)}. We break this ba...
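For concreteness, the round-based distributed MIS computation that such MPC and LCA algorithms simulate can be illustrated with a classic Luby-style sketch. This is a baseline for intuition only, not the paper's sparsified method; the small graph is made up.

```python
# Luby-style randomised MIS in synchronous rounds: each alive vertex draws
# a random priority, local minima join the MIS, winners and their
# neighbours are removed, and the process repeats.
import random

def luby_mis(adj, rng):
    alive = set(adj)
    mis = set()
    while alive:
        r = {v: rng.random() for v in alive}          # random priorities
        winners = {v for v in alive
                   if all(r[v] < r[u] for u in adj[v] if u in alive)}
        mis |= winners
        removed = set(winners)
        for v in winners:
            removed |= adj[v] & alive                  # drop neighbours too
        alive -= removed
    return mis

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}    # triangle + pendant
mis = luby_mis(adj, random.Random(7))
```

The sparsification technique in the paper lets a machine simulate such rounds without holding each vertex's full local topology, which is how it beats the memory and query barriers above.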
Clustering of uncertain objects in large uncertain databases and the problem of mining uncertain data have been well studied. In this paper, clustering of uncertain objects with location uncertainty is studied. Moving objects, such as mobile devices, report their locations periodically, so their locations are uncertain and best described by a probability density function. The number of objects in a database can be large, which makes mining accurate data a challenging and time-consuming task. The authors give an overview of existing clustering methods and present a new approach for data mining and parallel computing of clustering problems. Existing methods use pruning to avoid expected-distance calculations, since the expected distance must be computed by numerical integration, which is time-consuming. Therefore, a new method, called Segmentation of Data Set Area-Parallel, is proposed. In this method, the data-set area is divided into many small segments, and only the clusters and objects in a given segment are considered. The number of segments is calculated from the number and location of the clusters. Because segments are mutually independent, they enable parallel computing: each segment can be processed on a separate core.
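The segmentation idea can be sketched as follows: uncertain objects are grouped by grid segment, each segment is an independent unit of work, and within a segment an object is assigned to the centre with the smallest expected distance, approximated here by averaging over the object's location samples. This is an illustrative toy, not the paper's algorithm; all data and names are made up.

```python
# Segment-wise assignment of uncertain objects to cluster centres.
import math

def expected_dist(samples, c):
    """Monte-Carlo expected Euclidean distance from an uncertain object
    (given by location samples) to a cluster centre c."""
    return sum(math.dist(s, c) for s in samples) / len(samples)

def segment_of(point, cell=5.0):
    """Grid segment containing a point."""
    return (int(point[0] // cell), int(point[1] // cell))

centers = [(1.0, 1.0), (8.0, 8.0)]
# uncertain objects: a few location samples each
objects = [
    [(0.9, 1.2), (1.1, 0.8), (1.0, 1.0)],
    [(7.8, 8.1), (8.2, 7.9), (8.0, 8.0)],
]

# group objects by the segment of their mean sample
groups = {}
for k, samples in enumerate(objects):
    mean = (sum(s[0] for s in samples) / len(samples),
            sum(s[1] for s in samples) / len(samples))
    groups.setdefault(segment_of(mean), []).append(k)

# each segment is independent, so this loop could run in parallel;
# here the segments are processed serially for clarity
assignment = {}
for seg, members in groups.items():
    for k in members:
        assignment[k] = min(range(len(centers)),
                            key=lambda c: expected_dist(objects[k], centers[c]))
```

In the full method, each segment would also restrict the candidate centres to nearby ones, which is where the expected-distance savings come from.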
While adaptive integration by region partitioning is generally effective in low dimensions, quasi-Monte Carlo methods can be used for integral approximations in moderate to high dimensions. Important application areas include high-energy physics, statistics, computational finance, and stochastic geometry, with applications in robotics, tessellations, and imaging from medical data using tetrahedral meshes. Lattice rule integration is a class of quasi-Monte Carlo methods, implemented by an equal-weight cubature formula and suited for fairly smooth functions. Successful methods to construct these rules are the component-by-component (CBC) algorithm of Sloan and Reztsov (2001) and the fast CBC algorithm of Nuyens and Cools (2006). As the ability to invoke a large number of function evaluations is an important factor in high-dimensional integration, we investigate accelerating the CBC construction of large rank-1 lattice rules using the CUDA (cuFFT) Fast Fourier Transform procedure. A major part of this study is the development of high-performance lattice rule algorithms for approximating moderate- to high-dimensional integrals on GPUs. Lattice rules are combined with a periodizing transformation. We show that rank-1 lattice rules on GPUs (possibly with an integral transformation to alleviate the effects of boundary singularities) yield better accuracy and efficiency for various classes of integrals compared to classic Monte Carlo and adaptive methods. The computational power of GPU accelerators also leads to significant improvements in efficiency and accuracy for integration based on embedded (composite) lattices. These methods have been motivated as possible contributions to high-performance computing software such as the ParInt multivariate integration package developed at WMU. We further show an application in Bayesian analysis, leading to a class of problems where the integrand has a dominant peak in the integration domain. We demonstrate a black-box ap...
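The basic object here, a rank-1 lattice rule, is easy to show directly: the cubature points are x_i = frac(i·z/n) for a generating vector z, and the rule is the equal-weight average of the integrand over these points. The sketch below evaluates such a rule (a 2D Fibonacci lattice) on a smooth periodic integrand with known integral 1; it illustrates plain rule evaluation, not the CBC/cuFFT construction studied in the thesis, and the instance is chosen for illustration.

```python
# Rank-1 lattice rule: equal-weight cubature over x_i = frac(i * z / n).
import math

def lattice_rule(f, z, n):
    """Approximate the integral of f over [0,1]^d with generating vector z."""
    d = len(z)
    total = 0.0
    for i in range(n):
        x = tuple((i * z[k] / n) % 1.0 for k in range(d))
        total += f(x)
    return total / n

# smooth periodic test integrand with exact integral 1 on the unit square
def f(x):
    return (1.0 + 0.5 * math.sin(2 * math.pi * x[0])) * \
           (1.0 + 0.5 * math.sin(2 * math.pi * x[1]))

# Fibonacci lattice in 2D: n = 89, z = (1, 55)
approx = lattice_rule(f, (1, 55), 89)
```

The CBC algorithm searches for the generating vector z one component at a time to minimise a worst-case error criterion; the loop over i above is also the part that maps naturally onto GPU threads.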
In this paper, we consider networks of deterministic spiking neurons, firing synchronously at discrete times. We consider the problem of translating temporal information into spatial information in such networks, an i...