Search results - Inner Mongolia University Library

Parallel Algorithm on Graphics Processing Unit for Harmonic Minimization in Multilevel Inverters

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2015, vol. 11, no. 3, pp. 700-707

Authors: Roberge, Vincent; Tarbouchi, Mohammed; Labonte, Gilles. Affiliations: Royal Mil Coll Canada Dept Elect & Comp Engn Kingston ON K7K7L6 Canada; Royal Mil Coll Canada Dept Math & Comp Sci Kingston ON K7K7L6 Canada

This paper presents the implementation details of a parallel algorithm on graphics processing units (GPUs) to compute the optimal switching angles for harmonic minimization in multilevel inverters with unequal dc voltage sources. Two algorithms, the Newton-Raphson method and the bisection method, and three different parallel implementations are investigated. Both algorithms have a low time complexity and a superior convergence rate, allowing for the real-time control of inverters with a very large number of levels. By exploiting the massively parallel architecture of GPUs, the execution time of the program is reduced significantly. The proposed parallel implementation offers a maximum speedup of 534x compared with a sequential execution on CPU, and allows for the calculation of the optimal switching angles for inverters with up to 1000 dc sources in less than 16.4 μs.

Keywords: Graphics processing unit (GPU); harmonic minimization; multilevel inverter; parallel algorithm
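
The bisection method named in the abstract above can be illustrated with a generic textbook root finder. This is only a sketch, not the authors' GPU implementation: the function name `bisect_root`, the tolerance, and the toy single-angle equation cos(θ) = 0.5 are assumptions chosen for illustration and do not come from the paper.

```python
import math


def bisect_root(f, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of f in [lo, hi] by repeatedly halving the bracket.

    Requires f(lo) and f(hi) to have opposite signs (or one of them to be zero).
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                # the sign change is in the left half
        else:
            lo, f_lo = mid, f_mid   # the sign change is in the right half
    return 0.5 * (lo + hi)


# Hypothetical toy problem: find the angle in [0, pi/2] whose cosine is 0.5.
theta = bisect_root(lambda t: math.cos(t) - 0.5, 0.0, math.pi / 2)
```

Each iteration halves the bracket, so the number of iterations grows only with the logarithm of the requested precision. Independent root searches of this kind have no data dependencies between them, which is the property a parallel implementation can exploit.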

Parallel algorithm for finding modules of large-scale coherent fault trees

MICROELECTRONICS RELIABILITY, 2015, vol. 55, no. 9-10, pp. 1400-1403

Authors: Li, Z. F.; Ren, Y.; Liu, L. L.; Wang, Z. L. Affiliation: Beihang Univ Sch Reliabil & Syst Engn Beijing 100191 Peoples R China

Computing the probability of the top event or the minimal cut sets of a fault tree is known to be an NP-hard problem. Modularization can be used to reduce the computational cost of basic operations on fault trees efficiently. The linear time algorithm, a very efficient and compact module-detection algorithm, visits the nodes one by one in a top-down, depth-first, left-most traversal of the tree. Its efficiency is therefore limited by the time needed to visit the nodes successively and serially, especially for large-scale fault trees. To improve the efficiency of modularizing large-scale fault trees, this paper proposes a new parallel method to find all possible modules. First, we transform the fault tree into a directed acyclic graph (DAG) and treat the terminal basic nodes as the entries of the algorithm. Then, according to the rules proposed in this paper, we traverse the graph bottom-up from the terminal nodes and mark the internal nodes in parallel. We can then compare all internal nodes and decide which nodes are modules. Finally, an experiment is carried out to compare the linear and parallel algorithms, and the result shows that the proposed parallel algorithm is efficient in handling large-scale fault trees. (C) 2015 Elsevier Ltd. All rights reserved.

Keywords: Modularization; parallel algorithm; Fault tree; Directed acyclic graph
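
For orientation, a module of a fault tree is commonly understood as a gate whose descendants cannot be reached from the rest of the tree except through that gate. The sketch below checks that definition directly on a DAG. It is a brute-force, sequential illustration and is neither the linear time algorithm nor the parallel method of the paper; the names `children`, `descendants`, and `find_modules`, and the example tree, are hypothetical.

```python
def descendants(children, node):
    """Return every node strictly below `node` in the DAG."""
    seen = set()
    stack = list(children.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def find_modules(children):
    """Return the gates whose descendants have no parent outside the gate's sub-DAG.

    `children` maps each gate to its inputs; basic events are the nodes
    that never appear as keys.
    """
    parents = {}
    for gate, inputs in children.items():
        for node in inputs:
            parents.setdefault(node, set()).add(gate)

    modules = []
    for gate in children:
        below = descendants(children, gate)
        inside = below | {gate}
        # A module is entered only through its root gate.
        if all(parents.get(node, set()) <= inside for node in below):
            modules.append(gate)
    return modules


# Hypothetical example: basic event "b" is shared by g1 and g2,
# so neither of them is a module, while g3 and the top gate are.
tree = {
    "top": ["g1", "g2", "g3"],
    "g1": ["a", "b"],
    "g2": ["b", "c"],
    "g3": ["d", "e"],
}
modules = find_modules(tree)
```

This check recomputes the descendant set of every gate, so its cost grows quadratically with the number of nodes in the worst case; that cost is what linear-time and parallel approaches aim to avoid.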

Parallel algorithm of a modified surface modeling method and its application in digital elevation model construction

ENVIRONMENTAL EARTH SCIENCES, 2015, vol. 74, no. 8, pp. 6551-6561

Authors: Zhao, Mingwei; Yue, Tianxiang; Zhao, Na; Yang, Xin; Wang, Yifu; Zhang, Xingying. Affiliations: Chinese Acad Sci State Key Lab Resources & Environm Informat Syst Inst Geog Sci & Nat Resources Res Beijing 10010 Peoples R China; Univ Chinese Acad Sci Beijing 100049 Peoples R China; China Meteorol Adm Natl Satellite Meteorol Ctr Beijing 100081 Peoples R China; Univ Jinan Quancheng Coll Penglai 265600 Peoples R China

High accuracy surface modeling (HASM) has been shown to be a superior method for surface simulation compared to classical interpolation methods. However, HASM is time consuming and depends on a driving field, which restricts its application to large-area problems. This research develops a modified HASM that removes the need for a driving field in the surface simulation, and also proposes a parallel version of the modified HASM to improve its computational efficiency. Light detection and ranging (LIDAR) data are used as an optimum constraint to construct a digital elevation model (DEM). Tests show that the modified HASM can perform surface simulation successfully without the driving field. They also show that the simulation accuracy of the modified HASM is almost the same as that of the old HASM and the classical interpolation methods when the sampling rate is larger than 0.5%, while the modified HASM shows significantly increased simulation accuracy as the sampling rate decreases. This characteristic indicates that the modified HASM no longer relies on the driving field in the surface simulation, and that it improves the simulation accuracy compared to the old HASM and the classical interpolation methods. Tests of parallel efficiency show that the master-slave mode used in the parallel algorithm obtains a satisfactory result, indicating that HASM can be applied to surface simulation of large-area problems. The results also suggest that the modified HASM has great potential when applied to high-resolution DEM and digital surface model (DSM) construction from LIDAR data.

Keywords: HASM; DEM; LIDAR; parallel algorithm; Simulation accuracy

Parallel Algorithms for Data Digital Filtering

CYBERNETICS AND SYSTEMS ANALYSIS, 2023, vol. 59, no. 1, pp. 39-48

Authors: Yadzhak, M. S. Affiliations: Natl Acad Sci Ukraine Pidstryhach Inst Appl Problems Mech & Math Lvov Ukraine; Ivan Franko Natl Univ Lviv Lvov Ukraine

The paper proposes parallel algorithms for solving digital filtering problems of different dimensions using modern universal computers. Theoretical estimates of the complexity and speedup are obtained, which confirm the high efficiency of these algorithms. Some of the proposed parallel algorithms are implemented using computers with a multi-core processor, and real estimates of the speedup are obtained, which agree well with theoretical ones.

Keywords: digital filtering; parallel algorithm; computation speedup; limited parallelism; equivalence of algorithms; computing system

A FAST PARALLEL ALGORITHM FOR FINDING THE LARGEST COMMON 4-CONNECTED COMPONENT FROM TWO MATRICES

TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2016, vol. 23, no. 4, pp. 979-984

Authors: Gao, Ying; Liu, Haoshen; Huang, Jiancong; Duan, Jiajie; Mu, Lei. Affiliations: South China Univ Technol Sch Comp Sci & Engn Waihuan Dong Rd 382 Guangzhou Guangdong Peoples R China; YunNan Elect Power Test & Res Inst Grp CO Ltd Kunming Peoples R China; Huanggang Dong Rd Jinan Shangdong Peoples R China

We describe a new design of a parallel algorithm for solving the two-dimensional longest common substring (2D LCS) problem, taking advantage of the multi-core graphics processing unit architecture offered by Compute Unified Device Architecture (CUDA). In this article we also define the 2D LCS problem as finding the largest common 4-connected component from two input matrices and present an algorithm which can exactly solve this problem in O(mnst/P) time with a P-core GPU.

Keywords: CUDA; largest common 4-connected component; parallel algorithm; 2DLCS
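
The abstract does not spell out the problem definition, so the following is a sketch under one plausible reading, stated here as an assumption: the second matrix is slid over the first at every relative offset, cells that coincide and hold equal values are marked, and the largest 4-connected region of marked cells is reported. The code is a sequential brute force for illustration only, not the authors' CUDA algorithm, and the function names are hypothetical.

```python
from collections import deque


def largest_component(mask):
    """Size of the largest 4-connected region of True cells in a 2D grid."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited marked cell.
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


def largest_common_component(a, b):
    """Try every offset of b relative to a and keep the best matching region."""
    m, n = len(a), len(a[0])
    s, t = len(b), len(b[0])
    best = 0
    for dy in range(-(s - 1), m):
        for dx in range(-(t - 1), n):
            # mask[r][c] is True where a[r][c] overlaps an equal cell of b.
            mask = [[0 <= r - dy < s and 0 <= c - dx < t
                     and a[r][c] == b[r - dy][c - dx]
                     for c in range(n)] for r in range(m)]
            best = max(best, largest_component(mask))
    return best


a = [[1, 2],
     [3, 4]]
b = [[9, 1, 2],
     [9, 3, 9]]
result = largest_common_component(a, b)
```

Under this reading there are on the order of s·t offsets and each costs on the order of m·n work, and the offsets are independent of one another, which is the kind of structure that can be distributed across many cores.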

Super-Exponentially Convergent Parallel Algorithm for Eigenvalue Problems with Fractional Derivatives

COMPUTATIONAL METHODS IN APPLIED MATHEMATICS, 2016, vol. 16, no. 4, pp. 633-652

Authors: Demkiv, Ihor; Gavrilyuk, Ivan P.; Makarov, Volodymyr L. Affiliations: NAS Ukraine Inst Math 3 Tereshchenkivska Str UA-01601 Kiev 4 Ukraine; Univ Cooperat Educ Eisenach Wartenberg 2 D-99817 Eisenach Germany

A new algorithm for eigenvalue problems for linear differential operators with fractional derivatives is proposed and justified. The algorithm is based on the approximation (perturbation) of the coefficients of a part of the differential operator by piecewise constant functions, where the eigenvalue problem for the latter is supposed to be simpler than the original one. Another cornerstone of the algorithm is the homotopy idea, which makes it possible, for a given eigenpair number, to compute recursively a sequence of approximate eigenpairs. This sequence converges to the exact eigenpair with a super-exponential convergence rate. The eigenpairs can be computed in parallel for all prescribed indexes. The proposed method possesses the following principal property: its convergence rate increases together with the index of the eigenpair. Numerical examples confirm the theory.

Keywords: Fractional Differential Operator; Eigenvalue Problem; Homotopy Idea; parallel algorithm; Super-Exponentially Convergent algorithm

DyG-DPCD: A Distributed Parallel Community Detection Algorithm for Large-Scale Dynamic Graphs

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2025, vol. 53, no. 1, pp. 1-28

Authors: Sattar, Naw Safrin; Ibrahim, Khaled Z.; Buluc, Aydin; Arifuzzaman, Shaikh. Affiliations: Oak Ridge Natl Lab Oak Ridge TN 37831 USA; Lawrence Berkeley Natl Lab Berkeley CA 94720 USA; Univ Nevada Las Vegas NV 89154 USA

Dynamic (Temporal) graphs capture the valuable evolution of real-world systems, from the continuously evolving patterns of social interactions and genetic pathways to the dynamic fluctuations of economic forces. Detecting communities for such evolving networks poses unique challenges. Detecting and analyzing the evolution of communities within dynamic graphs unlocks valuable insights into the underlying structural and temporal patterns of real-world systems. However, the sheer volume of modern graph data and the inherent complexity of the temporal dimension pose significant challenges to scalable community detection algorithms. Addressing this gap, our work explores the limited landscape of scalable distributed-memory parallel methods specifically designed for dynamic network community detection. We propose a novel parallel algorithm, DyG-DPCD (Dynamic Graph Distributed Parallel Community Detection), to detect communities in dynamic networks using the Message Passing Interface (MPI) framework. We present a vertex-centric approach, allowing us to detect communities through local optimization. Furthermore, we enhance our baseline algorithm by incorporating three heuristics, which improve the algorithm's performance significantly while maintaining the quality of the solutions. We demonstrate the efficiency of our algorithm by experimenting on several real-world large-scale networks with hundreds of millions of edges spanning diverse domains. Notably, DyG-DPCD achieves speedups between 25x and 30x ...

Keywords: Dynamic graphs; Temporal graphs; Community detection; parallel algorithm; Distributed-memory; MPI

Parallel algorithms for minimum general partial dominating set and maximum budgeted dominating set in unit disk graph

THEORETICAL COMPUTER SCIENCE, 2022, vol. 932, pp. 13-20

Authors: Hong, Weizhi; Ran, Yingli; Zhang, Zhao. Affiliation: Zhejiang Normal Univ Jinhua Coll Math & Comp Sci Jinhua 321004 Zhejiang Peoples R China

In a minimum general partial dominating set problem (MinGPDS), given a graph G = (V, E), a profit function p : V → R+ and a threshold K, the goal is to find a minimum subset of vertices D ⊆ V such that the total profit of those vertices dominated by D is at least K (a vertex is dominated by D if it is either in D or has at least one neighbor in D). In a maximum general budgeted dominating set problem (MaxGBDS), given a budget B, the goal is to find a vertex set D with at most B vertices such that the total profit of those vertices dominated by D is as large as possible. We present the first parallel algorithms for MinGPDS and MaxGBDS in unit disk graphs. They both run in O(log n) rounds on O(n) machines, and achieve constant approximation ratios. (c) 2022 Elsevier B.V. All rights reserved.

Keywords: Partial dominating set; Budgeted dominating set; Unit disk graph; parallel algorithm; Approximation ratio
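
The objective defined in the abstract above can be made concrete with a small sketch. It computes the total profit of the vertices dominated by a set D, and then solves MinGPDS by exhaustive search on a tiny graph. The exhaustive search is exponential and serves only to illustrate the problem statement; it is not the paper's parallel approximation algorithm, and the function names and the example graph are hypothetical.

```python
from itertools import combinations


def dominated_profit(adj, profit, D):
    """Total profit of vertices that are in D or adjacent to a vertex in D."""
    dominated = set(D)
    for v in D:
        dominated.update(adj[v])
    return sum(profit[v] for v in dominated)


def min_gpds_bruteforce(adj, profit, K):
    """Smallest vertex set whose dominated profit reaches K, or None if none exists."""
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for D in combinations(vertices, size):
            if dominated_profit(adj, profit, D) >= K:
                return set(D)
    return None


# Hypothetical example: the path 0 - 1 - 2 - 3 with increasing profits.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
profit = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}

single = dominated_profit(adj, profit, {1})       # vertices 0, 1, 2 are dominated
smallest = min_gpds_bruteforce(adj, profit, 10.0)  # needs every vertex dominated
```

In this example no single vertex dominates profit 10, so the smallest feasible set has two vertices. MaxGBDS uses the same `dominated_profit` objective but fixes the size of D and maximizes the profit instead.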

SWPMMAS: an optimized parallel max-min ant system algorithm based on the SW26010-Pro processor

JOURNAL OF SUPERCOMPUTING, 2025, vol. 81, no. 1, pp. 1-28

Authors: Tian, Min; Xu, Chaoshuai; Wu, Xiaoming; Pan, Jingshan; Guo, Ying; Du, Wei; Wei, Zhenguo. Affiliations: Qilu Univ Technol Key Lab Comp Power Network & Informat Secur Minist Educ Shandong Comp Sci Ctr Natl Supercomp Ctr Jinan Shandong Acad Sci Jinan Peoples R China; Jinan Inst Supercomp Technol Jinan Key Lab High Performance Comp Jinan Peoples R China; Shandong Fundamental Res Ctr Comp Sci Shandong Prov Key Lab Comp Networks Jinan Peoples R China

The max-min ant system (MMAS) algorithm has found extensive application in tackling combinatorial optimization challenges such as the traveling salesman problem (TSP), production scheduling, and quadratic assignment. Nevertheless, as the scale of the problem increases, the MMAS algorithm gradually encounters performance limitations. To address the performance constraints of MMAS, we propose a parallel max-min ant system (PMMAS) algorithm, where a master subpopulation coordinates multiple subpopulations in parallel search. Furthermore, to facilitate the parallel acceleration of computationally intensive tasks in PMMAS using the CPE array of the SW26010-Pro processor, the selection weight calculation equation in the traditional MMAS algorithm was improved. This improvement led to the introduction of the Sunway parallel max-min ant system (SWPMMAS) algorithm, which implements parallelism using MPI and Athread. The revised selection weight calculation equation is also applicable to the traditional MMAS algorithm and enhances its running speed. Finally, the SWPMMAS algorithm was evaluated using various TSP instances, with city counts ranging from 51 to 11,849. The results demonstrate that the SWPMMAS algorithm provides excellent solutions. For TSP instances with more than 10,000 cities, the SWPMMAS algorithm achieves over 13x speedup compared to the PMMAS algorithm running on the Sunway architecture and 5.4x speedup compared to the PMMAS algorithm running on a commercial Shanh...

Keywords: Max-min ant system; TSP; SW26010-Pro architecture; parallel algorithm

A parallel high-order accuracy algorithm for the Helmholtz equations

INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 2024, vol. 101, no. 1, pp. 56-94

Authors: Bao, Tiantian; Feng, Xiufang. Affiliation: Ningxia Univ Sch Math & Stat Yinchuan 750021 Peoples R China

The numerical solution of the Helmholtz equations is challenging to compute when the wave numbers contained in the governing equation are large. In this paper, we present a parallel algorithm for this problem. A class of sixth-order hybrid compact finite-difference schemes for the Helmholtz equations is presented based on the Taylor expansion. To improve the efficiency of solving the large-wave-number problem, we implemented a parallel algorithm based on the Message Passing Interface environment to solve the discrete system. The validity and accuracy of the proposed method are verified by numerical examples. The method is also applicable to solving problems with oscillatory solutions, which are characterized by numerical instability as the wave number increases.

Keywords: Helmholtz equation; sixth-order hybrid compact finite-difference scheme; parallel algorithm; MPI
