The method of Brownian configuration fields (BCF) is a promising multi-scale approach for the simulation of viscoelastic fluids, however, it is a computationally expensive method, which restricts its application in co...
详细信息
ISBN:
(纸本)9781509021406
The method of Brownian configuration fields (BCF) is a promising multi-scale approach for the simulation of viscoelastic fluids, however, it is a computationally expensive method, which restricts its application in complex scenarios. Therefore, it is of great importance to optimize the parallel implementation in order to improve computational efficiency. In this paper we propose a hybrid decomposition parallel algorithm named : MCDPar, which enables the simulation problem to be decomposed simultaneously over mesh cells and the Brownian configuration fields. Compution processes are split into multiple groups and Brownian configuration fields are equally associated with these groups. Meanwhile, within each group the processes are concurrently executed based on the traditional mesh decomposition approach. Finally we implemented the MCDPar algorithm in a micro-macro numerical solver based on OpenFOAM. Experimental results show that the micro-macro simulation time of viscoelastic fluids is significantly reduced with improved scalability and parallel efficiency. In the test case with N-f = 2000 and N-cell = 262144, the speedup of the MCDPar is up to 9.23x with a 7.5x increase in number of cores compared to the original parallel algorithm.
With the fast development of information technology, the power data is growing at an exponentially speed. In the face of multi-dimensional and complicated power network data, the performance of the traditional cluster...
详细信息
ISBN:
(纸本)9789811064425;9789811064418
With the fast development of information technology, the power data is growing at an exponentially speed. In the face of multi-dimensional and complicated power network data, the performance of the traditional clustering algorithms are not satisfied. How to effectively cope with the power network data is becoming a hot topic. This paper proposes a parallel implement of K-means clustering algorithm based on Hadoop distributed file system and Mapreduce distributed computing framework to deal this problem. The experimental results show that the performance of our proposed algorithm significantly outperforms the traditional clustering algorithm and the parallel clustering algorithm can significantly reduce the time complexity and can be applied in analyzing and mining of the power network data.
Solving the pickup and delivery problem with time windows (PDPTW) is a vital research topic due to its NP-hardness and its numerous practical applications. In this paper, we propose an island-model parallel memetic al...
详细信息
ISBN:
(纸本)9781509060580
Solving the pickup and delivery problem with time windows (PDPTW) is a vital research topic due to its NP-hardness and its numerous practical applications. In this paper, we propose an island-model parallel memetic algorithm for minimizing the distance in the PDPTW. In this algorithm, the processes execute the same memetic algorithm and co-operate to guide the optimization efficiently. An extensive experimental study revealed that the MPI implementation of the proposed approach retrieves very high-quality routing schedules. The analysis is coupled with appropriate statistical tests.
This paper presents the implementation details of a parallel algorithm on graphics processing units (GPUs) to compute the optimal switching angles for the harmonic minimization in multilevel inverters with unequal dc ...
详细信息
This paper presents the implementation details of a parallel algorithm on graphics processing units (GPUs) to compute the optimal switching angles for the harmonic minimization in multilevel inverters with unequal dc voltage sources. Two algorithms, the Newton- Raphson method and the bisection method, and three different parallel implementations are investigated. Both algorithms considered have a low time complexity and offer a superior converging rate allowing for the real- time control of inverters with a very large number of levels. By exploiting the massively parallel architecture of GPUs, the execution time of the program is reduced significantly. The proposed parallel implementation offers a maximum speedup of 534x compared with a sequential execution on CPU, and allows for the calculation of the optimal switching angles for inverters with up to 1000 dc sources in less than 16.4 mu s.
The computation of the probability of the top event or minimal cut sets of fault trees is known as intractable NP-hard problems. Modularization can be used to reduce the computational cost of basic operations on fault...
详细信息
The computation of the probability of the top event or minimal cut sets of fault trees is known as intractable NP-hard problems. Modularization can be used to reduce the computational cost of basic operations on fault trees efficiently. The idea of the linear time algorithm, as a very efficient and compact modules detecting algorithm, is visiting the nodes one by one with top-down depth-first left-most traversal of the tree. So the efficiency of the linear time algorithm is limited by nodes visiting time successively and serially, especially when confronting large-scale fault trees. Aiming at improving the efficiency of modularizing large-scale fault trees, this paper proposes a new parallel method to find all possible modules. Firstly, we transform the fault tree into a directed acyclic graph (DAG) and treat the terminal basic nodes as entries of the algorithm. And then, according to the proposed rules in this paper, we traverse the graph bottom-up from the terminal nodes and mark the internal nodes in a parallel way. Therefore, we can compare all internal nodes and decide which nodes are modules. Eventually, an experiment is carried out to compare the linear and parallel algorithm, and the result shows that the proposed parallel algorithm is efficient on handling large-scale fault trees. (C) 2015 Elsevier Ltd. All rights reserved.
High accuracy surface modeling (HASM) has been proved to be a superior method for surface simulation compared to classical interpolation methods. However, the fact that HASM is time consuming combined with its depende...
详细信息
High accuracy surface modeling (HASM) has been proved to be a superior method for surface simulation compared to classical interpolation methods. However, the fact that HASM is time consuming combined with its dependence on its driving field restricts its application in large area problems. This research develops a modified HASM which can get rid of the driving field in the surface simulation, and the parallel version of the modified HASM is also proposed with the purpose of improving its computational efficiency. Light detection and ranging (LIDAR) data are used as an optimum constraint to construct digital elevation model (DEM). Tests show that the modified HASM can perform surface simulation successfully without the driving field. And it also shows that the simulation accuracy of the modified HASM is almost the same as the old HASM and the classical interpolation methods when the sampling rate is larger than 0.5 %, while the modified HASM shows significantly increased simulation accuracy as the sampling rate decreases. This characteristic indicates that the modified HASM no longer relies on the driving field in the surface simulation. And it also improves the simulation accuracy compared to the old HASM and the classical interpolation methods. Tests of parallel efficiency show that the master-slave mode used in the parallel algorithm obtains a satisfactory result, indicating that the HASM can be applied to surface simulation of large area problems. And it also shows that the modified HASM would have great potential where applied in high-resolution DEM and digital surface model (DSM) construction from LIDAR data.
The paper proposes parallel algorithms for solving digital filtering problems of different dimensions using modern universal computers. Theoretical estimates of the complexity and speedup are obtained, which confirm t...
详细信息
The paper proposes parallel algorithms for solving digital filtering problems of different dimensions using modern universal computers. Theoretical estimates of the complexity and speedup are obtained, which confirm the high efficiency of these algorithms. Some of the proposed parallel algorithms are implemented using computers with a multi-core processor, and real estimates of the speedup are obtained, which agree well with theoretical ones.
Dynamic (Temporal) graphs capture the valuable evolution of real-world systems, from the continuously evolving patterns of social interactions and genetic pathways to the dynamic fluctuations of economic forces. Detec...
详细信息
Dynamic (Temporal) graphs capture the valuable evolution of real-world systems, from the continuously evolving patterns of social interactions and genetic pathways to the dynamic fluctuations of economic forces. Detecting communities for such evolving networks poses unique challenges. Detecting and analyzing the evolution of communities within dynamic graphs unlocks valuable insights into the underlying structural and temporal patterns of real-world systems. However, the sheer volume of modern graph data and the inherent complexity of the temporal dimension pose significant challenges to scalable community detection algorithms. Addressing this gap, our work explores the limited landscape of scalable distributed-memory parallel methods specifically designed for dynamic network community detection. We propose a novel parallel algorithm, DyG-DPCD (Dynamic Graph Distributed parallel Community Detection), to detect communities in dynamic networks using the Message Passing Interface (MPI) framework. We present a vertex-centric approach, allowing us to detect communities through local optimization. Furthermore, we enhance our baseline algorithm by incorporating three heuristics, which improve the algorithm's performance significantly while maintaining the quality of the solutions. We demonstrate the efficiency of our algorithm by experimenting on several real-world large-scale networks with hundreds of millions of edges spanning diverse domains. Notably, DyG-DPCD achieves speedups between 25x\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$25\times$$\end{document} and 30x\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}
In a minimum general partial dominating set problem (MinGPDS), given a graph G = ( V, E), a profit function p : V -> R+ and a threshold K, the goal is to find a minimum subset of vertices D subset of V such that th...
详细信息
In a minimum general partial dominating set problem (MinGPDS), given a graph G = ( V, E), a profit function p : V -> R+ and a threshold K, the goal is to find a minimum subset of vertices D subset of V such that the total profit of those vertices dominated by D is at least K(a vertex is dominated by D if it is either in D or has at least one neighbor in D). In a maximum general budgeted dominating set problem (MaxGBDS), given a budget B, the goal is to find a vertex set D with at most B vertices such that the total profit of those vertices dominated by D is as large as possible. We present the first parallel algorithms for MinGPDS and MaxGBDS in unit disk graphs. They both run in O(logn) rounds on O(n) machines, and achieve constant approximation ratios. (c) 2022 Elsevier B.V. All rights reserved.
The max-min ant system (MMAS) algorithm has found extensive application in tackling combinatorial optimization challenges such as the traveling salesman problem (TSP), production scheduling, and quadratic assignment. ...
详细信息
The max-min ant system (MMAS) algorithm has found extensive application in tackling combinatorial optimization challenges such as the traveling salesman problem (TSP), production scheduling, and quadratic assignment. Nevertheless, as the scale of the problem increases, the MMAS algorithm gradually encounters performance limitations. To address the performance constraints of MMAS, we propose a parallel max-min ant system (PMMAS) algorithm, where a master subpopulation coordinates multiple subpopulations in parallel search. Furthermore, to facilitate the parallel acceleration of computationally intensive tasks in PMMAS using the CPE array of the SW26010-Pro processor, the selection weight calculation equation in the traditional MMAS algorithm was improved. This improvement led to the introduction of the Sunway parallel max-min ant system (SWPMMAS) algorithm, which implements parallelism using MPI and Athread. The revised selection weight calculation equation is also applicable to the traditional MMAS algorithm and enhances its running speed. Finally, the SWPMMAS algorithm was evaluated using various TSP instances, with city counts ranging from 51 to 11,849. The results demonstrate that the SWPMMAS algorithm provides excellent solutions. For TSP instances with more than 10,000 cities, the SWPMMAS algorithm achieves over 13x\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times$$\end{document} speedup compared to the PMMAS algorithm running on the Sunway architecture and 5.4x\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\times$$\end{document} speedup compared to the PMMAS algorithm running on a commercial Shanh
暂无评论