Search results - Inner Mongolia University Library

parallel adaptive large neighborhood search based on spark to solve VRPTW

SCIENTIFIC REPORTS, 2024, Vol. 14, Issue 1, pp. 1-14

Authors: Liu, Songzuo; Sun, Jian; Duan, Xiaohong; Liu, Guofang (Shandong Huayu Univ Technol, Fac Informat Engn, Dezhou 253034, Peoples R China; North Minzu Univ, Coll Comp Sci & Engn, Yinchuan 750021, Peoples R China)

To address the multi-objective vehicle routing problem with time windows (VRPTW), a Spark-based parallel Adaptive Large Neighborhood Search algorithm (Spark-ALNS) is proposed. The design rests on four strategies: (1) a new simulated annealing cooling strategy that helps the search escape local optima; (2) CW initialization to accelerate convergence; (3) three destruction operators and three repair operators for local path optimization; (4) a new parallel strategy that improves the algorithm's accuracy and reduces the running time. The algorithm's effectiveness is illustrated on the Solomon benchmark instances. The experimental results show that the proposed Spark-ALNS finds better solutions, reaches the known optimal solutions for 41 out of 56 instances, and finds new optimal solutions for 31 algorithms, outperforming other evolutionary algorithms. The runtime is 3-5 times better than that of other parallel algorithms, and the method solves VRPTW effectively.

Keywords: Adaptive large neighborhood search algorithm; Cooling strategy; parallel algorithm; Spark; VRPTW


The parallel algorithm for Solving Toeplitz Systems


JOURNAL OF INTERDISCIPLINARY MATHEMATICS, 2015, Vol. 18, Issue 4, pp. 449-457

Author: Liu, Chengzhi (Hunan Univ Humanities Sci & Technol, Dept Math, Loudi 417000, Peoples R China)

This paper studies a parallel algorithm for real Toeplitz systems based on block Jacobi iteration and the GMRES method. The algorithm requires fewer floating-point operations, converges quickly, and is especially suitable for parallel computation. We first use the block Jacobi iterative method to obtain the iterative process, and then nest the GMRES method to obtain the iterative sequence {x(k)}. In this way, a parallel algorithm for solving symmetric positive definite Toeplitz systems is constructed. The convergence of the algorithm is also briefly discussed. Finally, some numerical examples illustrate the effectiveness of the parallel algorithm.

Keywords: Toeplitz matrix; block Jacobi iteration; generalized minimum residual method; parallel algorithm


The Pre-processing parallel algorithm of A Sparse Linear Equation Group


International English Education Research, 2015, Issue 1, pp. 96-98

Author: Cao Ying (Qinhuangdao branch school, Northeast Petroleum University, Qinhuangdao, Hebei 066004)

Solving systems of linear equations has applications in oil exploration, structural vibration analysis, computational fluid dynamics, and other fields. In-depth analysis of large or very large complicated structures requires parallel algorithms running on high-performance computers. This paper introduces the process of implementing parallel computation for sparse systems of linear equations.

Keywords: Sparse Linear Equations; Pre-processing; parallel algorithm


Space-efficient computation of parallel approximate string matching


JOURNAL OF SUPERCOMPUTING, 2023, Vol. 79, Issue 8, pp. 9093-9126

Authors: Sadiq, Muhammad Umair; Yousaf, Muhammad Murtaza (Govt Coll Univ, Dept Comp Sci, Lahore 54000, Pakistan; Univ Punjab, Fac Comp & Informat Technol, Lahore 54000, Pakistan; Univ Punjab, Fac Comp & Informat Technol, Dept Software Engn, Lahore 54000, Pakistan)

Approximate string matching (ASM) has applications in many disciplines, ranging from information retrieval to gene matching. The conventional solution to this problem is a dynamic programming strategy with quadratic space and time complexity. This complexity makes it impractical to search for queries in huge sequences of billions of characters. Many studies have therefore proposed improvements to the space and time requirements of the basic solution, including heuristic, filtration, and index-based solutions. These existing solutions obtain better performance by compromising on the completeness of the search. In this paper, we propose a linear-space algorithm for the approximate string matching problem that retains the time complexity of the conventional solution. The proposed method works in linear space without omitting any regions of the given text; hence, it finds all possible matches. The conventional dynamic programming solution is modified so that storing the complete traceback table is avoided by keeping only a running count of each edit operation in memory. A variety of laws and facts about the classical dynamic programming table are discovered in this regard. We also present a parallel version of the proposed algorithm to improve its running time. The algorithm is evaluated on CUDA-enabled GPUs. DNA sequences of sizes between 250 and 970 MBP are used for evaluation. Experiments on natural language text are also performed to highlight the broader applicability of the proposed algorithm. Results show the substantial superiority of the algorithm in terms of performance and scalability compared to state-of-the-art algorithms.

Keywords: Approximate string matching; Dynamic programming; parallel algorithm; Performance evaluation; GPUs; OpenMP
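
The linear-space idea mentioned in this abstract can be illustrated with the classical column-wise dynamic programming scheme (often attributed to Sellers), which keeps a single column of the table instead of the whole traceback matrix. This is a minimal sketch of that textbook technique, not the paper's algorithm; the function name `approx_match_ends` and its return format are assumptions made for illustration.

```python
def approx_match_ends(pattern: str, text: str, k: int) -> list[tuple[int, int]]:
    """Return (end_index, distance) for each text position where some
    substring ending there is within edit distance k of the pattern.

    Hypothetical helper: only one DP column of length len(pattern) + 1
    is kept, so the working space is linear in the pattern length.
    """
    m = len(pattern)
    # col[i] = min edit distance between pattern[:i] and any substring
    # of the text that ends at the current text position.
    col = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        prev_diag = col[0]  # cell (i-1) of the previous column
        col[0] = 0          # a match may start anywhere in the text
        for i in range(1, m + 1):
            prev_row = col[i]  # cell i of the previous column
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         prev_row + 1,      # extra character in the text
                         col[i - 1] + 1)    # pattern character left out
            prev_diag = prev_row
        if col[m] <= k:
            hits.append((j, col[m]))
    return hits


print(approx_match_ends("abc", "xxabcxx", 1))
```

Every text position is scanned, so no region is skipped; what is given up compared with the full table is the alignment itself, since only distances and end positions are reported.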


GPU Based High Definition parallel Video Codec Optimization in Mobile Device


IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, Vol. 22, Issue 6, pp. 3333-3349

Authors: Su, Baichuan; Cheng, Bo; Chen, Junliang (Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China)

With the explosive growth of intelligent devices and the rapid development of wireless network communication technology, most people prefer to use video applications on smart devices. However, using video codec technology on mobile devices faces three main challenges: 1) the explosive growth of multimedia applications has made the allocation of computing resources an important issue; 2) power consumption is high and battery power is limited; 3) high CPU utilization makes the system unresponsive. In this paper, a GPU based High Definition parallel Video Codec (GHPVC) is proposed, which is a low-energy, high-efficiency video codec for mobile devices. First, a Frame Data Management model and a Prediction Model Selector model are proposed to achieve higher data transmission efficiency and parallel execution efficiency. Second, a GPU based parallel ME module is proposed, because the ME module is the most power-consuming and computationally intensive module in a video codec. The GHPVC conforms to the H.264 standard and is experimentally evaluated on different GPU devices across different mobile devices. Experimental results show that, compared with the existing H.264 scheme, the proposed GHPVC not only significantly improves codec performance but also effectively reduces energy consumption and CPU utilization.

Keywords: Graphics processing unit; video codec; OpenCL; parallel algorithm; heterogeneous platform


Parallel Differential Evolution Algorithm Accelerated by Graphics Processing Unit for Harmonic Minimization in Power Converters


13th IEEE Energy Conversion Congress and Exposition (IEEE ECCE)

Authors: Ren, Kaiqi; He, Fei; Li, Zhaoyuan; Yang, Kehu (China Univ Min & Technol, Sch Mech Elect & Informat Engn, Beijing, Peoples R China)

ISBN: (print) 9781728151359

Intelligent optimization algorithms, such as the genetic algorithm (GA) and particle swarm optimization (PSO), have been widely used for harmonic minimization in power converters. However, these algorithms usually evaluate the fitness function for hundreds or even more population members, which leads to a huge computing burden and memory consumption. Hence, in practical applications, it is difficult to implement these algorithms in real time on traditional central processing units (CPUs). On the other hand, although the population is large, each member performs the same calculations independently, which is very suitable for parallelization on graphical processing units (GPUs). In this paper, a parallel version of differential evolution (DE) on GPUs is proposed. Compared to the traditional CPU-based DE algorithm, the GPU-based parallel DE algorithm executes hundreds of times faster in solving the harmonic minimization problem. Some computational results of 21 switching angles for three-level inverters are given. Guidelines for parameter selection in algorithm parallelization, such as the population size and the grid allocation, are also discussed and summarized.

Keywords: Differential evolution (DE); graphical processing unit (GPU); multilevel inverter; parallel algorithm


Efficient parallel algorithms for Computing Percolation Centrality


28th Annual IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC)

Authors: Chandramouli, Athreya; Jana, Sayantan; Kothapalli, Kishore (Int Inst Informat Technol, Ctr Secur Theory & Algorithm Res, Hyderabad 500032, India)

ISBN: (print) 9781665410168

Centrality measures on graphs have found applications in a large number of domains, including modeling the spread of an infection or disease, social network analysis, and transportation networks. As a result, parallel algorithms for computing various centrality metrics on graphs have been gaining significant research attention in recent years. In this paper, we study parallel algorithms for the percolation centrality measure, which extends the betweenness-centrality measure by associating a time-dependent state variable with every node. We present parallel algorithms that compute the source-based and source-destination variants of the percolation centrality values of nodes in a network. Our algorithms extend the algorithm of Brandes, introduce optimizations aimed at exploiting the structural properties of graphs, and extend the algorithmic techniques introduced by Sariyuce et al. [26] in the context of centrality computation. Experimental studies of our algorithms on an Intel Xeon(R) Silver 4116 CPU and an Nvidia Tesla V100 GPU on a collection of 12 real-world graphs indicate that our algorithmic techniques offer a significant speedup.

Keywords: articulation vertex; backward pass; betweenness centrality; betweenness centrality value; biconnected component; brande algorithm; centrality measure; centrality value; destination percolation centrality; gpu vs cpu; nvidia tesla v100; parallel algorithm; Percolation centrality; percolation centrality measure; percolation centrality score; percolation centrality value; percolation state; scaling term; shortest path; sided dependency; social network; source based percolation centrality; source based version; source destination; source destination based percolation; source destination percolation; source destination version; run time; suffix sum; update rule
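
For orientation, the algorithm of Brandes that this abstract builds on can be sketched for an unweighted graph as a forward breadth-first search followed by a backward accumulation pass. The sketch below is plain betweenness centrality only: it carries no percolation states and none of the paper's optimizations, and the adjacency-dictionary input and the name `brandes_betweenness` are assumptions made for illustration.

```python
from collections import deque


def brandes_betweenness(adj: dict) -> dict:
    """Unnormalized betweenness centrality of an unweighted graph.

    adj maps each node to a list of its neighbors (hypothetical input
    format). For an undirected graph every pair is counted in both
    directions, so halve the values if per-pair counts are wanted.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Forward pass: BFS from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(brandes_betweenness(path))
```

Each iteration of the outer loop over sources `s` touches only its own `sigma`, `dist`, `preds`, and `delta`, which is the kind of independence that source-parallel implementations typically exploit.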


A parallel closed centrality algorithm for complex networks


2nd International Informatics and Software Engineering Conference (IISEC) - Artificial Intelligence for Digital Transformation

Author: Ereiyes, Kayhan (Maltepe Univ, Software Engn Dept, Istanbul, Turkey)

ISBN: (print) 9781665407595

Complex networks are large, and their analysis requires significantly different methods than the analysis of small networks. Parallel processing is needed to analyze these networks in a timely manner. Graph centrality measures provide convenient methods to assess the structure of these networks. We review the main centrality algorithms, describe an implementation of closed centrality in Python, propose a simple parallel algorithm for closed centrality, and show its implementation in Python with the obtained results.

Keywords: complex network; closeness centrality; betweenness centrality; parallel algorithm
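
As a rough illustration of the kind of computation this abstract describes, closeness centrality (the measure named in the keywords) can be computed with one breadth-first search per node, and those searches can be handed to a worker pool. This is a sketch under assumptions, not the paper's implementation: the function names, the adjacency-dictionary input, and the use of a thread pool are all choices made here.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def closeness(adj: dict, source) -> float:
    """Closeness of `source`: reachable nodes divided by total BFS distance."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


def parallel_closeness(adj: dict, workers: int = 4) -> dict:
    """Run one BFS per node on a pool of workers (hypothetical helper)."""
    nodes = list(adj)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda s: closeness(adj, s), nodes))
    return dict(zip(nodes, scores))


path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(parallel_closeness(path))
```

In CPython a thread pool mainly shows how the work is decomposed; for real CPU-bound speedup a process pool such as `concurrent.futures.ProcessPoolExecutor`, with a module-level worker function instead of the lambda, would be the more likely choice.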


PI-sqrt: novel parallel implementations of in-place sequence rotation on multicore systems


CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2023, Vol. 26, Issue 1, pp. 539-557

Authors: Hashem, Mervat; Li, Kenli; Salah, Ahmad (Hunan Univ, Coll Informat Sci & Engn, Changsha, Peoples R China; Zagazig Univ, Fac Comp & Informat, Zagazig, Egypt; Univ Technol & Appl Sci, Dept Informat Technol, CAS Ibri, Muscat, Oman)

Huge data volumes and the emergence of new parallel architectures, e.g. multicore CPUs, have led to revisiting classic computer science topics such as in-place sequence rotation. In-place sequence rotation is a basic step in several fundamental computing tasks. The sequential algorithms for in-place sequence rotation are classic and well studied, and fall into three classes. Recently, Intel introduced a parallel standard template library (STL) implementation for multicore CPU systems; it has a rotation function based on rotation by copy, but its space complexity is O(n). In this work, we propose the blend rotation, a parallel-friendly, in-place algorithm that combines the merits of these three rotation algorithm classes. In addition, we propose a set of parallel In-place SeQuence RoTation (PI-sqrt) implementations. The performance of PI-sqrt is examined through several experiments. To the best of our knowledge, the obtained running times show that the implementations of the blend and reversal rotations are by far the fastest parallel implementations; across different experiments they are faster on average by 7.85x and 5.52x, respectively, than the parallel rotation function of Intel parallel STL.

Keywords: parallel algorithm; In-place algorithm; Sequence rotation; Multicore CPU; PI-sqrt
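
Of the rotation classes this abstract refers to, the reversal rotation is the simplest to write down: a left rotation by k is obtained by reversing the first k elements, reversing the rest, and then reversing the whole sequence. The sketch below is the sequential textbook version with O(1) extra space, not the PI-sqrt implementation; the function names are assumptions made for illustration.

```python
def reverse_range(seq: list, lo: int, hi: int) -> None:
    """Reverse seq[lo:hi] in place by swapping mirrored positions."""
    hi -= 1
    while lo < hi:
        seq[lo], seq[hi] = seq[hi], seq[lo]
        lo += 1
        hi -= 1


def rotate_left(seq: list, k: int) -> None:
    """Rotate seq left by k positions in place using three reversals."""
    n = len(seq)
    if n == 0:
        return
    k %= n
    reverse_range(seq, 0, k)  # reverse the prefix of length k
    reverse_range(seq, k, n)  # reverse the remaining suffix
    reverse_range(seq, 0, n)  # reverse everything


data = [1, 2, 3, 4, 5, 6, 7]
rotate_left(data, 3)
print(data)  # [4, 5, 6, 7, 1, 2, 3]
```

The swaps inside a single reversal touch disjoint pairs of positions, so they do not depend on each other; that independence is one reason this class lends itself to a parallel implementation.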


A parallel finite element method based on fully overlapping domain decomposition for the steady-state Smagorinsky model


COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2023, Vol. 147, Issue 1, pp. 76-91

Authors: Zheng, Bo; Shang, Yueqiang (Southwest Univ, Sch Math & Stat, Chongqing 400715, Peoples R China)

An efficient parallel finite element method is introduced for solving the steady-state Smagorinsky model, in which a fully overlapping domain decomposition is used for parallelization. The crucial idea of the method is to use a locally refined multiscale mesh, fine around its own subdomain and coarse elsewhere, to calculate a local finite element solution. Built on an existing Smagorinsky solver, the introduced method is easy to implement and avoids massive recoding. Using a duality argument, errors of the standard finite element approximations are derived for the velocity in the L^2 norm and the pressure in the H^{-1} norm. Error bounds for the solutions from the introduced method are estimated. Moreover, four parallel iterative algorithms are presented, and results of numerical tests are given to verify the theoretical predictions and demonstrate the effectiveness of the algorithms. It is shown numerically that the parallel algorithms substantially decrease the CPU time while keeping the accuracy of the solutions comparable to that of the serial algorithm.

Keywords: Smagorinsky model; Finite element; parallel algorithm; Domain decomposition; Error estimate

