Search Results - Inner Mongolia University Library

Solving Time-Fractional reaction-diffusion systems through a tensor-based parallel algorithm

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2023, Vol. 611

Authors: Cardone, Angelamaria; De Luca, Pasquale; Galletti, Ardelio; Marcellino, Livia. Affiliations: Univ Salerno Dept Math Via Giovanni Paolo 2n 132 I-84084 Fisciano Italy; Parthenope Univ Naples Dept Sci & Technol Ctr Direzionale C4 I-80143 Naples Italy; Parthenope Univ Naples UNESCO Chair Environm Resources & Sustainable Dev Dept Sci & Technol Int Ph D Programme Ctr Direzionale Isola C4 I-80143 Naples Italy

Machine Learning (ML) approach is a discussed research topic because of its benefit in several research fields. The most important issues in the training process of ML are accuracy and speed: a suitable mathematical model is critical and a fast data processing is mandatory. Fractional Calculus is involved in a large number of important applications and, recently, many ML algorithms, in order to improve accuracy of results when performing training in solving optimization problems, are based on decision and control performed by means of time-fractional models to better understand complex systems. However, the high computational cost, which characterizes the numerical solution, of this approach might be a problem for large scale Machine Learning systems. High Performance Computing (HPC) is the way of addressing the need of real time computation. In fact, through tensor-based parallel strategies designed for modern parallel architectures, Fractional Calculus tools are very helpful for the ML training step. In this context, we consider a time-fractional diffusion system and, after introducing a suitable modification of a numerical model to solve it, we propose a related and novel parallel implementation on GPUs (Graphics Processing Units). Experiments show the gain of performance in terms of execution time and accuracy of our parallel implementation. (c) 2023 Elsevier B.V. All rights reserved.

Keywords: Machine Learning; Reaction-diffusion systems; Numerical methods; Parallel algorithm; GPU computing

An efficient parallel algorithm for 3D magnetotelluric modeling with edge-based finite element

COMPUTATIONAL GEOSCIENCES, 2021, Vol. 25, No. 1, pp. 1-16

Authors: Zhu, Xiaoxiong; Liu, Jie; Cui, Yian; Gong, Chunye. Affiliations: Natl Univ Def Technol Sci & Technol Parallel & Distributed Proc Lab Changsha 410073 Peoples R China; Natl Univ Def Technol Lab Software Engn Complex Syst Changsha 410073 Peoples R China; Cent South Univ Sch Geosci & Infophys Changsha 410083 Peoples R China

A three-dimensional magnetotelluric modeling algorithm of high accuracy and high efficiency is required for data interpretation and inversion. In this paper, the edge-based finite element method with unstructured mesh is used to solve the 3D magnetotelluric problem. Two boundary conditions, the Dirichlet boundary condition and the Neumann boundary condition, are set for cross-validation and comparison. We propose an efficient parallel algorithm to speed up computation and improve efficiency. The algorithm is based on distributed matrix storage and has three levels of parallelism. The first two are process-level parallelization for frequencies and matrix solving, and the last is thread-level parallelization for loop unrolling. The algorithm is validated by several model studies. Scalability tests have been performed on two distributed-memory HPC platforms, one consisting of Intel E5-2660 microprocessors and the other of Phytium FT2000 Plus microprocessors. On the Intel platform, the computation time of our algorithm solving Dublin Test Model-1 with 3,756,373 edges at 21 frequencies is 365 s on 2520 cores. The speedup and efficiency are 1609 and 60% compared to 100 cores. On the Phytium platform, the scalability test shows that the speedup from 256 cores to 86,016 cores increases to 11,255.

Keywords: Magnetotelluric; Edge-based finite element; Unstructured mesh; Parallel algorithm

An inertial parallel algorithm for a finite family of G-nonexpansive mappings with application to the diffusion problem

ADVANCES IN DIFFERENCE EQUATIONS, 2021, Vol. 2021, No. 1, pp. 1-13

Authors: Charoensawan, Phakdi; Yambangwai, Damrongsak; Cholamjiak, Watcharaporn; Suparatulatorn, Raweerote. Affiliations: Chiang Mai Univ Adv Res Ctr Computat Simulat Chiang Mai 50200 Thailand; Chiang Mai Univ Fac Sci Dept Math Chiang Mai 50200 Thailand; Univ Phayao Sch Sci Phayao 56000 Thailand

For finding a common fixed point of a finite family of G-nonexpansive mappings, we implement a new parallel algorithm based on the Ishikawa iteration process with the inertial technique. We obtain the weak convergence theorem of this algorithm in Hilbert spaces endowed with a directed graph by assuming certain control conditions. Furthermore, numerical experiments on the diffusion problem demonstrate that the proposed approach outperforms well-known approaches.

Keywords: Fixed point problem; Diffusion problem; Parallel algorithm; G-nonexpansive mapping
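
The abstract of this record names its ingredients (an Ishikawa-type iteration, an inertial extrapolation term, and a finite family of mappings) without stating the scheme itself. As a rough illustration only, the Python sketch below applies a generic inertial Ishikawa step to a single nonexpansive map. The map `T = cos`, the step sizes `alpha` and `beta`, the inertial weight `theta`, and the function name `inertial_ishikawa` are all assumptions made for this sketch; it is not the authors' algorithm, and it does not cover the directed-graph setting or the parallel treatment of several mappings.

```python
import math

def inertial_ishikawa(T, x0, alpha=0.5, beta=0.5, theta=0.1, iters=200):
    """Generic inertial Ishikawa iteration for one map T (illustrative only)."""
    x_prev, x = x0, x0
    for _ in range(iters):
        # Inertial extrapolation: push the iterate along its last displacement.
        w = x + theta * (x - x_prev)
        # Inner Ishikawa step.
        y = (1 - beta) * w + beta * T(w)
        # Outer Ishikawa step; the old x becomes x_prev for the next round.
        x_prev, x = x, (1 - alpha) * w + alpha * T(y)
    return x

# cos is nonexpansive on the real line because |cos'(x)| = |sin(x)| <= 1.
x_star = inertial_ishikawa(math.cos, x0=1.0)
residual = abs(x_star - math.cos(x_star))  # how far x_star is from T(x_star)
print(residual)
```

With several mappings, steps of this kind could in principle be evaluated concurrently for each mapping; how the per-mapping results are combined is not stated in the abstract and is left open here.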

A parallel algorithm for ridge-penalized estimation of the multivariate exponential family from data of mixed types

STATISTICS AND COMPUTING, 2021, Vol. 31, No. 4, pp. 1-13

Authors: Trip, Diederik S. Laman; van Wieringen, Wessel N. Affiliations: Delft Univ Technol Kavli Inst Nanosci Dept Bionanosci Maasweg 9 NL-2629 HZ Delft Netherlands; Amsterdam UMC Dept Epidemiol & Data Sci Amsterdam Publ Hlth Res Inst Locat VUmc POB 7057 NL-1007 MB Amsterdam Netherlands; Vrije Univ Amsterdam Dept Math De Boelelaan 1081a NL-1081 HV Amsterdam Netherlands

Computationally efficient evaluation of penalized estimators of multivariate exponential family distributions is sought. These distributions encompass, among others, Markov random fields with variates of mixed type (e.g., binary and continuous) as a special case of interest. The model parameter is estimated by maximization of the pseudo-likelihood augmented with a convex penalty. The estimator is shown to be consistent. With a world of multi-core computers in mind, a computationally efficient parallel Newton-Raphson algorithm is presented for numerical evaluation of the estimator alongside conditions for its convergence. Parallelization comprises the division of the parameter vector into subvectors that are estimated simultaneously and subsequently aggregated to form an estimate of the original parameter. This approach may also enable efficient numerical evaluation of other high-dimensional estimators. The performance of the proposed estimator and algorithm is evaluated and compared in a simulation study. Finally, the presented methodology is applied to data of an integrative omics study.

Keywords: Markov random field; Consistency; Pseudo-likelihood; Block-wise Newton-Raphson; Network; Parallel algorithm; Graphical model

Research on a Parallel Algorithm for Video Image Compression of Transmission Line Inspection

Proceedings of the 2023 International Conference on Big Data Mining and Information Processing

Authors: Meihui Hu; Kai Li; Jiao Wan; Tao Chen; Zhiwei Xiang. Affiliations: State Grid Xinjiang Information & Telecommunication Company China; State Grid Xinjiang Electric Power Co. Ltd. China

ISBN: (print) 9798400709166

The unmanned aerial vehicle (UAV) has the advantages of simple operation, sensitive response, flexible flight, long battery life and low cost, and has become a conventional means of power inspection. However, the large volume of video data places a burden on the hardware at the data acquisition end of the system, so it is necessary to improve the sampling performance of the data acquisition end of the video compression system. In this paper, a parallel algorithm for video image compression of power transmission line inspection is proposed. By using the functions provided by MPI (Message Passing Interface), the search and matching process of image domain blocks and value domain blocks is distributed to multiple processors for simultaneous execution. The experimental results show that when only one computing node is used, the CPU utilization efficiency is very close when images with the same compression ratio are decompressed in the two parallel modes. As the number of computing nodes increases, the efficiency of the MPI parallel mode decreases gradually, while the efficiency of the MPI+OpenMP hybrid model increases. This study has reference and practical value for real-time processing of transmission line inspection data.

Keywords: Parallel algorithm; Transmission line inspection; Video image compression

A Parallel Algorithm to Construct Node-Independent Spanning Trees on the Line Graph of Locally Twisted Cube

12th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP)

Authors: Pan, Zhiyong; Cheng, Baolei; Fan, Jianxi; Zhang, Huanwen. Affiliation: Soochow Univ Sch Comp Sci & Technol Suzhou Peoples R China

ISBN: (print) 9781665496391

An interconnection network can be abstracted into a graph, and basic mathematical research on the graph can provide a good reference for research in practical applications. The study of node-independent spanning trees (node-ISTs) in a graph has received extensive attention because of their application in reliable communication, fault-tolerant broadcasting and secure message distribution, and has achieved remarkable results on many special networks. But there are few results on the line graphs of these networks. As one of the typical variations of the hypercube, the locally twisted cube has many excellent properties, and its line graph has all the advantages of the locally twisted cube. So it makes sense to do some research on the line graph of the locally twisted cube. In this paper, we propose a parallel algorithm to construct 2n - 2 node-ISTs rooted at node [u,N(u, 2)], where u is an arbitrary node on the locally twisted cube and n >= 1. The correctness of our algorithm is proved.

Keywords: Locally twisted cube; Line graph; Parallel algorithm; Node-independent spanning trees

A load balancing parallel algorithm for solving large-scale tridiagonal linear systems

International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI)

Authors: Tian, Min; Qiao, Shan; Wang, Junjie; Du, Wei. Affiliations: Qilu Univ Technol Shandong Acad Sci Natl Supercomp Ctr Jinan Shandong Comp Sci Ctr Jinan Peoples R China; Qilu Univ Technol Shandong Acad Sci Sch Math & Stat Jinan Peoples R China

ISBN: (digital) 9781510651890

ISBN: (print) 9781510651890; 9781510651883

Solving large-scale sparse linear systems is a critical problem in scientific and engineering computing. Partial differential equations can solve problems in many fields. They can be transformed into large-scale linear systems with a series of methods, and the parallel solution of tridiagonal linear systems is one of them. The solution of linear systems is very time-consuming in most of the problems, accounting for more than half of the total time. Load balancing reduces process waiting time and improves computational efficiency, and it is the focus of many algorithms. Building on Stone's recursive doubling algorithm, the article presents an improved algorithm for solving tridiagonal linear systems using the full-recursive-doubling communication model and the Mobius transform. The improved algorithm can handle million-dimensional linear systems. Numerical experiments show that, compared with ordinary parallel algorithms, the improved algorithm achieves up to a 2x improvement over the original version, and some results even show up to 3x. In addition, the load-balancing performance has been greatly improved, and the time difference between processes is 1/7 of that of the original version. The improved algorithm has good load balancing, and the running time of each process is not much different, avoiding process waiting and resource wastage.

Keywords: tridiagonal linear systems; parallel algorithm; load balancing; recursive doubling; Mobius transformation
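
The abstract of this record mentions recursive doubling and a Mobius transform without showing how the two fit together. One common way to see the link, sketched here in Python as an illustration and not as the authors' improved algorithm, is the pivot recurrence of tridiagonal LU factorization, u_i = b_i - a_i c_{i-1} / u_{i-1}. Each step is a Mobius map of u_{i-1} and can be stored as a 2x2 matrix, so all pivots follow from prefix products, and the prefix products can be formed in about log2(n) doubling rounds. The function names, the storage convention (`a[0]` and `c[n-1]` unused), and the test system are assumptions of this sketch.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices stored as ((p, q), (r, s))."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )

def pivots_sequential(a, b, c):
    """Serial recurrence u_i = b_i - a_i * c_{i-1} / u_{i-1}."""
    u = [b[0]]
    for i in range(1, len(b)):
        u.append(b[i] - a[i] * c[i - 1] / u[i - 1])
    return u

def pivots_recursive_doubling(a, b, c):
    """Same pivots via prefix products of Mobius-map matrices."""
    n = len(b)
    identity = ((1.0, 0.0), (0.0, 1.0))
    # Entry i encodes the map u -> (b_i * u - a_i * c_{i-1}) / u.
    P = [identity] + [((b[i], -a[i] * c[i - 1]), (1.0, 0.0)) for i in range(1, n)]
    step = 1
    while step < n:
        # Each entry depends only on the previous round, so on a parallel
        # machine the n products of one round could run concurrently.
        P = [mat_mul(P[i], P[i - step]) if i >= step else P[i] for i in range(n)]
        step *= 2
    # Apply the accumulated Mobius map to the first pivot u_0 = b_0.
    return [(m[0][0] * b[0] + m[0][1]) / (m[1][0] * b[0] + m[1][1]) for m in P]

# Small diagonally dominant test system (an assumption of this sketch).
n = 8
a = [0.0] + [-1.0] * (n - 1)   # sub-diagonal, a[0] unused
b = [4.0] * n                  # main diagonal
c = [-1.0] * (n - 1) + [0.0]   # super-diagonal, c[n-1] unused
seq = pivots_sequential(a, b, c)
par = pivots_recursive_doubling(a, b, c)
max_diff = max(abs(s - p) for s, p in zip(seq, par))
print(max_diff)
```

The entries of the unnormalized matrix products grow with n, so a production code would need rescaling for large systems; the sketch omits this, along with the communication model and load balancing that the abstract is about.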

A Parallel Algorithm to Compute the Transport of Suspended Particles Based on 3D Models

15th International Scientific Conference on Parallel Computational Technologies (PCT)

Authors: Atayan, Asya M.; Kuznetsova, Inna Yu. Affiliations: Don State Tech Univ Rostov Na Donu Russia; Sirius Sci & Technol Univ Soci Russia; Southern Fed Univ Rostov Na Donu Russia

ISBN: (print) 9783030816919; 9783030816902

We consider 2D and 3D models dealing with the transport of suspended particles. The approximation of 2D and 3D models that describe the transport of suspended particles is considered on the example of the two-dimensional diffusion-convection equation. We use discrete analogs of convective and diffusion transfer operators on the assumption of partial filling of cells. The geometry of the computational domain is described based on the filling function. We solve the problem of transport of suspended particles using a difference scheme that is a linear combination of the Upwind and the Standard Leapfrog difference schemes with weight coefficients obtained from the condition of minimization of the approximation error. The scheme is designed to solve the problem of transfer of impurities for large grid Peclet numbers. We have developed some parallel algorithms for the solution of this problem on multiprocessor systems with distributed memory. The results of numerical experiments give us grounds to draw conclusions about the advantages of 3D models of transport of suspended particles over 2D ones.

Keywords: Model of transport of suspended particles; Upwind Leapfrog scheme; Partial filling of cells; Three-dimensional model; Convection-diffusion equation; Parallel algorithm

Design and Optimization of Parallel Algorithm for Kalman Filter on SW26010 Many-Core Processors

JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2022, Vol. 31, No. 4

Author: Yang, Aiqiang. Affiliation: Hunan Agr Univ Coll Informat & Intelligence Changsha 410128 Hunan Peoples R China

The Kalman filter algorithm, an effective data processing algorithm, has been widely used in space monitoring, wireless communications, tracking systems, the financial industry, big data and so on. On the Sunway TaihuLight platform, we present an optimized parallel Kalman filter algorithm designed for the new architecture of the SW26010 many-core processors (260 cores) and the new programming mode (master and slave heterogeneous collaboration mode). Furthermore, we propose a pipelined parallel mode for the Kalman filter algorithm based on the seven-level pipeline of the SW26010 processor. A vector optimization strategy and double buffering mechanisms are provided to improve the parallel efficiency of the parallel Kalman filter algorithm on SW26010 processors. The vector optimization strategy can improve data concurrency in parallel computing. In addition, the communication time can be hidden by the double buffering mechanisms of SW26010 processors. The experimental results show that the performance and scalability of the parallel Kalman filter algorithm based on SW26010 are greatly improved compared with the CPU algorithm for five data sets, and are also improved compared to the algorithm on GPU.

Keywords: Heterogeneous computing; Kalman Filter; Parallel algorithm; Many-core processors

A Scalable Parallel Algorithm for 3-D Magnetotelluric Finite Element Modeling in Anisotropic Media

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, Vol. 60

Authors: Zhu, Xiaoxiong; Liu, Jie; Cui, Yian; Gong, Chunye. Affiliations: Natl Univ Def Technol Sci & Technol Parallel & Distributed Proc Lab Lab Software Engn Complex Syst Changsha 410073 Peoples R China; Cent South Univ Sch Geosci & Info Phys Changsha 410083 Peoples R China

3-D magnetotelluric (MT) forward modeling has always been faced with the problems of high memory requirements and long computing time. In this article, we design a scalable parallel algorithm for 3-D MT finite element modeling in anisotropic media. The parallel algorithm is based on distributed mesh storage, includes multiple parallel granularities, and is implemented through multiple tools. Message-passing interface (MPI) is used to exploit process parallelism for subdomains, frequencies, and solving equations. Thread parallelism for merge sorting, element analysis, matrix assembly, and imposing Dirichlet boundary conditions is developed with Open Multi-Processing (OpenMP). We validate the algorithm through several model simulations and study the effects of topography and conductivity anisotropy on apparent resistivities and phase responses. Scalability tests are performed on the Tianhe-2 supercomputer to analyze the parallel performance of different parallel granularities. Three parallel direct solvers, Supernodal LU (SUPERLU), MUltifrontal Massively Parallel sparse direct Solver (MUMPS), and Parallel Sparse matriX package (PASTIX), are compared in solving sparse systems of equations. As a result, reasonable parallel parameters are suggested for practical applications. The developed parallel algorithm is proven to be efficient and scalable.

Keywords: Finite element analysis; Mathematical model; Parallel algorithms; Memory management; Sparse matrices; Conductivity; Computational modeling; Conductivity anisotropy; finite element method (FEM); magnetotelluric (MT); parallel algorithm
