Search Results - Inner Mongolia University Library

Parallel strategies for Direct Multisearch

Numerical Algorithms, 2023, vol. 92, no. 3, pp. 1757-1788

Authors: Tavares, S.; Bras, C. P.; Custodio, A. L.; Duarte, V.; Medeiros, P. Affiliations: FCT NOVA, Campus Caparica, P-2829516 Caparica, Portugal; FCT NOVA, CMA, Dept Math, Campus Caparica, P-2829516 Caparica, Portugal; FCT NOVA, NOVA LINCS, Dept Comp Sci, Campus Caparica, P-2829516 Caparica, Portugal

Direct multisearch (DMS) is a class of derivative-free optimization algorithms, suited for computing approximations to the complete Pareto front of a given multiobjective optimization problem. In the DMS class, constraints are addressed with an extreme barrier approach, which evaluates only feasible points. DMS has a well-supported convergence analysis, and simple implementations show good numerical performance, both on academic test sets and in real applications. Recently, this numerical performance was improved with the definition of a search step based on the minimization of quadratic polynomial models, corresponding to the algorithm BoostDMS. In this work, we propose and numerically evaluate strategies to improve the performance of BoostDMS, mainly through parallelization applied to the search and poll steps. The final parallelized version not only considerably decreases the computational time required for solving a multiobjective optimization problem, but also increases the quality of the computed approximation to the Pareto front. Extensive numerical results are reported on an academic test set and in a chemical engineering application.

Keywords: Multiobjective optimization; Derivative-free optimization; Direct search methods; parallel algorithms
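
The abstract above refers to approximating a Pareto front. As a minimal sketch of what that means (not the DMS or BoostDMS algorithm itself), the following filter keeps only the nondominated points of a finite list, assuming every objective is minimized. The function names `dominates` and `nondominated` are hypothetical and chosen for illustration.

```python
def dominates(a, b):
    """True if point a is no worse than b in every objective and strictly
    better in at least one (all objectives are assumed to be minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points):
    """Return the points that no other point in the list dominates,
    preserving their original order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    candidates = [(1, 4), (2, 2), (3, 3), (4, 1)]
    print(nondominated(candidates))
```

The filter is quadratic in the number of points, which is acceptable for small lists of evaluated points; each dominance check is independent of the others, so the outer loop could be distributed across workers.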

Local and parallel stabilized finite element methods based on two-grid discretizations for the Stokes equations

Numerical Algorithms, 2023, vol. 93, no. 1, pp. 67-83

Authors: Wang, Xinhui; Du, Guangzhi. Affiliation: Shandong Normal Univ, Sch Math & Stat, Jinan 250014, Peoples R China

Based on two-grid discretizations, some local and parallel stabilized finite element methods are proposed and investigated for the Stokes problem in this paper. For the finite element discretization, the lowest equal-order finite element pairs are chosen to circumvent the discrete inf-sup condition. In these algorithms, we derive the low-frequency components of the solution for the Stokes problem on a coarse grid and catch the high-frequency components on a fine grid using some local and parallel procedures. Optimal error bounds are demonstrated and some numerical experiments are carried out to support theoretical results.

Keywords: Stokes equations; Stabilized finite element method; Two-grid discretizations; parallel algorithms; Partition of unity

Exploiting Julia for Parallel RBF-Based 3D Surface Reconstruction: A First Experience

32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Authors: De Luca, Pasquale; Galletti, Ardelio; Marcellino, Livia; Pianese, Mario. Affiliation: Parthenope Univ Naples, Sci & Technol Dept, Naples, Italy

ISBN: (print) 9798350363074; 9798350363081

3D surface reconstruction is critical for various applications and demands efficient computational approaches. Traditional Radial Basis Function (RBF) methods scale poorly as the number of data points grows, leading to slower execution times. To address this, our study introduces an experimental parallelization effort using Julia, a language well known for high-performance scientific computing. We developed an initial sequential RBF algorithm in Julia, then extended it to a parallel model, exploiting multi-threading to improve execution speed while maintaining accuracy. This initial exploration of Julia's parallel computing capabilities shows marked performance gains in 3D surface reconstruction, offering promising directions for future research. Our findings affirm Julia's potential in computationally intensive tasks, with test results confirming the expected time efficiency improvements.

Keywords: surface reconstruction; radial basis functions; parallel algorithms; Julia programming

Parallel-in-Time Power System Simulation Using a Differential Transformation Based Adaptive Parareal Method

IEEE Open Access Journal of Power and Energy, 2023, vol. 10, pp. 61-72

Authors: Liu, Yang; Park, Byungkwon; Sun, Kai; Dimitrovski, Aleksandar; Simunovic, Srdjan. Affiliations: Univ Tennessee, Dept EECS, Knoxville, TN 37996, USA; Soongsil Univ, Dept Elect Engn, Seoul 06978, South Korea; Univ Cent Florida, Dept ECE, Orlando, FL 32816, USA; Oak Ridge Natl Lab, Computat Sci & Engn Div, Oak Ridge, TN 37830, USA

For parallel-in-time simulation of large-scale power systems, this paper proposes a differential transformation based adaptive Parareal method for significantly improved convergence and time performance compared to a traditional Parareal method, which iterates a sequential, numerical coarse solution over extended time steps to connect parallel fine solutions within respective time steps. The new method employs the differential transformation to derive a semi-analytical coarse solution of power system differential-algebraic equations, by which the order and time step, as well as the window length with a multi-window solution strategy, can adaptively vary with the response of the system. Thus, the new method can reduce divergences and also speed up the overall simulation. Extensive tests on the IEEE 39-bus system and the Polish 2383-bus system have verified the performance of the proposed method.

Keywords: Computational modeling; Mathematical models; Adaptation models; Power system stability; Convergence; Numerical models; Power system dynamics; Differential transformation; Parareal; parallel algorithms; power system simulation
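
The abstract above describes the traditional Parareal scheme as a sequential coarse solution that connects parallel fine solutions within their time windows. The following is a minimal sketch of that traditional scheme only, assuming a scalar ODE and explicit Euler for both propagators; it does not implement the paper's differential transformation or its adaptive windows. The names `euler` and `parareal` and the parameter `fine_steps` are hypothetical.

```python
def euler(f, y, t0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 with explicit Euler."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t0, t1, windows, iters, fine_steps=100):
    """Classic Parareal: coarse = one Euler step per window,
    fine = fine_steps Euler steps per window."""
    ts = [t0 + (t1 - t0) * i / windows for i in range(windows + 1)]

    def coarse(y, a, b):
        return euler(f, y, a, b, 1)

    def fine(y, a, b):
        return euler(f, y, a, b, fine_steps)

    # Initial guess: one sequential coarse sweep over all windows.
    u = [y0]
    for i in range(windows):
        u.append(coarse(u[i], ts[i], ts[i + 1]))

    for _ in range(iters):
        # The fine solves depend only on the previous iterate, so each
        # window could run on its own processor (done serially here).
        f_old = [fine(u[i], ts[i], ts[i + 1]) for i in range(windows)]
        g_old = [coarse(u[i], ts[i], ts[i + 1]) for i in range(windows)]
        # Sequential correction sweep using the cheap coarse propagator.
        new = [y0]
        for i in range(windows):
            g_new = coarse(new[i], ts[i], ts[i + 1])
            new.append(g_new + f_old[i] - g_old[i])
        u = new
    return ts, u
```

With as many iterations as windows, the result coincides (up to rounding) with running the fine propagator sequentially, so the method only pays off when it converges in far fewer iterations.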

Performance Optimization of Parallel Algorithms

Journal of Communications and Networks, 2014, vol. 16, no. 4, pp. 436-446

Authors: Hudik, Martin; Hodon, Michal. Affiliation: Univ Zilina, Dept Tech Cybernet, Zilina 01026, Slovakia

The high intensity of research and modeling in mathematics, physics, biology and chemistry requires new computing resources. Because of the high computational complexity of such tasks, computing time is large and costly. The most effective way to increase efficiency is to adopt parallel principles. The purpose of this paper is to present the issue of parallel computing with emphasis on the analysis of parallel systems and the impact of communication delays on their efficiency and on overall execution time. The paper focuses on finite algorithms for solving systems of linear equations, namely matrix manipulation (Gauss elimination method, GEM). Algorithms are designed for architectures with shared memory (open multiprocessing, OpenMP), distributed memory (message passing interface, MPI) and their combination (MPI + OpenMP). The properties of the algorithms were determined analytically and verified experimentally. Conclusions are drawn for theory and practice.

Keywords: Collective communication operations; efficiency; Gauss elimination method; modeling; parallel algorithms; parallel architecture; parallel computation; performance prediction; pipelined broadcast; system of linear equations
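
The abstract above centers on the Gauss elimination method (GEM). As a reference point, here is a minimal sequential sketch with partial pivoting, assuming a small dense system stored as Python lists; it is not the paper's OpenMP or MPI implementation. The function name `gauss_solve` is hypothetical. The comment in the elimination loop marks the row updates that a parallel version would typically distribute.

```python
def gauss_solve(a, b):
    """Solve a x = b by Gauss elimination with partial pivoting.
    a is a list of n rows of length n, b a list of length n."""
    n = len(a)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]

    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        # The updates of rows k+1..n-1 are independent of each other;
        # this is the loop a shared-memory version would split across threads.
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]

    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x
```

In a distributed-memory setting the pivot row would additionally have to be broadcast to all processes at each step, which is where the communication delays discussed in the abstract enter.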

Work-Efficient Parallel Derandomization II: Optimal Concentrations via Bootstrapping

56th Annual ACM Symposium on Theory of Computing (STOC)

Authors: Ghaffari, Mohsen; Grunau, Christoph. Affiliations: MIT, Cambridge, MA 02139, USA; Swiss Fed Inst Technol, Zurich, Switzerland

ISBN: (print) 9798400703836

In this paper, we present an efficient parallel derandomization method for randomized algorithms that rely on concentrations such as the Chernoff bound. This settles a classic problem in parallel derandomization, which dates back to the 1980s. Concretely, consider the set balancing problem where $m$ sets of size at most $s$ are given in a ground set of size $n$, and we should partition the ground set into two parts such that each set is split evenly up to a small additive (discrepancy) bound. A random partition achieves a discrepancy of $O(\sqrt{s \log m})$ in each set, by the Chernoff bound. We give a deterministic parallel algorithm that matches this bound, using near-linear work $\tilde{O}(m + n + \sum_{i=1}^{m} |S_i|)$ and polylogarithmic depth $\mathrm{poly}(\log(mn))$. The previous results were weaker in discrepancy and/or work bounds: Motwani, Naor, and Naor [FOCS'89] and Berger and Rompel [FOCS'89] achieve discrepancy $s^{\varepsilon} \cdot O(\sqrt{s \log m})$ with work $\tilde{O}(m + n + \sum_{i=1}^{m} |S_i|) \cdot m^{\Theta(1/\varepsilon)}$ and polylogarithmic depth; the discrepancy was optimized to $O(\sqrt{s \log m})$ in later work, e.g. by Harris [Algorithmica'19], but the work bound remained prohibitively high at $\tilde{O}(m^4 n^3)$. Notice that these would require a large polynomial number of processors to even match the near-linear runtime of the sequential algorithm. Ghaffari, Grunau, and Rozhon [FOCS'23] achieve discrepancy $s/\mathrm{poly}(\log(nm)) + O(\sqrt{s \log m})$ with near-linear work and polylogarithmic depth. Notice that this discrepancy is nearly quadratically larger than the desired bound and barely sublinear with respect to the trivial bound of $s$. Our method is different from prior work. It can be viewed as a novel bootstrapping mechanism that uses crude partitioning algorithms as a subroutine and sharpens their discrepancy to the optimal bound. In particular, we solve the problem recursively, by using the crude partition in each iterat

Keywords: parallel algorithms; Derandomization
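
To make the set balancing problem in the abstract above concrete, the sketch below draws a uniformly random two-way partition and measures its discrepancy. This is only the randomized baseline that the abstract says meets the Chernoff-type bound, not the paper's deterministic parallel algorithm. The function names and the instance sizes are hypothetical, and the printed bound omits the constant hidden in the O-notation.

```python
import math
import random


def discrepancy(sets, coloring):
    """Largest imbalance over all sets, where coloring[x] is +1 or -1
    and the imbalance of a set is |sum of its elements' colors|."""
    return max(abs(sum(coloring[x] for x in members)) for members in sets)


def random_partition(n, rng):
    """Assign each of the n ground elements to side +1 or -1 uniformly."""
    return [rng.choice((-1, 1)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n, m, s = 200, 50, 40  # ground set size, number of sets, set size
    sets = [rng.sample(range(n), s) for _ in range(m)]
    coloring = random_partition(n, rng)
    print("discrepancy:", discrepancy(sets, coloring))
    print("sqrt(s log m):", math.sqrt(s * math.log(m)))
```

Splitting every set perfectly would give discrepancy 0 (or 1 for odd sizes), while putting all elements on one side gives the trivial bound of the set size; the random partition typically lands between the two.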

Parallel Social Spider Optimization Algorithms with Island Model for the Clustering Problem

7th Annual International Conference on Information Management and Big Data, SIMBig 2020

Authors: Alvarez-Mamani, Edwin; Enciso-Rodas, Lauro; Ayala-Rincón, Mauricio; Soncco-Álvarez, José L. Affiliations: Department of Informatics, Universidad Nacional de San Antonio Abad del Cusco, Cusco, Peru; Departments of Computer Science and Mathematics, Universidade de Brasília, Brasília D.F., Brazil

ISBN: (print) 9783030762278

The digital age came with an extraordinary ability to generate data across organizations, people, and devices, data that needs to be analyzed, processed and stored. A well-known technique for analyzing this kind of data is Clustering. Many bio-inspired algorithms have been proposed for this problem, such as Social Spider Optimization (SSO). In this work, we propose parallel island models of the SSO algorithm for the Clustering problem, using 24 processors for each parallel algorithm. The models were implemented using static and dynamic topologies, and datasets from the UCI Machine Learning Repository were used in the experiments. The achieved average speedups range from 15 to 28 times faster than the SSO algorithm for large and small datasets, respectively, and a parallel model with static ring topology is slightly faster than the other parallel models. The parallel algorithms provide results with similar precision to the ones computed with the SSO algorithm. © 2021, Springer Nature Switzerland AG.

Keywords: parallel algorithms

ALZI: An Improved Parallel Algorithm for Finding Connected Components in Large Graphs

30th European Conference on Parallel and Distributed Processing (Euro-Par)

Authors: Boddu, Sharon; Khan, Maleq. Affiliation: Texas A&M Univ, Dept Elect Engn & Comp Sci, Kingsville, TX 78363, USA

ISBN: (print) 9783031695827; 9783031695834

Finding connected components is a fundamental problem in graph and network analysis. It also serves as a subroutine in other graph problems. There are efficient sequential algorithms for finding connected components in a graph. However, a sequential algorithm can take a long time for a large graph. Parallel algorithms can significantly speed up computation using multiple processors. This paper presents a fast shared-memory parallel algorithm named ALZI (Afforest with LinkJump and Zero Implant) to find connected components in a graph. ALZI is an improvement of a recent state-of-the-art parallel algorithm called Afforest. We propose a few non-trivial optimizations that result in better performance in terms of runtime and scalability. We performed rigorous experimentation using a wide variety of real-world and artificial graphs to evaluate the performance of ALZI. The experimental results show that ALZI is 1.4-2.3 times faster than Afforest on these graphs and provides better scalability than Afforest. ALZI has the ability to work with very large graphs. On a Kronecker graph with 4.2 billion edges, ALZI can find the connected components in just 1.02 s using 128 processors.

Keywords: Connect components; Graph algorithms; parallel algorithms; Shared-memory systems
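
The abstract above mentions that efficient sequential algorithms exist for connected components. As an illustration of one such baseline (not ALZI or Afforest), here is a minimal union-find sketch, assuming vertices numbered 0 to n-1 and an undirected edge list. The function name `connected_components` is hypothetical.

```python
def connected_components(n, edges):
    """Label each vertex 0..n-1 with the smallest vertex id in its
    connected component, using union-find with path halving."""
    parent = list(range(n))

    def find(x):
        # Path halving: point every visited node at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            # Always link the larger root under the smaller one, so the
            # root of a component is its minimum vertex id.
            if ru < rv:
                parent[rv] = ru
            else:
                parent[ru] = rv

    return [find(v) for v in range(n)]


if __name__ == "__main__":
    print(connected_components(6, [(0, 1), (1, 2), (3, 4)]))
```

Two vertices are in the same component exactly when they receive the same label. The edge loop here is inherently sequential; shared-memory parallel algorithms need atomic or otherwise synchronized link operations to process edges concurrently.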

QR-PULP: Streamlining QR Decomposition for RISC-V Parallel Ultra-Low-Power Platforms

21st ACM International Conference on Computing Frontiers (CF)

Authors: Kiamarzi, Amirhossein; Rossi, Davide; Tagliavini, Giuseppe. Affiliation: Univ Bologna, Bologna, Italy

ISBN: (print) 9798400705977

QR decomposition is a numerical method used in many applications from the High-Performance Computing (HPC) domain to embedded systems. This broad spectrum of applications has drawn academic and commercial attention to developing many software libraries and domain-specific hardware solutions. In the Internet of Things (IoT) domain, multicore Parallel Ultra-Low-Power (PULP) architectures are emerging as energy-efficient alternatives, outperforming conventional single-core devices by coupling parallel processing with near-threshold computing. To the best of the authors' knowledge, our study introduces the first parallelized and optimized implementation of three distinct QR decomposition methods (Givens rotations, Gram-Schmidt process, and Householder transformation) on GAP9, a commercial embodiment of the PULP architecture. Parallel execution on the 8-core cluster leads to a reduction in the total number of cycles by 241% for Givens rotations, 470% for Gram-Schmidt, and 567% for Householder, compared to the GAP9 1-core scenario, while each of them only consumes 0.013 mJ, 0.012 mJ, and 0.216 mJ, respectively. Compared to traditional single-core architectures based on ARM, we achieve 8x, 24x, and 30x better performance and 36x, 35x, and 30x better energy efficiency, paving the way for broad adoption of complex linear algebra tasks in the IoT domain.

Keywords: QR decomposition; parallel algorithms; ultra-low-power computing

Brief Announcement: PASGAL: Parallel And Scalable Graph Algorithm Library

36th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)

Authors: Dong, Xiaojun; Gu, Yan; Sun, Yihan; Wang, Letong. Affiliation: UC Riverside, Riverside, CA 92521, USA

ISBN: (print) 9798400704161

We introduce PASGAL (Parallel And Scalable Graph Algorithm Library), a parallel graph library that scales to a variety of graph types, many processors, and large graphs. One special focus of PASGAL is the efficiency on large-diameter graphs, which is a common challenge for many existing parallel graph processing systems due to the high overhead in synchronizing threads when traversing the graph in the breadth-first order. The core idea in PASGAL is a technique called vertical granularity control (VGC) to hide synchronization overhead by careful algorithm redesign and new data structures. We compare PASGAL with existing parallel implementations on several fundamental graph problems. PASGAL is always competitive on small-diameter graphs, and is significantly faster on large-diameter graphs.

Keywords: parallel algorithms; Graph algorithms; Graph Processing
