In this article, a two-grid partition of unity finite element method is proposed and investigated for electrically conducting incompressible fluid flows. The algorithm involves solving a much smaller nonlinear problem on a coarse grid, using a partition of unity to decompose the residual problem into a series of independent subproblems on a fine grid, and carrying out a further correction on the coarse grid. Rigorous theoretical analysis is presented, and the convergence results indicate that the method attains optimal convergence orders under a proper scaling between the coarse mesh size H and the fine mesh size h. Finally, some numerical results are reported to verify our theoretical findings.
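The paper treats the nonlinear MHD system, which is not reproduced here; purely as a structural illustration, the toy sketch below applies the same three ingredients (coarse solve, independent partition-of-unity subproblems, further coarse correction) to a 1D Poisson system. The indicator-function partition, mesh sizes, and cycle count are illustrative assumptions, not the paper's construction.

```python
import numpy as np

# Toy linear analogue of the two-grid structure: coarse corrections plus
# independent local subproblems from a (trivial, indicator-based)
# partition of unity.  The actual paper handles the nonlinear MHD system.
n = 63                                       # fine-grid unknowns
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian
f = np.ones(n)

# Linear interpolation P from the coarse grid (every other point) and the
# Galerkin coarse operator A_H = P^T A P.
nc = (n - 1) // 2
P = np.zeros((n, nc))
for j in range(nc):
    i = 2 * j + 1                            # fine index of coarse point j
    P[i, j] = 1.0
    P[i - 1, j] += 0.5
    P[i + 1, j] += 0.5
A_H = P.T @ A @ P

# Disjoint index blocks: their indicators form a crude partition of unity.
blocks = np.array_split(np.arange(n), 4)

u = np.zeros(n)
for _ in range(10):
    r = f - A @ u                            # coarse-grid correction
    u += P @ np.linalg.solve(A_H, P.T @ r)
    r = f - A @ u                            # residual decomposed into
    for idx in blocks:                       # independent (parallel) solves
        u[idx] += np.linalg.solve(A[np.ix_(idx, idx)], r[idx])

print(np.linalg.norm(f - A @ u))             # residual shrinks across cycles
```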
We study the block-coordinate forward-backward algorithm in which the blocks are updated in a random and possibly parallel manner, according to arbitrary probabilities. The algorithm allows different stepsizes along the block-coordinates to fully exploit the smoothness properties of the objective function. In the convex case and in an infinite-dimensional setting, we establish almost sure weak convergence of the iterates and the asymptotic rate o(1/n) for the mean of the function values. We derive linear rates under strong convexity and error bound conditions. Our analysis is based on an abstract convergence principle for stochastic descent algorithms, which allows us to extend and simplify existing results.
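A minimal sketch of the iteration (not the paper's analysis): each step picks a block at random, takes a forward gradient step on it, then applies the proximal (backward) step with a block-specific stepsize. The Lasso instance, the 1/L_i stepsizes, and the uniform selection probabilities are assumptions chosen for illustration.

```python
import numpy as np

# Random block-coordinate forward-backward on a Lasso instance:
# f(x) = 0.5*||Ax - b||^2 (smooth), g(x) = lam*||x||_1 (prox-friendly).
rng = np.random.default_rng(0)
m, n, nblocks, lam = 200, 100, 10, 0.1
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
blocks = np.array_split(np.arange(n), nblocks)
# Block stepsizes gamma_i = 1/L_i, with L_i the block Lipschitz constant.
gamma = [1.0 / np.linalg.norm(A[:, idx], 2) ** 2 for idx in blocks]
p = np.full(nblocks, 1.0 / nblocks)          # arbitrary selection probabilities

x = np.zeros(n)
for _ in range(5000):
    i = rng.choice(nblocks, p=p)             # random block activation
    idx = blocks[i]
    g = A[:, idx].T @ (A @ x - b)            # forward (gradient) step on block i
    z = x[idx] - gamma[i] * g
    x[idx] = np.sign(z) * np.maximum(np.abs(z) - gamma[i] * lam, 0.0)  # prox step

print(0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.abs(x).sum())
```

The paper also covers activating several blocks in parallel per iteration; this serial one-block-at-a-time variant is the simplest member of that family.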
Simulation of high-power microwave source devices generally relies on parallel algorithms to speed up the computation. In recent years, with upgrades in parallel technology, the parallel efficiency of particle simulation software has been further improved. The existing MPI-2-based parallelization of the particle simulation software CHIPIC accesses the local memory of other processes through message passing. The MPI-3 standard introduces a shared-memory feature that allows data in a shared-memory window to be accessed directly by each process, reducing message traffic. In this paper, an electromagnetic particle simulation parallel algorithm and a dynamic load-balancing algorithm, both based on the MPI-3 shared-memory feature, are designed in the particle simulation software. The two algorithms improve parallel efficiency from different aspects. An RKA and a magnetically insulated line oscillator, both high-power microwave devices, are used as test models. The test results show that the electromagnetic particle simulation parallel algorithm based on the MPI-3 shared-memory feature improves the efficiency of the software by up to 44%, and the dynamic load-balancing algorithm based on MPI-3 improves efficiency by up to 38%.
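CHIPIC itself is not public, but the MPI-3 shared-memory mechanism the abstract describes is standard; a minimal mpi4py sketch, with the array size and layout chosen arbitrarily, looks like this:

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
# Communicator over ranks that can actually share memory (same node).
node = comm.Split_type(MPI.COMM_TYPE_SHARED)
rank = node.Get_rank()

n = 1_000_000                                # e.g., a field array
itemsize = MPI.DOUBLE.Get_size()
# One rank allocates the window; the others attach with zero bytes.
win = MPI.Win.Allocate_shared(n * itemsize if rank == 0 else 0,
                              itemsize, comm=node)
buf, _ = win.Shared_query(0)                 # base address of rank 0's segment
field = np.ndarray(buffer=buf, dtype='d', shape=(n,))

if rank == 0:
    field[:] = 0.0
node.Barrier()
# All node-local processes now read and write `field` directly, with no
# message passing -- the MPI-3 feature the paper builds on.
```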
Contour trees are used for topological data analysis in scientific visualization. While originally computed with serial algorithms, recent work has introduced a vector-parallel algorithm. However, this algorithm is relatively slow for fully augmented contour trees, which are needed for many practical data analysis tasks. We therefore introduce a representation called the hyperstructure that enables efficient searches through the contour tree, and use it to construct a fully augmented contour tree in a data-parallel fashion, with performance on average 6 times faster than the state-of-the-art parallel algorithm in the TTK topological toolkit.
A new parallel algorithm for the max-flow problem on directed networks with a single source and a single sink is proposed. The algorithm is based on tree sub-networks and on an efficient parallel algorithm to compute max-flows on the tree sub-networks. The latter algorithm is proved to be work-optimal and time-optimal. The parallel implementation of the complete algorithm is more efficient than the best known parallel algorithm for the max-flow problem in terms of time complexity, and the sequential implementation achieves the best known sequential time complexity, without using any complex data structures or complex manipulations on the network.
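The tree primitive the abstract builds on is simple to state sequentially: on a tree there is a unique source-sink path, so the max flow equals the bottleneck capacity along it. A sketch of that sequential primitive (not the paper's work- and time-optimal parallel version):

```python
from collections import deque

def tree_max_flow(n, edges, s, t):
    # On a tree the s-t path is unique, so the max flow is the minimum
    # capacity on that path; BFS recovers the path via parent pointers.
    adj = [[] for _ in range(n)]
    for u, v, cap in edges:
        adj[u].append((v, cap))
        adj[v].append((u, cap))
    parent = {s: (None, float("inf"))}
    q = deque([s])
    while q:
        u = q.popleft()
        for v, cap in adj[u]:
            if v not in parent:
                parent[v] = (u, cap)
                q.append(v)
    flow, v = float("inf"), t
    while parent[v][0] is not None:          # walk back from t to s
        u, cap = parent[v]
        flow = min(flow, cap)
        v = u
    return flow

print(tree_max_flow(5, [(0, 1, 3), (1, 2, 2), (1, 3, 5), (3, 4, 4)], 0, 4))  # -> 3
```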
Decision trees (DTs) are widely used in machine learning and often achieve state-of-the-art performance. Despite that, well-known variants like CART, ID3, random forest, and boosted trees lack a probabilistic version that encodes prior assumptions about tree structures and shares statistical strength between node parameters. Existing work on Bayesian DTs depends on Markov chain Monte Carlo (MCMC), which can be computationally slow, especially on high-dimensional data or with expensive proposals. In this study, we propose a method to parallelise a single MCMC DT chain on an average laptop or personal computer, reducing its run-time through multi-core processing while keeping the results statistically identical to the conventional sequential implementation. We also calculate the theoretical and practical reduction in run-time obtainable with our method on multi-processor architectures. Experiments showed a running time up to 18 times faster, with the serial and parallel implementations remaining statistically identical.
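The paper's DT-specific scheme is not reproduced here; as a generic illustration of how a single chain can use several cores without changing its distribution, the sketch below parallelises only the likelihood evaluation (which factorises over data shards) inside a plain Metropolis step, on an assumed Gaussian toy model:

```python
import numpy as np
from multiprocessing import Pool

def shard_loglik(args):
    # Log-likelihood of one data shard under a N(theta, 1) toy model.
    shard, theta = args
    return -0.5 * np.sum((shard - theta) ** 2)

def parallel_loglik(pool, shards, theta):
    # The likelihood factorises over data, so shard terms are computed on
    # separate cores and summed; the chain itself is unchanged, which is
    # why parallel and serial runs are statistically identical.
    return sum(pool.map(shard_loglik, [(s, theta) for s in shards]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(1.0, 1.0, 100_000)
    shards = np.array_split(data, 4)
    theta = 0.0
    with Pool(4) as pool:
        ll = parallel_loglik(pool, shards, theta)
        for _ in range(200):                  # Metropolis random walk
            prop = theta + rng.normal(0.0, 0.05)
            ll_prop = parallel_loglik(pool, shards, prop)
            if np.log(rng.random()) < ll_prop - ll:
                theta, ll = prop, ll_prop
    print(theta)                              # samples concentrate near 1.0
```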
With increasing data sizes and the development of multi-core computers, asynchronous parallel stochastic optimization algorithms such as KroMagnon have gained significant attention. In this paper, we propose a new Sparse approximation and asynchronous parallel Stochastic Variance Reduced Gradient (SSVRG) method for sparse and high-dimensional machine learning problems. Unlike standard SVRG and its asynchronous parallel variant, KroMagnon, the snapshot point of SSVRG is set to the average of all the iterates in the previous epoch, which allows it to take much larger learning rates and also makes it more robust to the choice of learning rates. In particular, we use a sparse approximation of the popular SVRG estimator to perform completely sparse updates at all iterations. Therefore, SSVRG has a much lower per-iteration computational cost than its dense counterpart, SVRG++, and is very friendly to asynchronous parallel implementation. Moreover, we provide convergence guarantees for SSVRG for both strongly convex and non-strongly convex problems, while existing asynchronous algorithms (e.g., KroMagnon and ASAGA) only have convergence guarantees for strongly convex problems. Finally, we extend SSVRG to non-smooth and asynchronous parallel settings. Numerical results demonstrate that SSVRG converges significantly faster than state-of-the-art asynchronous parallel methods, e.g., KroMagnon, and is usually more than three orders of magnitude faster than SVRG++.
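A serial sketch of the distinctive ingredient, on an assumed least-squares toy problem: an SVRG-style variance-reduced update with the snapshot set to the previous epoch's iterate average. The sparse updates and asynchrony described in the abstract are omitted, and the learning rate and epoch length are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
grad_i = lambda x, i: A[i] * (A[i] @ x - b[i])      # per-sample gradient

x = np.zeros(d)
snapshot = x.copy()
lr, m = 0.01, 2 * n
for _ in range(30):
    mu = A.T @ (A @ snapshot - b) / n               # full gradient at snapshot
    avg = np.zeros(d)
    for _ in range(m):
        i = rng.integers(n)
        v = grad_i(x, i) - grad_i(snapshot, i) + mu  # variance-reduced gradient
        x -= lr * v
        avg += x / m
    snapshot = avg                                   # epoch-average snapshot

print(np.linalg.norm(A.T @ (A @ x - b)) / n)         # gradient norm shrinks
```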
ISBN (Print): 9781450395458
This paper studies parallel algorithms for the longest increasing subsequence (LIS) problem. Let n be the input size and k be the LIS length of the input. Sequentially, LIS is a simple problem that can be solved using dynamic programming (DP) in O(n log n) work. However, parallelizing LIS is a long-standing challenge. We are unaware of any parallel LIS algorithm that has optimal O(n log n) work and non-trivial parallelism (i.e., Õ(k) or o(n) span). This paper proposes a parallel LIS algorithm that costs O(n log k) work, Õ(k) span, and O(n) space, and is much simpler than previous parallel LIS algorithms. We also generalize the algorithm to a weighted version of LIS, which maximizes the weighted sum of all objects in an increasing subsequence. To achieve a better work bound for the weighted LIS algorithm, we design parallel algorithms for the van Emde Boas (vEB) tree, which have the same structure as the sequential vEB tree and support work-efficient parallel batch insertion, deletion, and range queries. We also implemented our parallel LIS algorithms. Our implementation is lightweight, efficient, and scalable. On input size 10^9, our LIS algorithm outperforms a highly optimized sequential algorithm (with O(n log k) cost) on inputs with k ≤ 3 × 10^5. Our algorithm is also much faster than the best existing parallel implementation by Shen et al. (2022) on all input instances.
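For reference, the sequential O(n log k) baseline mentioned in the abstract is the classic patience-sorting DP (parallelizing it is the paper's contribution); a minimal sketch:

```python
from bisect import bisect_left

def lis_length(a):
    # tails[j] holds the smallest possible tail of a strictly increasing
    # subsequence of length j+1; each element costs one O(log k) search.
    tails = []
    for x in a:
        j = bisect_left(tails, x)
        if j == len(tails):
            tails.append(x)       # extend the longest subsequence found
        else:
            tails[j] = x          # improve the tail of a shorter one
    return len(tails)

print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))  # -> 4 (e.g., 1, 4, 5, 9)
```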
Integer linear programs (ILPs) and mixed integer programs (MIPs) often have multiple distinct optimal solutions, yet the widely used Gurobi optimization solver returns certain solutions at disproportionately high frequencies. This behavior is disadvantageous, as, in fields such as biomedicine, the identification and analysis of distinct optima yields valuable domain-specific insights that inform future research directions. In the present work, we introduce MORSE (Multiple Optima via Random Sampling and careful choice of the parameter Epsilon), a randomized, parallelizable algorithm to efficiently generate multiple optima for ILPs. MORSE applies multiplicative perturbations to the coefficients of an instance's objective function, generating a modified instance that retains an optimum of the original problem. We formalize and prove the above claim under practical conditions. Furthermore, we prove that for 0/1 selection problems, MORSE finds each distinct optimum with equal probability. We evaluate MORSE using two measures: the number of distinct optima found in r independent runs, and the diversity of the list (with repetitions) of solutions, as measured by average pairwise Hamming distance and Shannon entropy. Using these metrics, we provide empirical results demonstrating that MORSE outperforms the Gurobi method and unweighted variations of the MORSE method on a set of 20 Mixed Integer Programming Library (MIPLIB) instances and on a combinatorial optimization problem in cancer genomics.
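A toy sketch of the perturb-and-resolve idea on a 0/1 selection problem; the brute-force enumerator stands in for a real ILP solver, and the epsilon value and instance are arbitrary illustrative choices (MORSE's careful choice of epsilon is the paper's contribution):

```python
import itertools, random

def brute_force_01(c, feasible):
    # Hypothetical stand-in for an ILP solver: enumerate 0/1 vectors.
    best, arg = None, None
    for x in itertools.product((0, 1), repeat=len(c)):
        if not feasible(x):
            continue
        val = sum(ci * xi for ci, xi in zip(c, x))
        if best is None or val < best:
            best, arg = val, x
    return arg

def morse_style_sample(c, feasible, eps=1e-3, runs=20, seed=0):
    # Multiplicatively perturb each objective coefficient by a tiny random
    # factor and re-solve; for eps small enough relative to the optimality
    # gap, every perturbed optimum is an optimum of the original problem.
    rng = random.Random(seed)
    optima = set()
    for _ in range(runs):
        c_pert = [ci * (1.0 + eps * rng.uniform(-1.0, 1.0)) for ci in c]
        optima.add(brute_force_01(c_pert, feasible))
    return optima

# Toy 0/1 selection problem: pick exactly two of four items, minimize cost.
print(morse_style_sample([1.0, 1.0, 2.0, 1.0],
                         feasible=lambda x: sum(x) == 2))
# The random tie-breaking typically reveals several distinct optima.
```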
ISBN (Print): 9798350344196
Recently, several medical applications have relied on hyperspectral imaging. This technology enables both automated diagnosis and surgeon guidance. The employed algorithms adopt machine and deep learning methods to classify the images. In particular, Vision Transformers are a recent deep architecture that has been used to classify hyperspectral images of skin cancers, achieving interesting results. However, deep architectures are computationally intensive, and parallel architectures are mandatory to ensure fast classification (for some applications, even in real time). In this paper, we propose a parallel Vision Transformer architecture exploiting a low-power GPU, targeting the development of a portable diagnostic device. The classification time and power consumption of the low-power board are compared with the performance of a desktop GPU. The results clearly highlight the suitability of the low-power GPU for developing a portable diagnostic system based on hyperspectral imaging.
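The authors' trained model and boards are not specified here; as a sketch of the kind of latency measurement the abstract reports, one could time a stock torchvision ViT on whatever GPU is available (the model choice, input shape, and iteration counts below are arbitrary assumptions):

```python
import time, torch
from torchvision.models import vit_b_16

device = "cuda" if torch.cuda.is_available() else "cpu"
model = vit_b_16(weights=None).eval().to(device)   # untrained stand-in model
x = torch.randn(1, 3, 224, 224, device=device)

with torch.no_grad():
    for _ in range(10):                            # warm-up iterations
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()                   # flush queued GPU work
    t0 = time.perf_counter()
    for _ in range(100):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    print(f"mean latency: {(time.perf_counter() - t0) / 100 * 1e3:.2f} ms")
```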