Search Results - Inner Mongolia University Library

Conference on Chemical, Biological, Radiological, Nuclear, and Explosives (CBRNE) Sensing XXIII

Authors: Lee, Yun Teck; Gryazin, Yury. Affiliation: Idaho State Univ, Pocatello, ID 83209, USA

ISBN: (print) 9781510651098; 9781510651081

The objective of this paper is to present an efficient parallel implementation of an iterative compact high-order approximation numerical solver for the 3D Helmholtz equation on multicore computers. The high-order parallel iterative algorithm is built upon a combination of a Krylov subspace-type method with a direct parallel Fast Fourier transform (FFT) type preconditioner from the authors' previous work, as shown in Ref. 7. In this paper, we present the results of our algorithm on computationally simulated data with realistic ranges of parameters for soil and mine-like targets. Our algorithm also incorporates second-, fourth-, and sixth-order compact finite difference schemes. The accuracy of the fourth- and sixth-order compact approximations is shown alongside the scalability of our implementation in the parallel programming environment.

Keywords: Helmholtz equation; subsurface imaging; landmines; compact finite difference schemes; GMRES; FFT preconditioners; parallel algorithms; OpenMP; MPI
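To make the idea of a compact high-order scheme concrete, the following is a minimal one-dimensional sketch, not the authors' 3D solver. It assumes the model problem u'' + k^2 u = f on [0, 1] with u(0) = u(1) = 0 and the manufactured solution u(x) = sin(pi x). It compares the standard second-order stencil with the fourth-order compact (Numerov-type) stencil, and it uses a direct tridiagonal solve in place of the GMRES and FFT preconditioner combination described in the abstract. The function names `thomas` and `max_error` are illustrative.

```python
import math
from typing import List


def thomas(lower: List[float], diag: List[float],
           upper: List[float], rhs: List[float]) -> List[float]:
    """Solve a tridiagonal system by forward elimination and back substitution."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def max_error(n: int, k: float, order: int) -> float:
    """Max-norm error for u'' + k^2 u = f with exact solution sin(pi x).

    order=2 uses the standard three-point stencil; any other value uses the
    fourth-order compact stencil, which keeps the same three-point footprint.
    """
    h = 1.0 / n
    xs = [i * h for i in range(n + 1)]
    f = [(k * k - math.pi ** 2) * math.sin(math.pi * x) for x in xs]
    m = n - 1  # number of interior unknowns
    if order == 2:
        off = 1.0 / h ** 2
        dia = -2.0 / h ** 2 + k * k
        rhs = [f[i] for i in range(1, n)]
    else:
        # Compact scheme: the k^2 u term and f are both averaged with
        # weights (1, 10, 1) / 12 over the three-point stencil.
        off = 1.0 / h ** 2 + k * k / 12.0
        dia = -2.0 / h ** 2 + 10.0 * k * k / 12.0
        rhs = [(f[i - 1] + 10.0 * f[i] + f[i + 1]) / 12.0 for i in range(1, n)]
    u = thomas([off] * m, [dia] * m, [off] * m, rhs)
    return max(abs(u[i - 1] - math.sin(math.pi * xs[i])) for i in range(1, n))


if __name__ == "__main__":
    for n in (16, 32):
        print(n, max_error(n, 2.0, 2), max_error(n, 2.0, 4))
```

Halving the grid spacing should reduce the compact-scheme error by a factor of about 16, against about 4 for the second-order stencil, while both lead to a tridiagonal system of the same size.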

A unified consensus-based parallel ADMM algorithm for high-dimensional regression with combined regularizations

arXiv, 2023

Authors: Wu, Xiaofei; Zhang, Zhimin; Cui, Zhenyu. Affiliations: College of Mathematics and Statistics, Chongqing University, Chongqing 401331, China; School of Business, Stevens Institute of Technology, Hoboken, NJ 07030, United States

The parallel alternating direction method of multipliers (ADMM) algorithm is widely recognized for its effectiveness in handling large-scale datasets stored in a distributed manner, making it a popular choice for solving statistical learning models. However, there is currently limited research on parallel algorithms specifically designed for high-dimensional regression with combined (composite) regularization terms. These terms, such as elastic-net, sparse group lasso, sparse fused lasso, and their nonconvex variants, have gained significant attention in various fields due to their ability to incorporate prior information and promote sparsity within specific groups or fused variables. The scarcity of parallel algorithms for combined regularizations can be attributed to the inherent nonsmoothness and complexity of these terms, as well as the absence of closed-form solutions for certain proximal operators associated with them. In this paper, we propose a unified constrained optimization formulation based on the consensus problem for these types of convex and nonconvex regression problems and derive the corresponding parallel ADMM algorithms. Furthermore, we prove that the proposed algorithm not only converges globally but also exhibits a linear convergence rate. Extensive simulation experiments, along with a financial example, demonstrate the reliability, stability, and scalability of our algorithm. The R package for implementing the proposed algorithms can be obtained at https://***/xfwu1016/CPADMM. Copyright © 2023, The Authors. All rights reserved.

Keywords: parallel algorithms
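The consensus structure mentioned in the abstract can be illustrated with a deliberately small sketch. It is not the authors' CPADMM package. It assumes a single regression coefficient, an elastic-net penalty lam1*|beta| + (lam2/2)*beta^2, and data split into blocks that would live on separate workers. Each block updates its own copy of the coefficient independently, and only the averaging step needs communication. All names (`consensus_admm_enet_1d`, `enet_1d_closed_form`, `rho`) are illustrative.

```python
from typing import List, Sequence, Tuple

Block = Tuple[Sequence[float], Sequence[float]]  # (x values, y values)


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |.|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def consensus_admm_enet_1d(blocks: List[Block], lam1: float, lam2: float,
                           rho: float = 5.0, iters: int = 200) -> float:
    """Consensus ADMM for 0.5*sum (y - x*beta)^2 + lam1*|beta| + 0.5*lam2*beta^2."""
    n_blocks = len(blocks)
    c = [sum(x * x for x in xs) for xs, _ in blocks]               # local x'x
    d = [sum(x * y for x, y in zip(xs, ys)) for xs, ys in blocks]  # local x'y
    z = 0.0                 # consensus variable shared by all blocks
    u = [0.0] * n_blocks    # scaled dual variables, one per block
    for _ in range(iters):
        # Local updates: independent across blocks, hence parallelizable.
        beta = [(d[i] + rho * (z - u[i])) / (c[i] + rho) for i in range(n_blocks)]
        # Global update: an average followed by the elastic-net proximal step.
        avg = sum(b + ui for b, ui in zip(beta, u)) / n_blocks
        z = soft_threshold(n_blocks * rho * avg, lam1) / (n_blocks * rho + lam2)
        # Dual updates: again independent across blocks.
        u = [ui + b - z for ui, b in zip(u, beta)]
    return z


def enet_1d_closed_form(blocks: List[Block], lam1: float, lam2: float) -> float:
    """Direct solution of the same one-coefficient problem, for comparison."""
    sxx = sum(x * x for xs, _ in blocks for x in xs)
    sxy = sum(x * y for xs, ys in blocks for x, y in zip(xs, ys))
    return soft_threshold(sxy, lam1) / (sxx + lam2)


if __name__ == "__main__":
    data = [([1.0, 2.0], [2.1, 3.9]),
            ([2.0, 1.0, 1.0], [4.2, 1.8, 2.1]),
            ([3.0], [6.3])]
    print(consensus_admm_enet_1d(data, 1.0, 0.5), enet_1d_closed_form(data, 1.0, 0.5))
```

In this toy case the elastic-net proximal step has a closed form, so the consensus iterate can be checked against the direct solution. The abstract notes that some combined penalties lack closed-form proximal operators, which this sketch does not attempt to address.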

Data-centric workloads with MPI_Sort

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, vol. 187

Authors: Zulian, P.; Ben Bader, S.; Fourestey, G.; Krause, R.; Rossinelli, D. Affiliations: Univ Svizzera Italiana, Euler Inst, Fac informat, Lugano, Switzerland; Swiss Fed Inst Technol, Lausanne, Switzerland; UniDistance, Brig, Switzerland; Stanford Univ, Inst Computat Math Engn, Stanford, CA 94305, USA

Sorting is a fundamental task in computing and plays a central role in information technology. The advent of rack-scale and warehouse-size data processing shaped the architecture of data analysis platforms towards supercomputing. In turn, established techniques on supercomputers have become relevant to a wider range of application domains. This work is concerned with multi-way mergesort with exact splitting on distributed memory architectures. At its core, our approach leverages a novel parallel algorithm for multi-way selection problems. Remarkably concise, the algorithm relies on MPI_Allgather and MPI_ReduceScatter_block, two collective communication schemes that find hardware support in most high-end networks. A software implementation of our approach is used to process the Terabyte-size Data Challenge 2 signal, released by the SKA radio telescopes organization. On the supercomputer considered herein, our approach outperforms the state of the art by up to 2.6X using 9,216 cores. Our implementation is released as a compact open source library compliant with the MPI programming model. By supporting the most popular elementary key types and arbitrary fixed-size value types, the library can be straightforwardly integrated into third-party MPI-based software.

Keywords: Distributed sorting; parallel algorithms; Supercomputers
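Exact splitting rests on a multi-way selection step: given several sorted runs, find one cut position per run so that the prefixes together hold exactly a target number of the smallest elements. The sketch below is a sequential illustration of that step for integer keys, not the authors' algorithm and not an MPI program. The per-run counts that are summed here would come from a collective reduction in a distributed setting. The function name `multiway_select` is illustrative.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence


def multiway_select(runs: Sequence[Sequence[int]], rank: int) -> List[int]:
    """Return one split index per sorted run.

    The prefixes runs[i][:split[i]] together contain exactly `rank` elements,
    and no prefix element is larger than any suffix element.
    """
    total = sum(len(r) for r in runs)
    if not 0 <= rank <= total:
        raise ValueError("rank out of range")
    if rank == 0:
        return [0] * len(runs)

    # Binary search over key values for the smallest pivot such that at
    # least `rank` elements are <= pivot.
    lo = min(r[0] for r in runs if r)
    hi = max(r[-1] for r in runs if r)
    while lo < hi:
        mid = (lo + hi) // 2
        if sum(bisect_right(r, mid) for r in runs) >= rank:
            hi = mid
        else:
            lo = mid + 1
    pivot = lo

    # Take every element strictly below the pivot, then hand out just enough
    # copies of the pivot itself to reach the requested rank exactly.
    splits = [bisect_left(r, pivot) for r in runs]
    missing = rank - sum(splits)
    for i, r in enumerate(runs):
        take = min(missing, bisect_right(r, pivot) - splits[i])
        splits[i] += take
        missing -= take
    return splits


if __name__ == "__main__":
    runs = [[1, 4, 7, 10], [2, 4, 4, 9], [3, 5, 8]]
    print(multiway_select(runs, 6))
```

Distributing the tied pivot copies explicitly is what makes the split exact when keys repeat. A split based only on the pivot value could overshoot or undershoot the target rank.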

Partition-Insensitive Parallel ADMM Algorithm for High-dimensional Linear Models

arXiv, 2023

Authors: Wu, Xiaofei; Jiang, Jiancheng; Zhang, Zhimin. Affiliations: College of Mathematics and Statistics, Chongqing University, China; Department of Mathematics and Statistics, University of North Carolina at Charlotte, United States

Parallel alternating direction method of multipliers (ADMM) algorithms have gained popularity in statistics and machine learning due to their efficient handling of large sample data problems. However, the parallel structure of these algorithms, based on the consensus problem, can lead to an excessive number of auxiliary variables when applied to high-dimensional data, resulting in a large computational burden. In this paper, we propose a partition-insensitive parallel framework based on the linearized ADMM (LADMM) algorithm and apply it to solve nonconvex penalized high-dimensional regression problems. Compared to existing parallel ADMM algorithms, our algorithm does not rely on the consensus problem, resulting in a significant reduction in the number of variables that need to be updated at each iteration. It is worth noting that the solution of our algorithm remains largely unchanged regardless of how the total sample is divided, which is known as partition-insensitivity. Furthermore, under some mild assumptions, we prove the convergence of the iterative sequence generated by our parallel algorithm. Numerical experiments on synthetic and real datasets demonstrate the feasibility and validity of the proposed algorithm. We provide a publicly available R software package to facilitate the implementation of the proposed algorithm. Copyright © 2023, The Authors. All rights reserved.

Keywords: parallel algorithms

Efficient Modification of the Upper Triangular Square Root Matrix on Variable Reordering

IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, vol. 6, no. 2, pp. 675-682

Authors: Elimelech, Khen; Indelman, Vadim. Affiliations: Technion Israel Inst Technol, Robot & Autonomous Syst Program, IL-32000 Haifa, Israel; Technion Israel Inst Technol, Dept Aerosp Engn, IL-32000 Haifa, Israel

In probabilistic state inference, we seek to estimate the state of an (autonomous) agent from noisy observations. It can be shown that, under certain assumptions, finding the estimate is equivalent to solving a linear least squares problem. Solving such a problem is done by calculating the upper triangular matrix R from the coefficient matrix A, using the QR or Cholesky factorizations; this matrix is commonly referred to as the "square root matrix". In sequential estimation problems, we are often interested in periodic optimization of the state variable order, e.g., to reduce fill-in, or to apply a predictive variable ordering tactic; however, changing the variable order implies expensive re-factorization of the system. Thus, we address the problem of modifying an existing square root matrix R, to convey reordering of the variables. To this end, we identify several conclusions regarding the effect of column permutation on the factorization, to allow efficient modification of R, without accessing A at all, or with minimal re-factorization. The proposed parallelizable algorithm achieves a significant improvement in performance over the state-of-the-art incremental Smoothing And Mapping (iSAM2) algorithm, which utilizes incremental factorization to update R.

Keywords: Incremental least squares; parallel algorithms; probabilistic inference; SLAM; sparse systems

Temporal Parallelization of Bayesian Smoothers

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, vol. 66, no. 1, pp. 299-306

Authors: Sarkka, Simo; Garcia-Fernandez, Angel F. Affiliations: Aalto Univ, Dept Elect Engn & Automat, Espoo 02150, Finland; Univ Liverpool, Dept Elect Engn & Elect, Liverpool L69 3GJ, Merseyside, England

This article presents algorithms for temporal parallelization of Bayesian smoothers. We define the elements and the operators to pose these problems as the solutions to all-prefix-sums operations for which efficient parallel scan-algorithms are available. We present the temporal parallelization of the general Bayesian filtering and smoothing equations, and specialize them to linear/Gaussian models. The advantage of the proposed algorithms is that they reduce the linear complexity of standard smoothing algorithms with respect to time to logarithmic.

Keywords: Bayes methods; Smoothing methods; Mathematical model; Computational modeling; Kalman filters; parallel algorithms; Bayesian smoothing; Kalman filtering and smoothing; parallel computing; parallel scan; prefix sums
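The reformulation as an all-prefix-sums operation can be illustrated on a much simpler recursion than the filtering and smoothing equations. The sketch below assumes a scalar linear recursion x_k = a_k * x_{k-1} + b_k. Each step is represented as an affine map, and composing affine maps is associative, so all states can be obtained with a scan that needs a logarithmic number of rounds when each round runs in parallel. This is an analogy to the article's approach, not its Kalman filter or smoother elements. The names `compose`, `inclusive_scan`, `states_by_scan`, and `states_sequential` are illustrative.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Affine = Tuple[float, float]  # (a, b) stands for the map x -> a * x + b


def compose(first: Affine, second: Affine) -> Affine:
    """Apply `first`, then `second`; this operator is associative."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def inclusive_scan(items: Sequence[T], op: Callable[[T, T], T]) -> List[T]:
    """Hillis-Steele scan: about log2(n) rounds.

    Within a round every entry is computed from the previous round only,
    so the entries of a round could be evaluated in parallel.
    """
    out = list(items)
    step = 1
    while step < len(out):
        prev = out
        out = [prev[i] if i < step else op(prev[i - step], prev[i])
               for i in range(len(prev))]
        step *= 2
    return out


def states_by_scan(a: Sequence[float], b: Sequence[float], x0: float) -> List[float]:
    """All states x_1..x_n obtained from the prefix compositions."""
    prefixes = inclusive_scan(list(zip(a, b)), compose)
    return [pa * x0 + pb for pa, pb in prefixes]


def states_sequential(a: Sequence[float], b: Sequence[float], x0: float) -> List[float]:
    """Reference: the usual step-by-step recursion, linear in time."""
    xs, x = [], x0
    for ak, bk in zip(a, b):
        x = ak * x + bk
        xs.append(x)
    return xs


if __name__ == "__main__":
    a, b = [2, 3, 1, -1, 2], [1, 0, 4, 2, -3]
    print(states_by_scan(a, b, 1), states_sequential(a, b, 1))
```

The sequential loop needs n dependent steps, whereas the scan needs about log2(n) rounds at the cost of more total work, which mirrors the linear-to-logarithmic reduction in time complexity stated in the abstract.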

An Optimal Approximation for Submodular Maximization Under a Matroid Constraint in the Adaptive Complexity Model

OPERATIONS RESEARCH, 2021, vol. 70, no. 5, p. 2967

Authors: Balkanski, Eric; Rubinstein, Aviad; Singer, Yaron. Affiliations: Columbia Univ, Dept Ind Engn & Operat Res, New York, NY 10027, USA; Stanford Univ, Dept Comp Sci, Stanford, CA 94305, USA; Harvard Univ, Sch Engn & Appl Sci, Cambridge, MA 02138, USA

In this paper, we study submodular maximization under a matroid constraint in the adaptive complexity model. This model was recently introduced in the context of submodular optimization to quantify the information theoretic complexity of black-box optimization in a parallel computation model. Despite the burst in work on submodular maximization in the adaptive complexity model, the fundamental problem of maximizing a monotone submodular function under a matroid constraint has remained elusive. In particular, all known techniques fail for this problem and there are no known constant factor approximation algorithms whose adaptivity is sublinear in the rank of the matroid k or in the worst case sublinear in the size of the ground set n. We present an algorithm that has an approximation guarantee arbitrarily close to the optimal 1 - 1/e for monotone submodular maximization under a matroid constraint and has near-optimal adaptivity of O(log (n) log (k)). This result is obtained using a novel technique of adaptive sequencing, which departs from previous techniques for submodular maximization in the adaptive complexity model. In addition to our main result, we show how to use this technique to design other approximation algorithms with strong approximation guarantees and polylogarithmic adaptivity.

Keywords: submodular optimization; parallel algorithms; matroids; adaptivity

Real-Time Computation of 3D Wireframes in Computer-Generated Holography

IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, vol. 30, pp. 9418-9428

Authors: Blinder, David; Nishitsuji, Takashi; Schelkens, Peter. Affiliations: Vrije Univ Brussel VUB, Dept Elect & Informat ETRO, B-1050 Brussels, Belgium; IMEC, B-3001 Leuven, Belgium; Tokyo Metropolitan Univ, Fac Syst Design, Hino, Tokyo 1910065, Japan

Computer-Generated Holography (CGH) algorithms simulate numerical diffraction and are applied in particular to holographic display technology. Due to the wave-based nature of diffraction, CGH is highly computationally intensive, making it especially challenging to drive high-resolution displays in real time. To this end, we propose a technique for efficiently calculating holograms of 3D line segments. We express the solutions analytically and devise an efficiently computable approximation suitable for massively parallel computing architectures. The algorithms are implemented on a GPU (with CUDA), and we obtain a 70-fold speedup over the reference point-wise algorithm with almost imperceptible quality loss. We report real-time frame rates for CGH of complex 3D line-drawn objects, and validate the algorithm both in a simulation environment and on a holographic display setup.

Keywords: Three-dimensional displays; Holography; Diffraction; Real-time systems; Optical diffraction; Streaming media; Holographic optical components; Holography; diffraction; computer graphics; displays; approximation methods; parallel algorithms; optical devices; physics computing

Synchronous Parallel Block Coordinate Descent Method for Nonsmooth Convex Function Minimization

Journal of Systems Science & Complexity, 2020, vol. 33, no. 2, pp. 345-365

Authors: DAI Yutong; WENG Yang. Affiliation: College of Mathematics, Sichuan University, Chengdu 610064, China

This paper proposes a synchronous parallel block coordinate descent algorithm for minimizing a composite function, which consists of a smooth convex function plus a non-smooth but separable convex *** to the generalization of the proposed method, some existing synchronous parallel algorithms can be considered as special *** tackle high dimensional problems, the authors further develop a randomized variant, which randomly updates some blocks of coordinates at each round of *** proposed parallel algorithms are proven to have sub-linear convergence rate under rather mild *** numerical experiments on solving the large scale regularized logistic regression with ℓ1-norm penalty show that the implementation is quite *** authors conclude with explanation on the observed experimental results and discussion on the potential improvements.

Keywords: Block coordinate descent; convergence rate; convex functions; parallel algorithms

A parallel algorithm for Delaunay triangulation of moving points on the plane

arXiv, 2023

Authors: Hadiniya, Nazanin; Ghodsi, Mohammad. Affiliation: Computer Engineering Department, Sharif University of Technology, Tehran, Iran

Delaunay triangulation (DT) is an important geometric problem with applications in fields such as computer vision, terrain modeling, spatial clustering, and networking. Kinetic data structures have become very important in computational geometry for dealing with moving objects. However, maintaining a dynamically changing Delaunay triangulation of moving points can be challenging, because the triangulation has to be updated repeatedly. If the points move too far, it is better to rebuild the triangulation. One approach to handling moving points is to use an incremental algorithm. When the points move slowly, an algorithm faster than rebuilding is possible. Furthermore, sequential algorithms can be computationally expensive for large datasets, so parallelism is one way to speed up the computation. In this paper, we propose a parallel algorithm for moving points that divides the dataset into equal partitions and assigns each partition to one block. Each block restores the Delaunay constraints after each time step using delete and insert algorithms. We show that this algorithm runs faster than serial algorithms. Copyright © 2023, The Authors. All rights reserved.

Keywords: parallel algorithms
