The parallel alternating direction method of multipliers (ADMM) algorithms have gained popularity in statistics and machine learning due to their efficient handling of large sample data problems. However, the parallel...
In probabilistic state inference, we seek to estimate the state of an (autonomous) agent from noisy observations. It can be shown that, under certain assumptions, finding the estimate is equivalent to solving a linear least squares problem. Solving such a problem is done by calculating the upper triangular matrix R from the coefficient matrix A, using the QR or Cholesky factorizations; this matrix is commonly referred to as the "square root matrix". In sequential estimation problems, we are often interested in periodic optimization of the state variable order, e.g., to reduce fill-in, or to apply a predictive variable ordering tactic; however, changing the variable order implies expensive re-factorization of the system. Thus, we address the problem of modifying an existing square root matrix R to convey reordering of the variables. To this end, we identify several conclusions regarding the effect of column permutation on the factorization, to allow efficient modification of R without accessing A at all, or with minimal re-factorization. The proposed parallelizable algorithm achieves a significant improvement in performance over the state-of-the-art incremental Smoothing and Mapping (iSAM2) algorithm, which utilizes incremental factorization to update R.
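The claim that R can be updated without accessing A follows from a standard identity: if A = QR and P is a column permutation, then (AP)^T(AP) = (RP)^T(RP), so a triangular factor of RP is also a square root factor of AP. The sketch below only illustrates that identity with NumPy; it is not the paper's algorithm, it refactorizes all of RP rather than a minimal part, and the names `positive_diag` and `reorder_square_root` are hypothetical.

```python
import numpy as np

def positive_diag(R):
    """Flip row signs so the diagonal is non-negative (makes the factor unique)."""
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return signs[:, None] * R

def reorder_square_root(R, perm):
    """Square root factor for the reordered system A[:, perm], computed from R only.

    Assumes R is the upper triangular factor of a full-column-rank A.
    """
    R_permuted = R[:, perm]              # columns reordered; no longer triangular
    _, R_new = np.linalg.qr(R_permuted)  # re-triangularize the small n x n matrix
    return positive_diag(R_new)
```

Comparing the result against a direct factorization of `A[:, perm]` for a random full-rank A is a quick way to check the identity numerically.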
Simulation of high-power microwave source devices generally uses parallel algorithms to speed up computation. In recent years, advances in parallel technology have further improved the parallel efficiency of particle simulation software. The existing MPI-2 parallelization of the particle simulation software CHIPIC accesses the local memory space of other processes through message passing. The MPI-3 standard provides a shared memory feature that lets each process access data directly in a shared memory window, which reduces the amount of message passing. In this paper, based on the shared memory feature of MPI-3, an electromagnetic particle simulation parallel algorithm and a dynamic load balancing algorithm are designed for the particle simulation software. The two algorithms improve the parallel efficiency from different aspects. The RKA and magnetic isolation oscillator high-power microwave devices are used as the test models. The test results show that the electromagnetic particle simulation parallel algorithm based on the shared memory feature of MPI-3 can improve the efficiency of the software by up to 44%. The dynamic load balancing algorithm based on MPI-3 can also improve efficiency by up to 38%. (c) 2022 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://***/licenses/by/4.0/).
Multinomial Logistic Regression is a well-studied tool for classification and has been widely used in fields like image processing, computer vision, and bioinformatics, to name a few. Under a supervised classification scenario, a Multinomial Logistic Regression model learns a weight vector to differentiate between any two classes by optimizing over the likelihood objective. With the advent of big data, the inundation of data has resulted in large dimensional weight vectors and has also given rise to a huge number of classes, which makes the classical methods for model estimation computationally unviable. To handle this issue, we here propose a parallel iterative algorithm: Parallel Iterative Algorithm for MultiNomial LOgistic Regression (PIANO), which is based on the Majorization Minimization procedure and can update each element of the weight vectors in parallel. Further, we also show that PIANO can be easily extended to solve the Sparse Multinomial Logistic Regression problem, an extensively studied problem because of its attractive feature selection property. In particular, we work out the extension of PIANO to solve the Sparse Multinomial Logistic Regression problem with ℓ1 and ℓ0 regularizations. We also prove that PIANO converges to a stationary point of the Multinomial and the Sparse Multinomial Logistic Regression problems. Simulations were conducted to compare PIANO with the existing methods, and it was found that the proposed algorithm performs better than the existing methods in terms of speed of convergence. (C) 2022 Elsevier B.V. All rights reserved.
In this paper, we study submodular maximization under a matroid constraint in the adaptive complexity model. This model was recently introduced in the context of submodular optimization to quantify the information theoretic complexity of black-box optimization in a parallel computation model. Despite the burst in work on submodular maximization in the adaptive complexity model, the fundamental problem of maximizing a monotone submodular function under a matroid constraint has remained elusive. In particular, all known techniques fail for this problem and there are no known constant factor approximation algorithms whose adaptivity is sublinear in the rank of the matroid k or in the worst case sublinear in the size of the ground set n. We present an algorithm that has an approximation guarantee arbitrarily close to the optimal 1 - 1/e for monotone submodular maximization under a matroid constraint and has near-optimal adaptivity of O(log (n) log (k)). This result is obtained using a novel technique of adaptive sequencing, which departs from previous techniques for submodular maximization in the adaptive complexity model. In addition to our main result, we show how to use this technique to design other approximation algorithms with strong approximation guarantees and polylogarithmic adaptivity.
Computer-Generated Holography (CGH) algorithms simulate numerical diffraction and are applied in particular to holographic display technology. Due to the wave-based nature of diffraction, CGH is highly computationally intensive, making it especially challenging to drive high-resolution displays in real time. To this end, we propose a technique for efficiently calculating holograms of 3D line segments. We express the solutions analytically and devise an efficiently computable approximation suitable for massively parallel computing architectures. The algorithms are implemented on a GPU (with CUDA), and we obtain a 70-fold speedup over the reference point-wise algorithm with almost imperceptible quality loss. We report real-time frame rates for CGH of complex 3D line-drawn objects, and validate the algorithm both in a simulation environment and on a holographic display setup.
This paper proposes a synchronous parallel block coordinate descent algorithm for minimizing a composite function, which consists of a smooth convex function plus a non-smooth but separable convex function. Due to the generalization of the proposed method, some existing synchronous parallel algorithms can be considered as special cases. To tackle high dimensional problems, the authors further develop a randomized variant, which randomly updates some blocks of coordinates at each round of computation. Both proposed parallel algorithms are proven to have a sub-linear convergence rate under rather mild assumptions. The numerical experiments on solving the large scale regularized logistic regression with ℓ1 norm penalty show that the implementation is quite efficient. The authors conclude with an explanation of the observed experimental results and a discussion of potential improvements.
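To make the randomized variant concrete, here is a minimal sketch of one generic way to run randomized, synchronous block proximal updates on ℓ1-regularized logistic regression. It is an illustration under stated assumptions, not the authors' method: the step size uses a global Lipschitz bound, the blocks are equal contiguous slices, and every function name and parameter (`parallel_bcd`, `blocks_per_round`, and so on) is hypothetical. The per-block updates inside one round all read the same iterate, so they could be dispatched to separate workers; here they run in a plain loop.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def logistic_loss_grad(w, X, y):
    """Mean logistic loss and gradient for labels y in {-1, +1}."""
    z = y * (X @ w)
    loss = np.mean(np.logaddexp(0.0, -z))
    sigma_neg = 0.5 * (1.0 - np.tanh(0.5 * z))  # stable form of 1 / (1 + exp(z))
    grad = X.T @ (-y * sigma_neg) / len(y)
    return loss, grad

def parallel_bcd(X, y, lam, n_blocks=4, blocks_per_round=2, n_rounds=200, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    blocks = np.array_split(np.arange(d), n_blocks)
    # Assumed step size: 1/L, with L a Lipschitz bound on the full gradient.
    step = 1.0 / (0.25 * np.linalg.norm(X, 2) ** 2 / n)
    history = []
    for _ in range(n_rounds):
        loss, grad = logistic_loss_grad(w, X, y)
        history.append(loss + lam * np.abs(w).sum())
        chosen = rng.choice(n_blocks, size=blocks_per_round, replace=False)
        w_next = w.copy()
        for b in chosen:  # independent updates from the same iterate w
            idx = blocks[b]
            w_next[idx] = soft_threshold(w[idx] - step * grad[idx], step * lam)
        w = w_next
    return w, history
```

With this conservative step size the recorded objective does not increase from round to round; tighter block-wise step sizes would be faster but need a more careful analysis than this sketch attempts.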
The dark channel prior (DCP) algorithm has been widely used in the field of image defogging because of its simple theory and clear restoration results. However, the DCP algorithm has significant limitations. This study clarifies the relationship between halo artefacts and the size of the dark channel patch of the DCP algorithm and analyses the reason why the colour of close-range white objects appears distorted in the restored images. An amended DCP method is then proposed to solve these problems, utilising a locally variable weighted 4-directional L1 regularisation and a corresponding parallel algorithm to optimise the transmission. A deep neural network, 4DL1R-net, is then trained to further enhance the processing speed. Extensive experiments demonstrate that this method is effective. The proposed method can obtain clear details, maintain the natural clarity of images, and achieve significant improvements over state-of-the-art methods.
Delaunay Triangulation (DT) is one of the important geometric problems that is used in various branches of knowledge such as computer vision, terrain modeling, spatial clustering and networking. Kinetic data structures...
We study the problem of sampling a uniformly random directed rooted spanning tree, also known as an arborescence, from a possibly weighted directed graph. Classically, this problem has long been known to be polynomial...