This article presents algorithms for temporal parallelization of Bayesian smoothers. We define the elements and the operators to pose these problems as the solutions to all-prefix-sums operations for which efficient parallel scan-algorithms are available. We present the temporal parallelization of the general Bayesian filtering and smoothing equations, and specialize them to linear/Gaussian models. The advantage of the proposed algorithms is that they reduce the linear complexity of standard smoothing algorithms with respect to time to logarithmic.
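As a rough illustration of the all-prefix-sums idea mentioned in the abstract above, the following is a minimal sketch of an inclusive scan over an arbitrary associative operator. It is not the article's algorithm: the elements and operators for Bayesian smoothing are defined in the article itself, and the names `inclusive_scan` and `compose_affine` are hypothetical. The sketch only shows why an associative operator allows all prefixes to be computed in about log2(n) rounds, since every update inside a round is independent of the others.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def inclusive_scan(items: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Hillis-Steele style inclusive scan; `op` must be associative.

    The outer loop runs about log2(n) times. Inside one round every update
    reads only values from the previous round, so on parallel hardware the
    inner loop could be executed for all i at once. Here it runs serially.
    """
    out = list(items)
    step = 1
    while step < len(out):
        prev = list(out)  # snapshot of the previous round
        for i in range(step, len(out)):
            # Left operand covers earlier elements, so order is preserved
            # even when `op` is not commutative.
            out[i] = op(prev[i - step], prev[i])
        step *= 2
    return out


Affine = Tuple[float, float]  # (a, b) stands for the map x -> a * x + b


def compose_affine(f: Affine, g: Affine) -> Affine:
    """Hypothetical example operator: apply f first, then g."""
    a1, b1 = f
    a2, b2 = g
    return (a2 * a1, a2 * b1 + b2)


if __name__ == "__main__":
    print(inclusive_scan([1, 2, 3, 4, 5], lambda a, b: a + b))
    print(inclusive_scan([(2, 1), (3, 0), (1, -4)], compose_affine))
```

Composition of affine maps is used here only because it is a simple associative, non-commutative operator; it stands in for whatever elements and operator a given model requires.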
In this paper, we study submodular maximization under a matroid constraint in the adaptive complexity model. This model was recently introduced in the context of submodular optimization to quantify the information theoretic complexity of black-box optimization in a parallel computation model. Despite the burst in work on submodular maximization in the adaptive complexity model, the fundamental problem of maximizing a monotone submodular function under a matroid constraint has remained elusive. In particular, all known techniques fail for this problem and there are no known constant factor approximation algorithms whose adaptivity is sublinear in the rank of the matroid k or in the worst case sublinear in the size of the ground set n. We present an algorithm that has an approximation guarantee arbitrarily close to the optimal 1 - 1/e for monotone submodular maximization under a matroid constraint and has near-optimal adaptivity of O(log (n) log (k)). This result is obtained using a novel technique of adaptive sequencing, which departs from previous techniques for submodular maximization in the adaptive complexity model. In addition to our main result, we show how to use this technique to design other approximation algorithms with strong approximation guarantees and polylogarithmic adaptivity.
Computer-Generated Holography (CGH) algorithms simulate diffraction numerically and are applied in particular to holographic display technology. Due to the wave-based nature of diffraction, CGH is highly computationally intensive, making it especially challenging to drive high-resolution displays in real time. To this end, we propose a technique for efficiently calculating holograms of 3D line segments. We express the solutions analytically and devise an efficiently computable approximation suitable for massively parallel computing architectures. The algorithms are implemented on a GPU (with CUDA), and we obtain a 70-fold speedup over the reference point-wise algorithm with almost imperceptible quality loss. We report real-time frame rates for CGH of complex 3D line-drawn objects, and validate the algorithm both in a simulation environment and on a holographic display setup.
In this paper, we propose and analyze a parallel Robin-Robin domain decomposition method based on the modified characteristic finite element method for the time-dependent dual-porosity-Navier-Stokes model with the Beavers-Joseph interface condition. We treat the coupling terms explicitly, using information obtained in previous time steps to construct a non-iterative domain decomposition method. As a result, only two single dual-porosity equations and a single Navier-Stokes equation need to be solved at each time step. In particular, we solve the Navier-Stokes equation by the modified characteristic finite element method, which avoids the computational inefficiency caused by the nonlinear convection term. Furthermore, we prove the convergence of the solution errors by mathematical induction; the proof implies the uniform L-infinity boundedness of the fully discrete velocity solution in conduit flow. Finally, some numerical examples are presented to show the effectiveness and efficiency of the proposed method.
This paper proposes a synchronous parallel block coordinate descent algorithm for minimizing a composite function, which consists of a smooth convex function plus a non-smooth but separable convex *** to the generalization of the proposed method, some existing synchronous parallel algorithms can be considered as special *** tackle high dimensional problems, the authors further develop a randomized variant, which randomly updates some blocks of coordinates at each round of *** proposed parallel algorithms are proven to have sub-linear convergence rate under rather mild *** numerical experiments on solving the large scale regularized logistic regression with ℓ1-norm penalty show that the implementation is quite *** authors conclude with an explanation of the observed experimental results and a discussion of potential improvements.
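To make the "smooth convex plus separable non-smooth" structure in the abstract above concrete, here is a minimal sketch of one synchronous proximal update for an ℓ1-norm penalty. This is plain proximal gradient descent with soft-thresholding, shown as a simple stand-in; it is not the block coordinate descent algorithm proposed in the paper. The function names, the step size and the toy objective are all assumptions chosen for illustration.

```python
from typing import Callable, List


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_step(
    x: List[float],
    grad: Callable[[List[float]], List[float]],
    step: float,
    lam: float,
) -> List[float]:
    """One synchronous update for min f(x) + lam * ||x||_1.

    All coordinates are updated from the same gradient evaluation.
    Because the penalty is separable, each coordinate's shrinkage is
    independent of the others and could be computed in parallel.
    """
    g = grad(x)
    return [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, g)]


if __name__ == "__main__":
    # Toy smooth part (an assumption): f(x) = 0.5 * ||x - c||^2, gradient x - c.
    c = [3.0, -0.5, -2.0]
    grad_f = lambda x: [xi - ci for xi, ci in zip(x, c)]
    print(prox_step([0.0, 0.0, 0.0], grad_f, step=1.0, lam=1.0))
```

For this toy objective a single step with unit step size already lands on the minimizer, which is each entry of `c` shrunk toward zero by `lam`.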
Delaunay Triangulation (DT) is one of the important geometric problems that is used in various branches of knowledge such as computer vision, terrain modeling, spatial clustering and networking. Kinetic data structures...
The dark channel prior (DCP) algorithm has been widely used in the field of image defogging because of its simple theory and clear restoration results. However, the DCP algorithm has significant limitations. This study clarifies the relationship between halo artifacts and the size of the dark channel patch of the DCP algorithm and analyses the reason why the colour of close-range white objects appears distorted in the restored images. An amended DCP method is then proposed to solve these problems, utilising a locally variable weighted 4-directional L-1 regularisation and a corresponding parallel algorithm to optimise the transmission. A deep neural network, 4DL(1)R-net, is then trained to further enhance the processing speed. Extensive experiments demonstrate that this method is effective. The proposed method can obtain clear details, maintain the natural clarity of images, and achieve significant improvements over state-of-the-art methods.
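For readers unfamiliar with the dark channel that the abstract above builds on, the following is a minimal sketch of the standard dark channel computation: the minimum over colour channels, followed by a minimum over a square patch. It does not implement the amended method, the regularisation or the network described in the abstract. The function name `dark_channel`, the `radius` parameter and the nested-list image layout are assumptions made for illustration.

```python
from typing import List, Sequence


def dark_channel(image: List[List[Sequence[float]]], radius: int) -> List[List[float]]:
    """Dark channel of an image stored as rows of (r, g, b) pixels.

    Step 1 takes the minimum over the colour channels of each pixel.
    Step 2 takes the minimum of that map over a (2 * radius + 1)-wide
    square patch, clipped at the image border.
    """
    height = len(image)
    width = len(image[0])
    pixel_min = [[min(px) for px in row] for row in image]
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            out[y][x] = min(pixel_min[j][i] for j in ys for i in xs)
    return out


if __name__ == "__main__":
    img = [
        [(0.9, 0.8, 0.7), (0.5, 0.6, 0.4)],
        [(0.3, 0.2, 0.9), (1.0, 1.0, 1.0)],
    ]
    print(dark_channel(img, radius=0))
    print(dark_channel(img, radius=1))
```

A larger `radius` spreads each dark value over a wider neighbourhood, so the patch size directly controls how coarse the resulting map is.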
We study the problem of sampling a uniformly random directed rooted spanning tree, also known as an arborescence, from a possibly weighted directed graph. Classically, this problem has long been known to be polynomial...
Face recognition has become a fundamental biometric tool for identifying people. Besides its high computational cost, identifying faces under ideal conditions as well as under general conditions remains an open problem. Though the advent of high-memory and inexpensive computer technologies has made the implementation of face recognition possible in several devices and authentication systems, achieving 100% face recognition in real time is still a challenging task. This paper implements an evolutionary genetic algorithm for optimizing the number of interest points on faces, intended to achieve quick and precise facial recognition using a local texture analysis technique applied to CBIR methodology. Our approach was evaluated using different databases, achieving facial recognition of up to 100% while considering only seven interest points from a total of 54 cited in the literature. The reduction in interest points was made possible by a parallel implementation of our approach on a 54-processor cluster, which executes a similar task up to 300% faster.
The paper presents parallel algorithms for calculating the exact value of all-terminal reliability of a network with unreliable edges and absolutely reliable nodes. A random graph is used as a model of such network. T...