In our previous work, a parallelized sequential minimal optimization (SMO) algorithm was proposed; it executed successfully, but its convergence could not be guaranteed in some cases. This paper proposes an improved version that avoids falling into endless loops. In the proposed method, multiple violating pairs are selected at each step, and the decrease in the objective function determines whether a single-pair or a multiple-pair update is performed. Experimental results show that the proposed method is more effective than the previous methods: the parallel algorithm runs well, accuracy is maintained, and convergence is guaranteed.
ISBN: 9781424423798 (print)
The new parallel incremental support vector machine (SVM) algorithm aims at classifying very large datasets on graphics processing units (GPUs). SVMs and related kernel methods have been shown to build accurate models, but the learning task usually requires solving a quadratic program, so training on large datasets demands large memory capacity and a long time. We extend the Least Squares SVM (LS-SVM) proposed by Suykens and Vandewalle to build an incremental, parallel algorithm. The new algorithm uses graphics processors to gain high performance at low cost. Numerical test results on datasets from the UCI and Delve repositories showed that our parallel incremental algorithm using GPUs is about 65 times faster than a CPU implementation and often significantly more than 1000 times faster than the state-of-the-art algorithms LibSVM, SVM-perf and CB-SVM.
In X-ray computed tomography (CT), X-rays are used to obtain the projection data needed to generate an image of the inside of an object. The image can be generated with different techniques. Algebraic methods are better suited to reconstructing images with high contrast and precision under noisy conditions and from a small number of projections. They may be important in portable scanners because they remain usable in emergency situations. In practice, however, these methods are not widely used because of their high computational cost. In this work we analyze and propose the use of the PETSc library to make the best use of a system in the parallel reconstruction of images. We also compare the quality of the images reconstructed with two methods: analytical filtered back projection (FBP) and iterative LSQR.
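To make the idea of an algebraic reconstruction concrete, the sketch below treats the image as an unknown vector and each ray as one linear equation, then solves the system by cyclic row projections (the Kaczmarz method, the classical form of ART). This is a swapped-in illustration: it is not LSQR, it does not use PETSc, and the function name `kaczmarz`, the parameters `sweeps` and `relax`, and the toy 2x2 geometry are all assumptions made for the example.

```python
def kaczmarz(rows, b, sweeps=200, relax=1.0):
    """Approximately solve rows * x = b by cyclic row projections (ART).

    rows : list of rays, each a list of per-pixel weights (hypothetical layout)
    b    : measured projection value for each ray
    """
    n = len(rows[0])
    x = [0.0] * n  # start from an empty image
    norms = [sum(a * a for a in row) for row in rows]
    for _ in range(sweeps):
        for row, bi, nrm in zip(rows, b, norms):
            if nrm == 0.0:
                continue  # a ray that crosses no pixel carries no information
            residual = bi - sum(a * xi for a, xi in zip(row, x))
            step = relax * residual / nrm
            # Project the current image onto the hyperplane of this ray.
            x = [xi + step * a for xi, a in zip(x, row)]
    return x


if __name__ == "__main__":
    # Toy 2x2 image flattened row-major; rays are the two row sums,
    # the two column sums and the two diagonals.
    image = [1.0, 2.0, 3.0, 4.0]
    rays = [
        [1, 1, 0, 0], [0, 0, 1, 1],
        [1, 0, 1, 0], [0, 1, 0, 1],
        [1, 0, 0, 1], [0, 1, 1, 0],
    ]
    measured = [sum(a * p for a, p in zip(ray, image)) for ray in rays]
    print(kaczmarz(rays, measured))
```

Every update touches only one ray, which is why the cost grows quickly with image size and ray count, and why such solvers are usually run on parallel sparse linear algebra libraries.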
This paper develops a framework based on distributed model predictive control to solve the optimal consensus problem for constrained multi-agent systems. Taking both transient performance and the final consensus state into consideration, the optimization problem in each prediction horizon is formulated as a coupled optimization problem containing a non-separable cost function with constraints. An efficient distributed algorithm is proposed, with distributed convergence conditions on the auxiliary parameters, in which each agent solves its subproblems in parallel iterations, improving the running efficiency of the algorithm. The stability of the closed-loop system is analyzed, providing distributed stability conditions that depend only on the local information of each agent. Numerical simulations verify the theoretical results by applying the proposed approach to a multi-robot system.
Based on the real geological background of the Tarim foreland basin, we build a 3D seismic data volume of the Tarim area using the 3D arbitrary difference precise integration (ADPI) algorithm. This is very beneficial for the processing and interpretation of 3D seismic data from the Tarim area. Compared with the conventional finite-difference method, the 3D ADPI algorithm greatly improves precision by using a local integral semi-analytical method in the time domain to obtain the recursion operator of the wave equations. We also adopt stability factor constraints, which make the calculation much more stable. By using an improved adaptive absorbing boundary and parallelizing the serial program, the time required for 3D forward modeling is greatly reduced. In this research we generated an omnidirectional seismic data volume of 300 shots. The whole data volume is approximately 2T. Comparing the geological model with actual seismic records, we find that 3D ADPI forward modeling can accurately show the structure and layer information of the geological model. In complex regions it can describe the geological structure and seismic physical parameters such as amplitude, frequency and phase.
ISBN: 9781509004799 (print)
We present parallel GPU implementations of a heuristic for solving the Vehicle Routing Problem (VRP) with a single depot and with multiple depots. To our knowledge, this is the first GPU implementation of this class of heuristics. Our solution for the classical VRP computes an initial solution (tours) in parallel and then iteratively improves the costs of all pairs of neighboring tours. The multi-depot case is solved by decomposing the problem into several independent basic VRPs that we solve in parallel. Experimental results obtained under CUDA show that the proposed implementations efficiently exploit the parallelism and the power of the GPU.
To improve video coding efficiency, sub-pixel motion estimation (ME) is used extensively in the existing video coding *** quarter pixel ME is one of the high complexity tools in H.264/*** this paper, a parallel quarter block motion estimation algorithm that not only accelerates the process of sub-pixel motion estimation but also maintains the same accuracy as the original algorithm is *** Intel P4 CPU, the SIMD (single instruction multiple data) technique is commonly used to provide an execution *** implementation of this algorithm using parallel processing on P4 platform is *** proposed algorithm satisfies in particular the requirements of low-rate real-time video *** results show that the optimized video encoder is more than 13.5 times faster than the original reference software while approximately preserving its accuracy.
Many efficient systolic algorithms in block computation of digital filters have been *** highly concurrent structures can be implemented from these block systolic computation *** the performance of these highly concurrent structures is described by the total computation time and the total number of processors needed for each algorithm. A comparison with the sequential algorithms is summarized in tables of speedup rate and efficiency for each block algorithm.
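For reference, speedup rate and efficiency are conventionally defined as below. These are the standard textbook definitions, and the symbols are assumed rather than taken from the abstract: $T_1$ is the computation time of the sequential algorithm, $T_p$ the computation time of the concurrent structure, and $p$ the number of processors it uses.

```latex
S_p = \frac{T_1}{T_p}, \qquad E_p = \frac{S_p}{p} = \frac{T_1}{p \, T_p}
```

An efficiency close to 1 means the processors are almost fully used; a structure that needs many processors for a modest speedup has a low $E_p$.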
Listing all the maximal cliques of an undirected graph is considered an NP-complete problem; even the fastest existing algorithm for listing all the maximal cliques needs exponential time, and these algorithms cannot be restructured into parallel processing algorithms, so they are also unsuited to finding all maximal cliques of a complex undirected graph with a large number of maximal cliques or *** paper aims to develop a parallelizable algorithm for listing all maximal cliques of the large size complex undirected *** we gave some definitions, including adjacent sub-graph, suspended sub-graph and so on, then adopted a partition and recursive strategy to partition a parent-graph equivalently into smaller sub-graphs, so we can get all maximal cliques of the parent-graph by recursively listing all maximal cliques of *** have developed a single-thread and a multi-thread program of the algorithm by using Java ***, we compared the performance of our algorithm on the DIMACS benchmark data set with the existing main *** results show that our single-thread program performs basically on par with existing algorithms, but the multi-thread program has better performance than the existing *** our method is to divide the complex problem into some smaller sub-problems and achieve the solution of the original problem by solving these sub-problems, our algorithm is easy to implement using multiple threads, and also easy to extend to a parallel algorithm, which has better feasibility and practicability in solving large-scale complex graphs.
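The general idea of splitting clique listing into independent sub-problems can be sketched as follows. This is not the authors' partition into adjacent and suspended sub-graphs, which the abstract does not specify; it substitutes the well-known Bron-Kerbosch algorithm with pivoting and one sub-problem per vertex of a fixed ordering. The function names and the dict-of-sets graph representation are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def bron_kerbosch(adj, r, p, x, out):
    """Bron-Kerbosch with pivoting; appends each maximal clique to `out`."""
    if not p and not x:
        out.append(frozenset(r))
        return
    # Pivot on the vertex covering most candidates to prune branches.
    pivot = max(p | x, key=lambda u: len(adj[u] & p))
    for v in list(p - adj[pivot]):
        bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v], out)
        p = p - {v}
        x = x | {v}


def cliques_rooted_at(adj, order, i):
    """Sub-problem i: maximal cliques whose earliest vertex in `order` is order[i]."""
    v = order[i]
    later = set(order[i + 1:])
    earlier = set(order[:i])
    out = []
    bron_kerbosch(adj, {v}, adj[v] & later, adj[v] & earlier, out)
    return out


def all_maximal_cliques(adj, workers=4):
    """adj: symmetric dict mapping each vertex to the set of its neighbours."""
    order = sorted(adj)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda i: cliques_rooted_at(adj, order, i),
                         range(len(order)))
        return {clique for part in parts for clique in part}


if __name__ == "__main__":
    # Two triangles joined by the edge 3-4.
    graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
             4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
    for clique in sorted(all_maximal_cliques(graph), key=sorted):
        print(sorted(clique))
```

Each sub-problem reads the shared graph but writes only its own result list, so the sub-problems need no synchronization. In CPython the thread pool mainly illustrates that independence; real speedup would need processes or a runtime with parallel threads, such as the Java implementation the abstract mentions.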
In this paper, we consider the planar multi-facility Weber problem with restricted zones and non-Euclidean distances, propose an algorithm based on the probability changing method (a special kind of genetic algorithm), and prove its efficiency for approximately solving this problem by replacing the continuous coordinate values with discrete ones. A version of the algorithm for multiprocessor systems is proposed. Experimental results for a high-performance cluster are given.