Sequential minimal optimization (SMO) is a popular algorithm for training support vector machines (SVMs), but it still requires a large amount of computation time to solve large-scale problems. This paper proposes a parallel implementation of SMO for training SVMs. The parallel SMO is developed using the message passing interface (MPI). Specifically, the parallel SMO first partitions the entire training data set into smaller subsets and then runs multiple CPU processors simultaneously, each dealing with one of the partitioned subsets. Experiments show a great speedup on the Adult data set and the Modified National Institute of Standards and Technology (MNIST) data set when many processors are used, and satisfactory results on the Web data set.
A parallel version of sequential minimal optimization (SMO) is developed in this paper for fast training of support vector machines (SVMs). To date, SMO is one of the most popular algorithms for training SVMs, but it still requires a large amount of computation time to solve large-scale problems. The parallel SMO is developed based on the message passing interface (MPI). Unlike the sequential SMO, which handles all the training data points on one CPU processor, the parallel SMO first partitions the entire training data set into smaller subsets and then runs multiple CPU processors simultaneously, each dealing with one of the partitioned subsets. Experiments show a great speedup on the Adult data set, the MNIST data set and the IDEVAL data set when many processors are used, and satisfactory results on the Web data set. This work is very useful for research where machines with multiple CPU processors are available. (c) 2006 Elsevier B.V. All rights reserved.
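The partition-then-parallelize step described above can be sketched in plain Python (a hypothetical illustration of the data distribution only, not the authors' MPI code); each MPI rank would receive one contiguous block of the training set:

```python
def partition(data, num_procs):
    """Split a training set into num_procs nearly equal contiguous blocks,
    as a data-parallel SMO would distribute them (one block per MPI rank)."""
    n = len(data)
    base, extra = divmod(n, num_procs)
    blocks, start = [], 0
    for rank in range(num_procs):
        size = base + (1 if rank < extra else 0)  # early ranks take the remainder
        blocks.append(data[start:start + size])
        start += size
    return blocks

# Example: 10 training points spread over 3 "processors"
blocks = partition(list(range(10)), 3)
# blocks -> [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

In the real implementation each block would be sent to its rank with an MPI scatter, and the per-subset SMO results combined afterwards.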
This work describes a parallel divide-and-conquer Delaunay triangulation scheme. The algorithm finds the affected zone, which covers the part of the triangulation that may be modified when two sub-block triangulations are merged. Finding the affected zone can reduce the amount of data that must be transmitted between processors. The time complexity of the divide-and-conquer scheme remains O(n log n), and the affected zone can be located in O(n) time steps, where n denotes the number of points. The code was implemented in C, FORTRAN and MPI, making it portable to many computer systems. Experimental results on an IBM SP2 show that a parallel efficiency of 44-95% for general distributions can be attained on a 16-node distributed-memory system. Copyright (c) 2006 John Wiley & Sons, Ltd.
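The "divide" phase of such a scheme can be sketched as recursive bisection of the point set (a minimal illustration under assumed names; the merge phase, where the affected zone matters, is omitted):

```python
def split_blocks(points, max_size):
    """Recursively bisect a 2-D point set by median x-coordinate until each
    sub-block holds at most max_size points -- the 'divide' phase of a
    divide-and-conquer Delaunay triangulation. Each sub-block would be
    triangulated by one processor before pairwise merging."""
    if len(points) <= max_size:
        return [points]
    pts = sorted(points)            # lexicographic: by x, then y
    mid = len(pts) // 2
    return split_blocks(pts[:mid], max_size) + split_blocks(pts[mid:], max_size)

blocks = split_blocks([(x, x % 3) for x in range(8)], 2)
# 8 points -> 4 sub-blocks of 2 points each
```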
This paper first introduces the structure and working principle of a DSP-based parallel system, its parallel accelerating board and the SHARC DSP chip. It then investigates the system's programming characteristics, especially its mode of communication, discusses how to design parallel algorithms, and presents a domain-decomposition-based complete multi-grid parallel algorithm with virtual boundary forecast (VBF) to solve large-scale and complicated heat problems. Finally, the Mandelbrot set and a non-linear heat transfer equation of a ceramic/metal composite material are taken as examples to illustrate the implementation of the proposed algorithm. The results show that the solutions are highly efficient and achieve linear speedup.
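The domain-decomposition idea can be illustrated with a 1-D relaxation sweep split across two subdomains that exchange boundary values through ghost cells (a minimal sketch with hypothetical names; VBF goes further by *forecasting* the boundary value to reduce communication, whereas here it is exchanged exactly, so the split sweep reproduces the undecomposed one):

```python
def jacobi_step(u):
    """One Jacobi relaxation sweep on the interior of a 1-D grid;
    boundary values are held fixed."""
    return [u[0]] + [(u[i-1] + u[i+1]) / 2 for i in range(1, len(u)-1)] + [u[-1]]

def decomposed_step(u, cut):
    """Advance the same grid split into two subdomains at index `cut`.
    Each subdomain carries one ghost cell holding its neighbour's boundary
    value, so the result matches the undecomposed sweep exactly."""
    left = u[:cut+1]            # ghost cell: u[cut]
    right = u[cut-1:]           # ghost cell: u[cut-1]
    new_left = jacobi_step(left)
    new_right = jacobi_step(right)
    return new_left[:-1] + new_right[1:]   # drop ghost cells when stitching

u = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
assert decomposed_step(u, 3) == jacobi_step(u)
```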
With the rapid development of high-speed network technology, cluster systems have become the main platform of parallel computing. Because of their high communication delay, some fine-grained parallel algorithms are not fit for running in this environment, so it is necessary to study their parallel implementations on cluster systems. To that end, a new way of computing the QR decomposition of a matrix is proposed and a coarse-grained parallel algorithm is designed. In designing these parallel methods, a separation principle was followed: the original matrices were divided into blocks, and each block was distributed to a node machine, which runs its subtask in parallel with the others. This suits the cluster system well, since it requires little communication. Finally, a simulation was carried out; the results obtained show that the designed parallel algorithm achieves much higher speedup on the cluster system.
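The block structure can be sketched with a tiny modified Gram-Schmidt QR processed one column block at a time (an illustration with assumed names, not the paper's algorithm; here the blocks run in sequence, whereas the coarse-grained scheme would assign each block to a node machine):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def block_mgs_qr(cols, block):
    """Modified Gram-Schmidt orthogonalisation over column blocks: each block
    of columns is the unit of work a node machine would own. Returns the
    orthonormal Q columns (R is omitted for brevity)."""
    q = []
    for start in range(0, len(cols), block):
        for col in cols[start:start + block]:
            v = list(col)
            for u in q:                              # remove earlier directions
                c = dot(u, v)
                v = [vi - c * ui for vi, ui in zip(v, u)]
            norm = dot(v, v) ** 0.5
            q.append([vi / norm for vi in v])
    return q

q = block_mgs_qr([[1.0, 0.0], [1.0, 1.0]], block=1)
# q -> [[1.0, 0.0], [0.0, 1.0]]
```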
Fractal video compression is a relatively new video compression method. Its attraction is due to its high compression ratio and simple decompression algorithm, but its computational complexity is high, and as a result parallel algorithms on high-performance machines become one way out. In this study we partition the matching search, which occupies the majority of the work in a fractal video compression process, into small tasks and implement them in two distributed computing environments, one using DCOM and the other using .NET Remoting technology, based on a local area network consisting of loosely coupled PCs. Experimental results show that the parallel algorithm is able to achieve a high speedup in these distributed environments. (c) 2005 Elsevier Inc. All rights reserved.
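Partitioning the matching search into small tasks amounts to chunking the range-block × domain-block comparison space (a hypothetical sketch of the task decomposition only; the DCOM/.NET Remoting dispatch is not shown):

```python
def make_tasks(num_range_blocks, num_domain_blocks, task_size):
    """Split the fractal matching search -- every range block compared
    against every candidate domain block -- into fixed-size lists of
    (range, domain) pairs that independent workers can process."""
    pairs = [(r, d) for r in range(num_range_blocks)
                    for d in range(num_domain_blocks)]
    return [pairs[i:i + task_size] for i in range(0, len(pairs), task_size)]

# 3 range blocks x 4 domain blocks = 12 comparisons, chunked 5 at a time
tasks = make_tasks(num_range_blocks=3, num_domain_blocks=4, task_size=5)
# tasks -> 3 task lists of sizes 5, 5 and 2
```

Each task list would be shipped to a remote worker, which returns the best-matching domain block per range block in its chunk.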
Generally speaking, task scheduling in multiprocessor systems is NP-hard even under strictly simplifying assumptions. In this paper, we develop a parallel ACO (ant colony optimization) to solve multiprocessor scheduling on distributed-memory machines. Based on the message passing interface, multiple sub-ant-colonies evolve separately and interchange information every fixed k iterations to enhance the search ability of the ants. The experimental results show that the proposed algorithm performs better in solution quality as well as in scalability.
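The periodic-exchange structure can be sketched with a toy stand-in (an assumption-laden illustration: each "colony" is a plain random search minimizing f(x) = x², and ACO's pheromone mechanics are omitted; only the exchange-every-k-iterations pattern is shown):

```python
import random

def parallel_search(num_colonies, iters, k, seed=0):
    """Each 'colony' improves its own best point independently; every k
    iterations all colonies adopt the globally best point found so far
    (the information-interchange step). Returns the final best f-value."""
    rng = random.Random(seed)
    best = [rng.uniform(-10, 10) for _ in range(num_colonies)]
    for it in range(1, iters + 1):
        for c in range(num_colonies):
            cand = best[c] + rng.uniform(-1, 1)      # local move
            if cand * cand < best[c] * best[c]:      # keep if it improves f
                best[c] = cand
        if it % k == 0:                              # interchange step
            g = min(best, key=lambda x: x * x)
            best = [g] * num_colonies
    return min(x * x for x in best)

score = parallel_search(num_colonies=4, iters=50, k=10)
```

In the real algorithm each sub-colony would run on its own MPI rank and the interchange would be a collective communication of the best tours.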
Dimension-adaptive sparse grid interpolation is a powerful tool for obtaining surrogate functions of smooth, medium- to high-dimensional objective models. In the case of expensive models, the efficiency of the sparse grid algorithm is governed by the time required for the function evaluations. In this paper, we first briefly analyze the inherent parallelism of the standard dimension-adaptive algorithm. Then, we present an enhanced version of the standard algorithm that permits, in each step, a specified number of function evaluations (equal to the number of desired processes) to be executed in parallel, thereby increasing the parallel efficiency.
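The per-step parallel evaluation can be sketched with a thread pool (hypothetical names throughout; `objective` stands in for the expensive model, and the real tool would dispatch actual model runs rather than a toy function):

```python
from concurrent.futures import ThreadPoolExecutor

def objective(x):
    """Stand-in for an expensive model evaluation at a grid point x."""
    return sum(xi * xi for xi in x)

def evaluate_batch(points, num_workers):
    """Evaluate one step's batch of sparse-grid points in parallel; the
    enhanced algorithm sizes this batch to match the number of desired
    processes so that no worker idles."""
    with ThreadPoolExecutor(max_workers=num_workers) as ex:
        return list(ex.map(objective, points))

vals = evaluate_batch([(0.0, 0.0), (0.5, 0.5), (1.0, 0.0)], num_workers=3)
# vals -> [0.0, 0.5, 1.0]
```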
A scalable fast parallel sorting algorithm on a linear array with a reconfigurable pipelined optical bus system (LARPBS) is presented. The algorithm improves ***'s fast parallel sorting algorithm on the LARPBS, which uses N processors to sort N elements in O(N) time on average or in O(log N) time optimally. We illustrate that the algorithm can sort N elements in O(N log N / p) time in the best case and in O(N/p) time in the worst case using p (p ≤ N) processors, and hence show that the algorithm is highly scalable. We also present a fast and efficient parallel sorting algorithm on the LARPBS which uses N processors and runs in O(log N) time in the best case and O(N) time in the worst case.
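The generic split-sort-merge shape behind a p-processor sort can be sketched as follows (a plain-Python illustration, not the LARPBS algorithm: the optical-bus communication and pipelined compare operations are what give the stated bounds, and they are not modeled here):

```python
import heapq

def p_way_sort(data, p):
    """Sort N elements the way a p-processor scheme would structure the work:
    split into p blocks, sort each block independently (the per-processor
    work), then merge the p sorted runs (done here serially with a heap)."""
    n = len(data)
    size = -(-n // p)                        # ceil(n / p) elements per block
    runs = [sorted(data[i:i + size]) for i in range(0, n, size)]
    return list(heapq.merge(*runs))

out = p_way_sort([5, 3, 8, 1, 9, 2, 7], p=3)
# out -> [1, 2, 3, 5, 7, 8, 9]
```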