ISBN: 0780390504 (print)
Because MODIS data are large in volume and multispectral, correcting them with conventional remote sensing (RS) software takes a long time and requires frequent I/O operations. This paper proposes a parallel geometric correction algorithm for MODIS data based on a triangulation network. The input images are divided into several pieces, and each CPU processes one piece independently. Results from an implementation on a cluster system show that the parallel algorithm greatly improves the efficiency of geometric correction.
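The piece-wise decomposition described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `split_rows`, `correct_piece`, and `correct_parallel` are hypothetical names, the image is a plain list of rows, and the per-pixel "correction" is a placeholder column shift standing in for the triangulation-based mapping. A thread pool only stands in for the cluster; on a real cluster each piece would go to a separate process or node.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(n_rows, n_pieces):
    """Return contiguous (start, stop) row ranges of nearly equal size."""
    base, extra = divmod(n_rows, n_pieces)
    ranges, start = [], 0
    for i in range(n_pieces):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def correct_piece(image, rows, shift=1):
    """Placeholder correction (assumption): nearest-neighbour resampling
    with a fixed column shift. A real corrector would map each output
    pixel through the triangulation network instead."""
    start, stop = rows
    width = len(image[0])
    return [
        [image[r][min(c + shift, width - 1)] for c in range(width)]
        for r in range(start, stop)
    ]


def correct_parallel(image, n_workers=4):
    """Correct each row strip independently, then stitch the strips."""
    pieces = split_rows(len(image), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(lambda rows: correct_piece(image, rows), pieces)
    return [row for piece in results for row in piece]
```

Because each strip reads the shared input but writes only its own output rows, the pieces need no communication until the final stitch, which is what makes this decomposition attractive on a cluster.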
We present an extension of the klystron simulation code TESLA to model multiple-beam klystrons (MBKs) in which interaction parameters may vary significantly from beam-tunnel to beam-tunnel. In earlier work, the single-beam code was applied to model the MBK by assuming that all electron beams and beam-tunnels were identical and all electron beams interacted identically with the fields of the resonant cavities, using averaged values of R/Q to represent interaction with each resonant cavity. To overcome the limitations of this approach and to take into account the effects from nonidentical beams and/or beam-tunnels, we have modified the code to use a parallel algorithm for multiple beams. The implementation of the parallel version of TESLA is based on the latest Fortran-95 version of the serial code and uses the message-passing interface library for communication. For testing and verification purposes, the new version of the code is applied to simulate an experimental four-cavity, eight-beam klystron amplifier, which was designed and successfully tested last year at the Naval Research Laboratory. The results of modeling using the new parallel TESLA and their comparison with experimental data are discussed in detail.
Fluid-structure interaction (FSI) problems bring together critical aspects of both fluid dynamics and structural dynamics. In this research, the crash of a three-dimensional flexible fluid-filled drum is simulated with the multi-material arbitrary Lagrangian-Eulerian (ALE) finite-element method, chosen for its ability to control the mesh geometry independently of the material geometry. The ALE description is adopted for the fluid domain, and the Lagrangian formulation for the structural domain. The FSI and the crash contact between the drum and the ground are computed with a penalty-based coupling method. The dynamic behaviour of the drum during the crash is then analysed, and parallelization is discussed because the FSI and crash-contact computations are quite time-consuming. Based on domain decomposition, the recursive coordinate bisection (RCB) method is improved to account for how computation time is distributed in a crashing fluid-filled container. The results indicate that, compared with the standard RCB method, the improved method achieves better speedup and parallel efficiency.
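For readers unfamiliar with recursive coordinate bisection, the following Python sketch shows the basic idea with per-element cost weights. It is an illustration only: the abstract does not specify how the improved RCB works, so the weighting scheme here is an assumption, and `rcb` and `_bisect` are hypothetical names.

```python
def rcb(points, weights, n_parts):
    """Partition point indices into up to n_parts groups of similar weight."""
    return _bisect(points, weights, list(range(len(points))), n_parts)


def _bisect(points, weights, ids, n_parts):
    if n_parts == 1 or len(ids) <= 1:
        return [ids]
    dim = len(points[0])

    def extent(d):
        values = [points[i][d] for i in ids]
        return max(values) - min(values)

    # Cut perpendicular to the coordinate axis with the largest extent.
    axis = max(range(dim), key=extent)
    ordered = sorted(ids, key=lambda i: points[i][axis])

    # The left half receives left_parts / n_parts of the total weight.
    left_parts = n_parts // 2
    target = sum(weights[i] for i in ordered) * left_parts / n_parts

    # Walk along the axis until the accumulated weight reaches the target;
    # the cut always leaves at least one element on each side.
    acc, cut = 0.0, len(ordered) - 1
    for k, i in enumerate(ordered[:-1]):
        acc += weights[i]
        if acc >= target:
            cut = k + 1
            break

    return (_bisect(points, weights, ordered[:cut], left_parts)
            + _bisect(points, weights, ordered[cut:], n_parts - left_parts))
```

With unit weights this reduces to plain RCB, which balances element counts. Giving expensive elements (for example, those involved in contact or coupling) a larger weight shifts the cut so that each part carries a similar estimated cost rather than a similar number of elements.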
The fast multipole method (FMM) and smooth particle mesh Ewald (SPME) are well-known fast algorithms for evaluating long-range electrostatic interactions in molecular dynamics and other fields. FMM is a multi-scale method that reduces the computational cost by approximating the potential due to a group of particles at a large distance using a few multipole functions. This algorithm scales as O(N) for N particles. SPME is an O(N ln N) method based on interpolating the Fourier-space part of the Ewald sum and evaluating the resulting convolutions with the fast Fourier transform (FFT). Both algorithms suffer from relatively poor efficiency on large parallel machines, especially for mid-size problems of a few hundred thousand atoms. A variation of the FMM based on plane wave expansions, called PWA, is presented in this paper. A new parallelization strategy for PWA, which takes advantage of the specific form of this expansion, is described. Its parallel efficiency is compared with that of SPME through detailed timing measurements on two different computer clusters. © 2008 Elsevier Inc. All rights reserved.
ALBERTA, a sequential adaptive finite-element toolbox, is widely used in scientific and engineering computation, especially in the numerical simulation of electromagnetics. Its sequential nature, however, has become the bottleneck when solving large-scale problems. We therefore develop a parallel adaptive finite-element package based on ALBERTA, using ParMETIS and PETSc. The package can handle any problem that ALBERTA can solve, and it is suitable for distributed-memory parallel computers, including PC clusters. In this paper, we present the implementation of the package in detail and address several key algorithms and parallelization strategies. Finally, numerical experiments demonstrate the performance and scalability of the package.
Balanced data placement and good resource utilization are essential to a distributed spatial database system. This paper presents an efficient parallel spatial query algorithm that takes the organization of spatial data into account. The algorithm adopts a balanced spatial data partitioning strategy for distributed spatial databases and, following the characteristics of that partitioning, builds a packed R-tree as its index. The strategy also considers how computation is distributed: because the index is replicated at every site, different sites can access different entries of the same index node at the same time. Based on this organization of spatial data, the algorithm can execute query operations at different sites simultaneously in both the filter phase and the refinement phase, which improves spatial query performance. To solve the multiple-path search problem caused by the R-tree index, the algorithm introduces a global stack to buffer temporary index nodes, which handles this problem flexibly in distributed spatial databases. For simplicity, the paper discusses the parallel algorithm in two-dimensional space. Experiments on many real datasets show improved performance across various spatial query operations.
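The two-phase query that this abstract refers to can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: objects are circles `(cx, cy, r)`, the query is an axis-aligned window `(xmin, ymin, xmax, ymax)`, the partitioning is a simple sort-and-cut by x-coordinate, and a linear scan replaces the packed R-tree and global stack. Function names are hypothetical, and a thread pool stands in for the distributed sites.

```python
from concurrent.futures import ThreadPoolExecutor


def mbr(circle):
    """Minimum bounding rectangle of a circle (cx, cy, r)."""
    cx, cy, r = circle
    return (cx - r, cy - r, cx + r, cy + r)


def boxes_overlap(a, b):
    """True if two (xmin, ymin, xmax, ymax) rectangles intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def circle_hits_window(circle, window):
    """Exact test: distance from the centre to the window is at most r."""
    cx, cy, r = circle
    nx = min(max(cx, window[0]), window[2])
    ny = min(max(cy, window[1]), window[3])
    return (cx - nx) ** 2 + (cy - ny) ** 2 <= r * r


def partition(objects, n_sites):
    """Sort by x-centre and cut into chunks of nearly equal object count."""
    ordered = sorted(objects, key=lambda c: c[0])
    size = max(1, -(-len(ordered) // n_sites))  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def site_query(local_objects, window):
    """Run both phases on one site's share of the data."""
    # Filter phase: cheap bounding-box test, may yield false positives.
    candidates = [c for c in local_objects if boxes_overlap(mbr(c), window)]
    # Refinement phase: exact geometry test on the surviving candidates.
    return [c for c in candidates if circle_hits_window(c, window)]


def window_query(objects, window, n_sites=2):
    """Query every site concurrently and merge the partial results."""
    sites = partition(objects, n_sites)
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        partial = pool.map(lambda objs: site_query(objs, window), sites)
    return [c for part in partial for c in part]
```

A circle centred at (3, 3) with radius 1 shows why refinement is needed: its bounding box touches the window (0, 0, 2, 2) at a corner, so it passes the filter, but the circle itself does not reach the window.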
With the rapid development of high-speed network technology, cluster systems have become the main platform of parallel *** of their high communication delay, some fine-grained parallel algorithms are not suited to this environment. Therefore, it is necessary to study their parallel implementation in cluster *** terms of that, this paper targets the internal parallelism of the GMRES(m) method for solving systems of linear equations and obtains coarse-grained parallel algorithms; moreover, we implement the method using *** last, an example shows that the designed parallel algorithm achieves a much higher speedup on this cluster system.
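As background for this abstract, here is a minimal serial sketch of restarted GMRES, GMRES(m), in pure Python. It is not the paper's parallel algorithm; it only shows where the work lies. The matrix-vector product and the inner products in the Arnoldi loop are the kernels a parallel version would distribute, and the inner products are also the synchronization points that make fine-grained versions sensitive to communication delay. The function name and the dense list-of-rows matrix format are assumptions.

```python
import math


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gmres_restarted(A, b, m=5, tol=1e-10, max_restarts=50):
    """Solve A x = b, restarting the Krylov basis every m steps."""
    x = [0.0] * len(b)
    for _ in range(max_restarts):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        beta = math.sqrt(dot(r, r))
        if beta < tol:
            break
        V = [[ri / beta for ri in r]]  # orthonormal Krylov basis
        H = []                         # rotated Hessenberg columns
        cs, sn = [], []                # Givens rotation coefficients
        g = [beta]                     # rotated right-hand side
        k = 0
        for j in range(m):
            w = matvec(A, V[j])                  # parallelizable kernel
            h = []
            for i in range(j + 1):               # modified Gram-Schmidt
                hij = dot(w, V[i])               # global reduction
                h.append(hij)
                w = [wk - hij * vk for wk, vk in zip(w, V[i])]
            hnext = math.sqrt(dot(w, w))
            for i in range(j):                   # apply earlier rotations
                t = cs[i] * h[i] + sn[i] * h[i + 1]
                h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
                h[i] = t
            denom = math.hypot(h[j], hnext)
            if denom == 0.0:
                break
            c, s = h[j] / denom, hnext / denom   # rotation zeroing hnext
            cs.append(c)
            sn.append(s)
            h[j] = denom
            g.append(-s * g[j])
            g[j] = c * g[j]
            H.append(h)
            k = j + 1
            if abs(g[k]) < tol or hnext == 0.0:  # |g[k]| is the residual norm
                break
            V.append([wk / hnext for wk in w])
        # Back substitution on the k-by-k upper triangular system.
        y = [0.0] * k
        for i in range(k - 1, -1, -1):
            acc = g[i] - sum(H[col][i] * y[col] for col in range(i + 1, k))
            y[i] = acc / H[i][i]
        x = [xi + sum(y[j] * V[j][i] for j in range(k))
             for i, xi in enumerate(x)]
    return x
```

For example, `gmres_restarted([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]], [6.0, 10.0, 8.0], m=2)` should approach the solution `[1, 2, 3]` after several restart cycles.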
A parallel randomized support vector machine (PRSVM) and a parallel randomized support vector regression (PRSVR) algorithm based on a randomized sampling technique are proposed in this paper. The proposed PRSVM and PRSVR have four major advantages over previous methods. (1) We prove that the proposed algorithms achieve an average convergence rate that is, to the best of our knowledge, the fastest bounded convergence rate among all SVM decomposition training algorithms so far. The fast average convergence bound is achieved by a unique priority-based sampling mechanism. (2) Unlike previous work (Provably fast training algorithm for support vector machines, 2001), the proposed algorithms work for general linearly nonseparable SVM and general non-linear SVR problems. This improvement is achieved by modeling new LP-type problems based on the Karush-Kuhn-Tucker optimality conditions. (3) The proposed algorithms are the first parallel versions of randomized sampling algorithms for SVM and SVR. Both the analytical convergence bound and the numerical results in a real application show that the proposed algorithms scale well. (4) We demonstrate the algorithms on both synthetic data and data obtained from a real-world application. Performance comparisons with SVMlight show that the proposed algorithms can be implemented efficiently.
ISBN: 9783540928584 (print)
In this paper, we present a parallel multilevel ILU preconditioner implemented with OpenMP. We employ METIS partitioning algorithms to decompose the computation into concurrent tasks, which are then scheduled to threads. Concretely, we combine decompositions that yield significantly more tasks than processors with dynamic scheduling strategies in order to reduce thread idle time, which is shown to be the main source of overhead in our parallel algorithm. Experimental results on a shared-memory platform consisting of 16 processors report remarkable performance for our approach.
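The benefit of combining many small tasks with dynamic scheduling can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not the paper's OpenMP code: task costs are given numbers, the static policy assigns contiguous blocks of tasks up front (in the spirit of OpenMP's `schedule(static)`), the dynamic policy hands the next task to whichever thread becomes free first (in the spirit of `schedule(dynamic)`), and task dependencies are ignored. The function names are hypothetical.

```python
import heapq


def static_makespan(costs, n_threads):
    """Finish time when each thread gets one contiguous block of tasks."""
    base, extra = divmod(len(costs), n_threads)
    finish, start = [], 0
    for t in range(n_threads):
        stop = start + base + (1 if t < extra else 0)
        finish.append(sum(costs[start:stop]))
        start = stop
    return max(finish)


def dynamic_makespan(costs, n_threads):
    """Finish time when the earliest-free thread takes the next task."""
    free_at = [0.0] * n_threads  # time at which each thread becomes idle
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)
```

With costs `[8, 1, 1, 1, 1, 1, 1, 2]` on two threads, the static split finishes at time 11 while one thread sits idle from time 5 onward; the dynamic policy finishes at time 8 because the second thread absorbs all the small tasks while the first works on the large one.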