We examine algorithmic aspects of M. Celia's alternating-direction scheme for finite-element collocation, especially as implemented for the two-dimensional advection-diffusion equation governing solute transport in groundwater. Collocation offers savings over other finite-element techniques by obviating the numerical quadrature and global matrix assembly ordinarily needed in Galerkin formulations. The alternating-direction approach offers further savings in storage and serial runtime and, significantly, yields highly parallel algorithms involving the solution of problems having only one-dimensional structure. We explore this parallelism.
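The appeal of the alternating-direction idea is that each sweep reduces the 2-D problem to a set of independent 1-D solves, one per grid line, each with tridiagonal structure. A minimal sketch in Python (hypothetical function names; a generic Thomas solver, not the authors' collocation code):

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system: sub-diagonal a, diagonal b,
    super-diagonal c, right-hand side d (Thomas algorithm).
    a[0] and c[-1] are ignored."""
    n = len(d)
    cp = np.zeros(n)
    dp = np.zeros(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.zeros(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def adi_sweep(rhs, a, b, c):
    """One x-direction sweep: every row of the 2-D grid is an
    independent 1-D tridiagonal solve, so the loop over rows can be
    distributed across processors with no communication."""
    return np.array([thomas(a, b, c, row) for row in rhs])
```

The row loop in `adi_sweep` is the parallelism the abstract refers to: each iteration touches only its own row of data.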
Computation of approximations is a critical step in applying rough set methodologies to knowledge discovery and data mining. As an extension of classical rough set theory, the Dominance-based Rough Set Approach (DRSA) can process information with preference-ordered attribute domains and can therefore be applied in multi-criteria decision analysis and related work. Efficiently computing approximations helps reduce the time needed to make decisions based on DRSA, and parallel computing is an effective way to speed up the computation. In this paper, several strategies for the decomposition and composition of granules in DRSA are proposed for computing approximations in parallel, and the corresponding parallel algorithm is designed. A numerical example is employed to validate the feasibility of these strategies. Experimental evaluations in a multi-core environment show that the parallel algorithm markedly reduces the time needed to compute approximations in DRSA. (C) 2015 Elsevier B.V. All rights reserved.
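The decomposition/composition idea can be sketched as follows (a toy illustration under stated assumptions, not the authors' algorithm): partition the universe of objects into chunks, let each worker compute the lower approximation of an upward union restricted to its chunk, and compose the partial results by set union.

```python
from concurrent.futures import ThreadPoolExecutor

def dominates(x, y):
    """x dominates y if x is at least as good on every criterion."""
    return all(xi >= yi for xi, yi in zip(x, y))

def lower_approx_chunk(objs, data, upward_union):
    """Lower approximation restricted to one chunk of objects: keep x
    whose dominating set D+(x) lies entirely inside the upward union."""
    out = set()
    for i in objs:
        dom = {j for j in range(len(data)) if dominates(data[j], data[i])}
        if dom <= upward_union:
            out.add(i)
    return out

def lower_approx_parallel(data, upward_union, n_chunks=4):
    """Decompose the universe into chunks, evaluate each chunk
    concurrently, then compose the partial results by union."""
    idx = list(range(len(data)))
    chunks = [idx[k::n_chunks] for k in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=n_chunks) as ex:
        parts = list(ex.map(
            lambda c: lower_approx_chunk(c, data, upward_union), chunks))
    return set().union(*parts)
```

Because membership of each object in the lower approximation is decided independently, the chunk computations need no coordination; only the final union is a composition step.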
The Spectral Statistical Interpolation (SSI) analysis system of NCEP is used to assimilate meteorological data from Global Positioning Satellite System (GPS/MET) refraction angles with the variational technique. Verification against radiosonde data shows that including GPS/MET observations in the analysis yields an overall improvement in the analysis variables of temperature, winds, and water vapor. However, the variational model with the ray-tracing method is quite expensive for numerical weather prediction and climate research. For example, about 4,000 GPS/MET refraction angles need to be assimilated to produce an ideal global analysis, and just one iteration of minimization takes more than 24 hours of CPU time on NCEP's Cray C90 computer. Although efforts have been made to reduce the computational cost, it is still prohibitive for operational data assimilation. In this paper, a parallel version of the three-dimensional variational data assimilation model for GPS/MET occultation measurements, suitable for massively parallel processor architectures, is developed. A divide-and-conquer strategy is used to achieve parallelism and is implemented with message passing. The authors present the principles behind the code's design and examine its performance on state-of-the-art parallel computers in China. The results show that the parallel model scales favorably as the number of processors is increased. With the Memory-IO technique implemented by the authors, the wall-clock time per iteration for assimilating 1420 refraction angles is reduced from 45 s to 12 s using 1420 processors. This suggests that the new parallelized code has the potential to be useful in numerical weather prediction (NWP) and climate studies.
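The divide-and-conquer structure of such an assimilation code can be sketched abstractly: each processor evaluates the misfit of its own share of observations, and the partial costs are summed in a reduction (in a real MPI implementation this sum would be an `Allreduce`). The sketch below is hypothetical, with a simple quadratic misfit standing in for the full variational cost function:

```python
import numpy as np

def local_cost(obs_chunk, model_chunk):
    """One processor's contribution: misfit of its own share of
    refraction-angle observations (illustrative quadratic form)."""
    r = obs_chunk - model_chunk
    return float(r @ r)

def distributed_cost(obs, model, n_ranks):
    """Divide-and-conquer: split observations across n_ranks workers
    and sum the partial costs. The sum is the only global operation,
    which is why the scheme scales well with processor count."""
    chunks_o = np.array_split(obs, n_ranks)
    chunks_m = np.array_split(model, n_ranks)
    return sum(local_cost(o, m) for o, m in zip(chunks_o, chunks_m))
```

Because each observation's contribution is independent, the decomposition introduces no approximation: the distributed cost equals the serial one exactly.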
In measurement processes such as optical interferometry and fringe projection, an important stage is fringe pattern analysis. Many advanced fringe analysis algorithms have been proposed, including regularized phase tracking (RPT), partial-differential-equation-based methods, the wavelet transform, the Wigner-Ville distribution, and the windowed Fourier transform. However, most of these algorithms are computationally expensive. MATLAB(R) is a general algorithm development environment with powerful image processing and other supporting toolboxes, and it is commonly used in photomechanical data analysis. With the rapid development of multicore CPU technology, using a multicore computer with MATLAB(R) is an intuitive and simple way to speed up algorithms for fringe pattern analysis. This paper introduces two acceleration approaches for fringe pattern processing. The first is task parallelism using a multicore computer and the MATLAB(R) Parallel Computing Toolbox; since some of the algorithms are embarrassingly parallel, this approach exploits that characteristic, with the parallelized windowed Fourier filtering (WFF) algorithm serving as an example of how the toolbox accelerates an algorithm. The second is data parallelism using a multicore computer and the MATLAB(R) Parallel Computing Toolbox: a high-level parallel wrapping structure is designed that can be used to speed up any local processing algorithm, with WFF, windowed Fourier ridges (WFR), and the median filter used as examples to illustrate the speedup. The results show that the parallel versions of the former sequential algorithms, with simple modifications, achieve speedups of up to 6.6 times. (C) 2009 Elsevier Ltd. All rights reserved.
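The data-parallel wrapper idea (shown here in Python rather than MATLAB, as a hypothetical sketch) is that any local operator can be applied to an image strip by strip, provided each strip carries a halo of neighboring rows at least as wide as the operator's support:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_tiles(img, func, n_strips=4, halo=1):
    """Generic data-parallel wrapper: split an image into horizontal
    strips padded by `halo` rows, apply the local operator `func` to
    each strip concurrently, then stitch the strip interiors back
    together. `func` must have support no wider than the halo."""
    h = img.shape[0]
    bounds = np.linspace(0, h, n_strips + 1, dtype=int)

    def work(k):
        lo, hi = bounds[k], bounds[k + 1]
        a, b = max(0, lo - halo), min(h, hi + halo)   # padded strip
        out = func(img[a:b])
        return out[lo - a: out.shape[0] - (b - hi)]   # drop the halo

    with ThreadPoolExecutor() as ex:
        strips = list(ex.map(work, range(n_strips)))
    return np.vstack(strips)
```

For a truly local operator the stitched result is identical to applying the operator to the whole image, which is what makes the wrapper reusable across WFF-style filters without per-algorithm changes.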
The research presented in this paper deals with explicit nonlinear finite element calculation with domain decomposition for vehicle crashworthiness simulation, which is very important for vehicle design. Parallel computing is an efficient way to speed up large-scale numerical simulation and extend its solving capability. In this paper, a cost-effective domain decomposition method based on contact balance is presented, the algorithm flowchart including contact computation is provided, and the parallel computing process and communication overhead are analyzed. Furthermore, the scalability of the parallel computing method on different hardware platforms, the SGI Onyx 3800 and the Shen Wei cluster, is studied. Finally, the effect of different domain decomposition strategies on the computing efficiency of vehicle crashworthiness simulation is presented. For end users, the results should provide a reference for vehicle design and for choosing appropriate hardware platforms and computing software.
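The contact-balance idea is essentially a load-balancing problem: each element carries both an element cost and a contact cost, and subdomains should receive roughly equal totals. A minimal sketch (a generic longest-processing-time heuristic, not the paper's method) assigns each weighted element to the currently lightest subdomain:

```python
import heapq

def balanced_partition(weights, n_parts):
    """Greedy LPT heuristic: weights[i] is the combined element +
    contact cost of element i; each element is assigned to the
    subdomain with the smallest current load."""
    heap = [(0.0, k, []) for k in range(n_parts)]
    heapq.heapify(heap)
    # Heaviest elements first, so late assignments only fine-tune loads.
    for i in sorted(range(len(weights)), key=lambda i: -weights[i]):
        load, k, items = heapq.heappop(heap)
        items.append(i)
        heapq.heappush(heap, (load + weights[i], k, items))
    return [items for _, _, items in sorted(heap, key=lambda t: t[1])]
```

Real crash codes refine this with graph partitioners that also minimize interface communication, but the load-balance objective is the same.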
This research presents preliminary results from the semantic retrieval research component of the Illinois Digital Library Initiative (DLI) project. Using a variation of automatic thesaurus generation techniques, which we refer to as the concept space approach, we aimed to create graphs of domain-specific concepts (terms) and their weighted co-occurrence relationships for all major engineering domains. Merging these concept spaces and providing traversal paths across different concept spaces could potentially help alleviate the vocabulary (difference) problem evident in large-scale information retrieval. We have experimented previously with such a technique in the smaller molecular biology domain (the Worm Community System, with 10+ MB of document collection) with encouraging results. To address the scalability issues of large-scale information retrieval and analysis for the current Illinois DLI project, we recently conducted experiments using the concept space approach on parallel supercomputers. Our test collection included 2+ GB of computer science and electrical engineering abstracts extracted from the INSPEC database. The concept space approach calls for extensive textual and statistical analysis (a form of knowledge discovery) based on automatic indexing and co-occurrence analysis algorithms, both previously tested in the biology domain. Initial testing results using a 512-node CM-5 and a 16-processor SGI Power Challenge were promising. The Power Challenge was later selected to create a comprehensive computer engineering concept space of about 270,000 terms and 4,000,000+ links using 24.5 hours of CPU time. Our system evaluation involving 12 knowledgeable subjects revealed that the automatically created computer engineering concept space achieved significantly higher concept recall than the human-generated INSPEC computer engineering thesaurus; however, the INSPEC thesaurus was more precise than the automatic concept space. Our current work mainly
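The core of the co-occurrence analysis can be illustrated with a toy sketch (hypothetical and much simplified relative to the full concept space weighting): count how often each term pair appears in the same document, then normalize into an asymmetric association weight W(a -> b) = co(a, b) / freq(a).

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs):
    """docs: list of term lists, one per document. Returns asymmetric
    weights W(a, b) = (# docs containing both a and b) / (# docs
    containing a); asymmetry lets a rare term point strongly at a
    common one without the reverse holding."""
    freq, co = Counter(), Counter()
    for terms in docs:
        ts = set(terms)                      # count each term once per doc
        freq.update(ts)
        for a, b in combinations(sorted(ts), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1
    return {(a, b): c / freq[a] for (a, b), c in co.items()}
```

At the scale reported in the abstract (millions of links), the pair-counting loop is the part that was distributed across processors, since documents can be counted independently and the counters merged afterward.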
The parallelization of the diagonalization step of the COLUMBUS MRSDCI program system is reported. A coarse-grain algorithm has been developed by means of a segmentation of the trial and resulting update vectors of the iterative Davidson scheme. Message passing based on the TCGMSG toolkit and the Global Array (GA) tools are used; the latter system allows asynchronous access to data structures in the spirit of shared memory. The importance of portable facilities such as GA, which go beyond message passing, is stressed for quantum chemical methods, and benchmark results for the Intel Touchstone Delta are given.
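The segmentation idea can be sketched abstractly: the dominant cost of each Davidson iteration is the matrix-vector product H v, and if each worker owns a block of rows of H, it produces the corresponding segment of the product independently; the segments are then gathered (with Global Arrays, a one-sided put). A minimal illustration (hypothetical names, dense numpy matrix standing in for the CI Hamiltonian):

```python
import numpy as np

def segmented_matvec(H, v, n_segments):
    """Coarse-grain parallel H @ v: each of n_segments workers owns a
    contiguous block of rows of H and computes its own segment of the
    product; concatenation plays the role of the gather step."""
    rows = np.array_split(np.arange(H.shape[0]), n_segments)
    return np.concatenate([H[r] @ v for r in rows])
```

Since only the trial vector v needs to be visible to every worker, the scheme's communication volume per iteration is small compared with the matvec work, which is what makes the coarse-grain decomposition effective.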
The vertex solution for estimating the static displacement bounds of structures with uncertain-but-bounded parameters is studied in this paper. For the linear static problem, when there are uncertain interval parameters in the stiffness matrix and the vector of applied forces, the static response may be an interval. Based on interval operations, the interval solution obtained by the vertex method is more accurate and more credible than that of other methods (such as the perturbation method). However, the vertex solution computed by traditional serial computing usually requires large computational effort, especially for large structures. To avoid this cost in computation and runtime, a parallel implementation suitable for large-scale computing is presented in this paper. Two parallel computing algorithms based on the vertex solution are proposed. The parallel approach can solve many interval problems that traditional interval analysis methods cannot.
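The vertex method is trivially parallel: with m interval parameters there are 2^m corner combinations, each requiring one independent linear solve, and the displacement bounds are the elementwise min/max over all corners. A minimal sketch (hypothetical `assemble` callback returning the stiffness matrix and load vector for a given parameter corner):

```python
import numpy as np
from itertools import product
from concurrent.futures import ThreadPoolExecutor

def vertex_bounds(assemble, intervals, n_workers=4):
    """Vertex method: solve K(p) u = f(p) at every corner p of the
    parameter box `intervals` (list of (lo, hi) pairs) and take the
    elementwise min/max of the displacements. The 2**m corner solves
    share no data, so they are distributed freely across workers."""
    vertices = list(product(*intervals))

    def solve_at(p):
        K, f = assemble(np.array(p))
        return np.linalg.solve(K, f)

    with ThreadPoolExecutor(max_workers=n_workers) as ex:
        sols = list(ex.map(solve_at, vertices))
    U = np.array(sols)
    return U.min(axis=0), U.max(axis=0)
```

The exponential number of corners is exactly why serial runtime becomes prohibitive for many parameters, and also why the speedup from parallelization is nearly ideal.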
The use of parallel computing in the finite element analysis of microwave heating applicators is discussed. Numerical results for a multiple feed cavity at 896 MHz and a cavity with a mode stirrer at 2.45 GHz are presented, and it is shown that for the two structures parallelism is most effectively introduced at different levels in the analysis.
In this work a computational procedure for the two-scale topology optimization problem using parallel computing techniques is developed. The goal is to obtain simultaneously the best structure and material, minimizing structural compliance. An algorithmic strategy is presented in a form suitable for parallelization. As for parallel computing facilities, an IBM Cluster 1350 is used, comprising 70 computing nodes each with two dual-core processors, for a total of 280 cores. Scalability studies are performed with mechanical structures of low to moderate dimensions. Finally, the applicability of the proposed methodology is demonstrated by solving a grand-challenge problem: the simulation of trabecular bone adaptation. (C) 2010 Civil-Comp Ltd and Elsevier Ltd. All rights reserved.