A two-level hierarchical parallelization scheme for second-order Møller-Plesset perturbation (MP2) theory within the divide-and-conquer method is presented. The scheme combines coarse-grain parallelization, which assigns each subsystem to a group of processors, with fine-grain parallelization, in which the computational tasks for evaluating the MP2 correlation energy of the assigned subsystem are distributed among the processors in the group. Test calculations demonstrate that the present scheme shows high parallel efficiency and makes MP2 calculations practical for very large molecules. (C) 2011 Wiley Periodicals, Inc. J Comput Chem 32: 2756-2764, 2011
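The two-level idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the static round-robin mapping, the function name `two_level_assignment`, and the representation of a subsystem as a plain task count are all assumptions made for illustration.

```python
def two_level_assignment(n_procs, n_groups, subsystem_tasks):
    """Map (subsystem, task) pairs to processor ranks in two levels.

    subsystem_tasks[s] is the (hypothetical) number of independent tasks
    needed for subsystem s. Returns a dict: rank -> list of (subsystem, task).
    """
    if n_procs % n_groups != 0:
        raise ValueError("n_procs must be divisible by n_groups")
    group_size = n_procs // n_groups
    plan = {rank: [] for rank in range(n_procs)}
    for s, n_tasks in enumerate(subsystem_tasks):
        # Coarse grain: each subsystem goes to one processor group (round-robin).
        group = s % n_groups
        for t in range(n_tasks):
            # Fine grain: the subsystem's tasks are spread over the group's ranks.
            rank = group * group_size + t % group_size
            plan[rank].append((s, t))
    return plan


if __name__ == "__main__":
    # 4 processors in 2 groups; three subsystems with 3, 2 and 1 tasks.
    for rank, work in two_level_assignment(4, 2, [3, 2, 1]).items():
        print(rank, work)
```

A production code would more likely balance the load dynamically, since subsystems differ in cost; the static mapping here only shows how the two levels nest.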
In today's health care, imaging plays an important role throughout the entire clinical process, from diagnostics and treatment planning to surgical procedures and follow-up studies. Since most imaging modalities have gone directly digital, with continually increasing resolution, medical image processing has to face the challenges arising from large data volumes. In this paper, we discuss kilo- to terabyte challenges regarding (i) medical image management and image data mining, (ii) bioimaging, (iii) virtual reality in medical visualizations and (iv) neuroimaging. Due to the increasing amount of data, image processing and visualization algorithms have to be adjusted. Scalable algorithms and advanced parallelization techniques using graphics processing units have been developed and are summarized in this paper. While such techniques cope with the kilo- to terabyte challenge, the petabyte level is already looming on the horizon. For this reason, medical image processing remains a vital field of research.
A new multi-objective optimizer based on swarm intelligence is presented in this article. A distinctive feature of the proposed particle swarm optimizer (PSO) is the utilization of only social components, which are based on global guides, for the exploration and exploitation of the search space. Mutation and elitism are also employed in order to improve the effectiveness of the PSO. The algorithmic parameters are controlled via an on-line adaptive scheme. The algorithm is further developed to co-evolve multiple swarms. The investigation of various multi-objective optimization problems reveals that the proposed PSO is able to converge fast and in a robust manner towards the true Pareto-optimal front. Comparisons with results obtained from other multi-objective optimizers are presented. A parametric investigation is performed in order to exploit the potential of the proposed co-evolutionary algorithm for parallelization. The results obtained from a hydrofoil design optimization problem demonstrate near-linear speedup and high parallel efficiency.
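The notion of a Pareto-optimal front used in the abstract above rests on a dominance test between objective vectors. The following is a minimal sketch of that test, assuming all objectives are minimized; the function names are hypothetical and the code is not the optimizer described in the article.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization assumed)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated points, preserving input order (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    candidates = [(1, 4), (2, 2), (3, 3), (4, 1)]
    print(pareto_front(candidates))
```

In a multi-objective PSO, the global guides are typically drawn from an archive of such non-dominated solutions.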
In this paper, we present an efficient parallel multilevel fast multipole algorithm (MLFMA) for three-dimensional scattering problems of large-scale objects. Several parallel implementation techniques are discussed and analyzed. First, we propose a method that reduces the truncation number without loss of accuracy. Second, a matrix-slicing technique, which allows data in memory to be moved to the hard disk, is applied in order to handle extremely large targets. Finally, a transition-level scheme is adopted to improve the parallel efficiency. We demonstrate the capability of our code by considering a sphere of 220 lambda discretized with 48,879,411 unknowns and a square patch of 200 lambda discretized with 10,150,143 unknowns. The bistatic RCS is calculated within 41.5 GB of memory for the first object and 14.7 GB for the second.
The integral image can be used to quickly complete common pixel-level operations over rectangular regions of a grey-level image, so it has been widely used in computer vision and pattern recognition. In this paper, we first present an intuitive parallel method to compute the integral image. Then, building on the intuitive method, a two-stage method based on a binary tree is introduced. In each stage of the algorithm, we perform first a top-down and then a bottom-up traversal of the tree. Finally, we analyze the case of large-scale grey-level images and optimize the computation for the CUDA architecture. Experiments on consumer-level PC hardware show that the GPU-based algorithm outperforms the corresponding CPU-based algorithm in speed for large-scale images.
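As background for the abstract above, the integral image itself can be sketched as two passes of prefix sums. This is the standard row-then-column decomposition, written serially in plain Python; it is not the binary-tree or CUDA method of the paper, and the function names are assumptions. Each row in the first pass and each column in the second is independent, which is what makes the computation a natural fit for parallel hardware.

```python
def integral_image(img):
    """Summed-area table of a 2D list: ii[i][j] = sum of img[0..i][0..j]."""
    rows, cols = len(img), len(img[0])
    # Pass 1: running sums along each row (rows are independent).
    ii = []
    for row in img:
        acc, out = 0, []
        for value in row:
            acc += value
            out.append(acc)
        ii.append(out)
    # Pass 2: running sums down each column (columns are independent).
    for j in range(cols):
        for i in range(1, rows):
            ii[i][j] += ii[i - 1][j]
    return ii


def region_sum(ii, top, left, bottom, right):
    """Sum over the inclusive rectangle using at most four table lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total


if __name__ == "__main__":
    table = integral_image([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(region_sum(table, 1, 1, 2, 2))
```

Once the table is built, any rectangular sum costs a constant number of lookups regardless of the rectangle's size.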
In X-ray computed tomography (CT), X-rays are used to obtain the projection data needed to generate an image of the inside of an object. The image can be generated with different techniques. Algebraic methods are more suitable for reconstructing images with high contrast and precision in noisy conditions and from a small number of projections. Their use may be important in portable scanners because of their functionality in emergency situations. However, in practice, these methods are not widely used due to the high computational cost of their implementation. In this work we analyze and propose the use of the PETSc library to make optimal use of a system in the parallel reconstruction of images. A quality comparison of the images reconstructed with both methods, analytical filtered back projection (FBP) and iterative LSQR, has also been performed.
Based on the actual geological setting of the Tarim foreland basin, we build a 3D seismic data volume for the Tarim area using the 3D arbitrary difference precise integration (ADPI) algorithm, which benefits the processing and interpretation of 3D seismic data from the area. Compared with conventional difference methods, the 3D ADPI algorithm greatly improves precision by using a local, semi-analytical integration in the time domain to obtain the recursion operator of the wave equations. We also adopt stability factor constraints, which make the calculation considerably more stable. With an improved adaptive absorbing boundary and parallelization of the serial program, the run time of 3D forward modeling is greatly reduced. In this study we obtained an omnidirectional seismic data volume of 300 shots; the whole data volume is approximately 2 TB. Comparing the geological model with actual seismic records, we find that 3D ADPI forward modeling accurately reproduces the structure and layer information of the geological model. In complex regions it can describe the geological structure and seismic parameters such as amplitude, frequency and phase.
Based on a study of the master-slave parallel computational model and the basic theory of the GMRES algorithm in Krylov subspaces, this paper proposes an improved parallel Predict-Correct GMRES(m) algorithm, which follows a predict-correct pattern, and presents computational examples for linear systems of equations. After a comparison with the results of the new parallel Predict-Correct GMRES(m) algorithm, an application to thin plate structures is given; it shows that the designed parallel algorithm can reduce the number of iterations, shorten the computing time and obtain
In this paper, a parallelisable Simulated Annealing with Genetic Enhancement (SAwGE) algorithm is presented and applied to the Permutation Flowshop Scheduling Problem with the total flowtime criterion. This problem is proved to be NP-complete in the strong sense for more than one machine. SAwGE is based on a Clustering algorithm for Simulated Annealing (SA), but introduces a new mechanism for dynamic adjustment of SA parameters, based on genetic algorithms. Computational experiments, based on 120 benchmark datasets by Taillard, show that SAwGE outperforms other heuristics and metaheuristics recently presented in the literature. Moreover, SAwGE obtains 118 best solutions, including 81 newly discovered ones. (C) 2010 Elsevier Ltd. All rights reserved.
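The total flowtime criterion named in the abstract above is the sum of the jobs' completion times on the last machine. A minimal sketch of that objective follows, assuming the usual permutation flowshop model in which every job visits the machines in the same order; the function name and the tiny two-job instance are hypothetical, and this is only the objective function, not the SAwGE algorithm.

```python
def total_flowtime(perm, proc):
    """Total flowtime of a job permutation in a permutation flowshop.

    proc[j][k] is the processing time of job j on machine k.
    """
    n_machines = len(proc[0])
    prev = [0] * n_machines  # completion times of the previous job per machine
    total = 0
    for job in perm:
        cur = [0] * n_machines
        for k in range(n_machines):
            # A job starts on machine k once the machine is free (prev[k])
            # and the job has finished on machine k - 1.
            ready = cur[k - 1] if k > 0 else 0
            cur[k] = max(prev[k], ready) + proc[job][k]
        total += cur[-1]  # completion time on the last machine
        prev = cur
    return total


if __name__ == "__main__":
    times = [[2, 3], [1, 2]]  # two jobs, two machines (made-up data)
    print(total_flowtime([0, 1], times), total_flowtime([1, 0], times))
```

A metaheuristic such as simulated annealing would call a function like this on each candidate permutation, so its cost per evaluation matters.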
A new parallel algorithm has been developed for calculating the analytic energy derivatives of full-accuracy second-order Møller-Plesset perturbation theory (MP2). Its main projected application is the optimization of geometries of large molecules in which noncovalent interactions play a significant role. The algorithm is based on the recently developed two-step MP2 energy calculation algorithm and is implemented in the quantum chemistry program GAMESS. Timings are presented for test calculations on taxol (C47H51NO14) with the 6-31G and 6-31G(d) basis sets (660 and 1032 basis functions, 328 correlated electrons) and luciferin (C11H8N2O3S2) with aug-cc-pVDZ and aug-cc-pVTZ (530 and 1198 basis functions, 92 correlated electrons). The taxol 6-31G(d) calculations are also performed with up to 80 CPU cores. The results demonstrate the high parallel efficiency of the program. (c) 2007 Wiley Periodicals, Inc.