A new parallel algorithm has been developed for second-order Møller-Plesset perturbation theory (MP2) energy calculations. Its main projected applications are for large molecules, for instance, for the calculation of dispersion interaction. Tests on a moderate number of processors (2-16) show that the program has high CPU and parallel efficiency. Timings are presented for two relatively large molecules, taxol (C47H51NO14) and luciferin (C11H8N2O3S2), the former with the 6-31G* and 6-311G** basis sets (1032 and 1484 basis functions, 164 correlated orbitals), and the latter with the aug-cc-pVDZ and aug-cc-pVTZ basis sets (530 and 1198 basis functions, 46 correlated orbitals). An MP2 energy calculation on C130H10 (1970 basis functions, 265 correlated orbitals) completed in less than 2 h on 128 processors. (c) 2006 Wiley Periodicals, Inc.
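The quantity this algorithm parallelizes has a compact closed form. As a point of reference, here is a minimal serial sketch of the closed-shell MP2 correlation energy using small synthetic orbital energies and integrals, not the paper's parallel implementation:

```python
import numpy as np

# Closed-shell MP2 correlation energy from MO-basis two-electron
# integrals (ia|jb) in chemists' notation. Synthetic data only:
# sizes, orbital energies, and integrals are invented for illustration.
rng = np.random.default_rng(0)
n_occ, n_vir = 3, 5

# Occupied energies below virtual ones, so every denominator
# e_i + e_j - e_a - e_b is strictly negative.
e_occ = np.sort(rng.uniform(-2.0, -0.5, n_occ))
e_vir = np.sort(rng.uniform(0.5, 2.0, n_vir))

# Synthetic (ia|jb) tensor with the (i,a)<->(j,b) exchange symmetry.
g = rng.normal(size=(n_occ, n_vir, n_occ, n_vir)) * 0.05
g = 0.5 * (g + g.transpose(2, 3, 0, 1))

denom = (e_occ[:, None, None, None] + e_occ[None, None, :, None]
         - e_vir[None, :, None, None] - e_vir[None, None, None, :])

# E_MP2 = sum_{ijab} (ia|jb) [2 (ia|jb) - (ib|ja)] / (e_i + e_j - e_a - e_b)
e_mp2 = np.sum(g * (2.0 * g - g.transpose(0, 3, 2, 1)) / denom)
print(f"MP2 correlation energy (synthetic): {e_mp2:.6f}")
```

The dominant cost in practice is producing the (ia|jb) tensor from atomic-orbital integrals, which is the step a parallel MP2 code must distribute.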
Given a natural number e, an addition chain for e is a finite sequence of numbers having the following properties: (1) the first number is one, (2) every element is the sum of two earlier elements, and (3) the given number occurs at the end of the sequence. We introduce a fast optimal algorithm to generate a chain of short length for an n-bit number e. The algorithm is based on the right-to-left binary strategy and a barrel-shifter circuit, and runs on the exclusive-read exclusive-write (EREW) parallel random-access machine model.
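The right-to-left binary strategy can be sketched serially: collect the powers of two up to the top bit of e together with the running sums of the set-bit powers. This yields a short (not necessarily optimal) chain; the barrel shifter and PRAM parallelism of the paper are omitted.

```python
def addition_chain(e: int) -> list[int]:
    """Right-to-left binary addition chain for e: powers of two up to
    the top bit plus running sums of the set-bit powers. A sketch of
    the strategy named in the abstract, not the paper's algorithm."""
    chain = {1}
    acc = 0              # running sum of set-bit powers seen so far
    power = 1            # current power of two in the right-to-left scan
    n = e
    while n:
        if n & 1:
            acc += power
            chain.add(acc)
        n >>= 1
        if n:
            power *= 2   # doubling step 2^k -> 2^(k+1)
            chain.add(power)
    return sorted(chain)

def is_addition_chain(chain: list[int], e: int) -> bool:
    """Check properties (1)-(3) from the definition above."""
    if chain[0] != 1 or chain[-1] != e:
        return False
    return all(any(a + b == c for a in chain[:k] for b in chain[:k])
               for k, c in enumerate(chain) if k > 0)

print(addition_chain(23))   # [1, 2, 3, 4, 7, 8, 16, 23]
```

For 23 = 10111 in binary, the chain interleaves the doublings 1, 2, 4, 8, 16 with the partial sums 3, 7, 23 of the set-bit powers.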
Currently, a tremendous amount of space debris in Earth's orbit imperils operational spacecraft. It is essential to undertake risk assessments of collisions and predict dangerous encounters in space. However, collision predictions for an enormous amount of space debris give rise to large-scale computations. In this paper, a parallel algorithm is established on the Compute Unified Device Architecture (CUDA) platform of NVIDIA Corporation for collision prediction. According to the parallel structure of NVIDIA graphics processors, a block decomposition strategy is adopted in the algorithm. Space debris is divided into batches, and the computation and data transfer operations of adjacent batches overlap. As a consequence, the latency to access shared memory during the entire computing process is significantly reduced, and a higher computing speed is reached. Theoretically, a collision-prediction simulation can be executed for any number of debris objects and any time span. To verify this algorithm, a simulation example including 1382 pieces of debris, whose operational time scales vary from 1 min to 3 days, is conducted on an NVIDIA Tesla C2075 GPU. The simulation results demonstrate that with the same computational accuracy as that of a CPU, the computing speed of the parallel algorithm on a GPU is 30 times that on a CPU. Based on this algorithm, collision prediction of over 150 Chinese spacecraft for a time span of 3 days can be completed in less than 3 h on a single computer, which meets the timeliness requirement of the initial screening task. Furthermore, the algorithm can be adapted for multiple tasks, including particle filtration, constellation design, and Monte-Carlo simulation of an orbital computation. (C) 2017 COSPAR. Published by Elsevier Ltd. All rights reserved.
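The batch-wise block decomposition can be illustrated on a CPU with synthetic straight-line trajectories. All names, units, and the 50 km screening threshold below are illustrative assumptions; the real algorithm propagates orbits and overlaps host-device transfers with GPU computation.

```python
import numpy as np

# Screen debris against a spacecraft path in fixed-size batches,
# flagging any object whose minimum distance drops below a threshold.
rng = np.random.default_rng(1)
n_debris, n_steps, batch = 1000, 64, 256

t = np.linspace(0.0, 1.0, n_steps)                     # screening window (illustrative)
sc = np.stack([7000.0 + 0 * t, 10.0 * t, 0 * t], -1)   # spacecraft path, km

pos0 = np.array([7000.0, 0.0, 0.0]) + rng.uniform(-200, 200, (n_debris, 3))
vel = rng.uniform(-20, 20, (n_debris, 3))              # drift per window, illustrative

threshold = 50.0                                       # close-approach distance, km
alerts = []
for start in range(0, n_debris, batch):                # block decomposition into batches
    p0, v = pos0[start:start + batch], vel[start:start + batch]
    pos = p0[:, None, :] + v[:, None, :] * t[None, :, None]   # (batch, n_steps, 3)
    dmin = np.linalg.norm(pos - sc[None, :, :], axis=-1).min(axis=1)
    alerts.extend(int(i) for i in np.flatnonzero(dmin < threshold) + start)

print(f"{len(alerts)} debris objects pass within {threshold} km")
```

On a GPU, each batch would be launched as a kernel while the next batch's state vectors are transferred, which is the overlap the abstract credits for the speedup.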
Accurate long-term runoff forecasting is crucial for managing and allocating water resources. Due to the complexity and variability of natural runoff, the most difficult problems currently faced by long-term runoff forecasting are the difficulty of model construction, poor prediction accuracy, and time-intensive forecasting processes. Therefore, this study proposes a hybrid long-term runoff forecasting framework that uses the antecedent inflow and specific meteorological factors as the inputs, is modeled by ensemble empirical mode decomposition (EEMD) coupled with an artificial neural network (ANN), and is computed by a parallel algorithm. First, the framework can transform monthly inflow and meteorological series into stationary signals via EEMD to more comprehensively explore the relationships of the input factors through the ANN. Second, the selected meteorological factors that are closely related to inflow formation can be filtered out by the single correlation coefficient method, which contributes to reducing coupling between input factors and increases the accuracy of the prediction models. Finally, a multicore parallel algorithm that is easily accessed everywhere and that fully utilizes multiple calculation resources while flexibly contending with various optimization requirements will improve forecasting efficiency. The Xiaowan Hydropower Station (XW) is selected as the study area, and the final results of the study show that (1) the addition of targeted meteorological factors does indeed greatly enhance the performance of the prediction models; (2) the five criteria for evaluating the prediction accuracy show that the EEMD-ANN model is far superior in prediction performance to the ordinary ANN model when run under the same input conditions; and (3) the optimization time of the 32-core model can be reduced by as much as 25 times, which significantly saves time during the forecast process.
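The single-correlation-coefficient screening step can be sketched as follows; the series, factor names, and the 0.5 cutoff are illustrative assumptions, not values from the study.

```python
import numpy as np

# Keep only meteorological factors whose Pearson correlation with the
# inflow series is strong enough. All data here are synthetic.
rng = np.random.default_rng(2)
n = 240                                   # 20 years of monthly values

inflow = np.sin(np.arange(n) * 2 * np.pi / 12) + 0.3 * rng.normal(size=n)
factors = {
    "rainfall":    inflow + 0.2 * rng.normal(size=n),      # strongly related
    "temperature": np.roll(inflow, 1) + 0.8 * rng.normal(size=n),
    "pressure":    rng.normal(size=n),                     # unrelated noise
}

def screen_factors(target, candidates, threshold=0.5):
    """Keep factors whose |Pearson r| with the target exceeds threshold."""
    kept = {}
    for name, series in candidates.items():
        r = np.corrcoef(target, series)[0, 1]
        if abs(r) > threshold:
            kept[name] = r
    return kept

selected = screen_factors(inflow, factors)
print(sorted(selected))
```

The surviving factors become the ANN inputs alongside the antecedent inflow, which is the decoupling effect the abstract describes.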
This paper presents a very high-speed image processing algorithm applied to multi-faceted asymmetric radiation from the edge (MARFE) detection on the Joint European Torus. The algorithm was built in serial and parallel versions and written in C/C++ using OpenCV, cvBlob, and LibSVM libraries. The code implemented was characterized by its accuracy and run-time performance. The final result of the parallel version achieves a correct detection rate of 97.6% for MARFE identification and an image processing rate of more than 10,000 frames per second. The parallel version divides the image processing chain into two groups and seven tasks. One group is responsible for the Background Image Estimation and Image Binarization modules, and the other is responsible for Region Feature Extraction and Pattern Classification. At the same time, and to maximize the workload distribution, the parallel code uses data parallelism and pipeline strategies for these two groups, respectively. A master thread is responsible for opening, signaling, and transferring images between both groups. The algorithm has been tested on a dedicated Intel symmetric-multiprocessing computer architecture with a Linux operating system.
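The two-group structure with a master thread feeding frames can be sketched as a thread-and-queue pipeline. Frames are stand-in integers here, and the binarization and classifier stages are placeholders for the OpenCV/cvBlob/LibSVM processing.

```python
import queue
import threading

# Group A (background estimation + binarization) feeds group B
# (feature extraction + classification) through a queue; the main
# thread plays the master role of transferring frames.
frames_in = queue.Queue()
between = queue.Queue()
results = []

def group_a():
    while True:
        frame = frames_in.get()
        if frame is None:                 # poison pill: propagate shutdown
            between.put(None)
            return
        between.put(frame % 2)            # stand-in for binarization

def group_b():
    while True:
        binary = between.get()
        if binary is None:
            return
        results.append("MARFE" if binary else "no-MARFE")  # stand-in classifier

threads = [threading.Thread(target=group_a), threading.Thread(target=group_b)]
for th in threads:
    th.start()
for frame in range(8):                    # master thread transfers frames
    frames_in.put(frame)
frames_in.put(None)
for th in threads:
    th.join()
print(results)
```

In the real system, group A would additionally be data-parallel over frames and group B pipelined over its tasks, which this two-thread toy collapses into one stage each.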
Permutation generation is an important problem in combinatorial computing. In this paper we present an optimal parallel algorithm to generate all N! permutations of N objects. The algorithm is designed to be executed on a very simple computation model, a linear array of N identical processors. Because of the simplicity and regularity of the processors, the model is very suitable for VLSI implementation. Another advantageous characteristic of this design is that it can generate all the permutations in minimal change order.
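Minimal change order means consecutive permutations differ by a single swap of adjacent elements. A serial sketch of a generator with that property, the classic Steinhaus-Johnson-Trotter scheme rather than the paper's linear-array design, is:

```python
def sjt_permutations(n):
    """Steinhaus-Johnson-Trotter: yield all n! permutations of 1..n so
    that consecutive permutations differ by one adjacent transposition."""
    perm = list(range(1, n + 1))
    direction = [-1] * n                  # every element initially looks left
    yield perm.copy()
    while True:
        # find the largest mobile element: its neighbour in its
        # direction exists and is smaller
        mobile, idx = -1, -1
        for i, v in enumerate(perm):
            j = i + direction[i]
            if 0 <= j < n and perm[j] < v and v > mobile:
                mobile, idx = v, i
        if mobile == -1:
            return                        # no mobile element: done
        j = idx + direction[idx]
        perm[idx], perm[j] = perm[j], perm[idx]
        direction[idx], direction[j] = direction[j], direction[idx]
        for i, v in enumerate(perm):      # reverse directions of larger elements
            if v > mobile:
                direction[i] = -direction[i]
        yield perm.copy()

perms = list(sjt_permutations(3))
print(perms)   # [[1,2,3], [1,3,2], [3,1,2], [3,2,1], [2,3,1], [2,1,3]]
```

The adjacent-swap property is exactly what makes the method natural on a linear processor array: each step only exchanges data between neighbouring processors.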
This paper presents a simple yet effective algorithm to improve an arbitrary Poisson disk sampling to reach the maximal property, i.e., no more Poisson disks can be inserted. Taking a non-maximal Poisson disk sampling as input, our algorithm efficiently detects the regions allowing additional samples and then generates Poisson disks in these regions. The key idea is to convert the complicated plane or space searching problem into a simple search on circles or spheres, which is one dimension lower than the original sampling domain. Our algorithm is memory-efficient and flexible, generating maximal Poisson disk samplings in an arbitrary 2D polygon or 3D polyhedron. Moreover, our parallel algorithm can be extended from Euclidean space to curved surfaces in an intrinsic manner. Thanks to its parallel structure, our method can be implemented easily on modern graphics hardware. We have observed a significant performance improvement compared to existing techniques. (C) 2013 Elsevier Ltd. All rights reserved.
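The dimension-reduction idea, searching circles instead of the whole plane, can be sketched in 2D on the unit square: any admissible insertion point lies on some circle of radius r around an existing sample, so we discretize those circles and insert greedily. The radius, circle discretization, and insertion order are illustrative choices, not the paper's scheme.

```python
import numpy as np

r = 0.2                                    # Poisson disk radius (illustrative)
samples = [np.array([0.5, 0.5])]           # deliberately non-maximal input set

def admissible(p, pts):
    """p may be inserted: inside the unit square and >= r from every sample."""
    return (0.0 <= p[0] <= 1.0 and 0.0 <= p[1] <= 1.0
            and all(np.linalg.norm(p - q) >= r - 1e-9 for q in pts))

angles = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
inserted = True
while inserted:                            # repeat until no circle point fits
    inserted = False
    for s in list(samples):
        for a in angles:
            p = s + r * np.array([np.cos(a), np.sin(a)])
            if admissible(p, samples):
                samples.append(p)
                inserted = True
print(f"{len(samples)} samples after gap filling")
```

Each candidate test is a one-dimensional search over circle angles rather than a two-dimensional search over the square, which is the reduction the abstract describes; the paper parallelizes the detection of the uncovered regions.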
We present a parallel algorithm for computing the Voronoi diagram of a set of spheres S in R^3. The spheres have varying radii and do not intersect. We compute each Voronoi cell independently using a two-stage iterative procedure, assuming the input spheres are in general position. In the first stage, an initial Voronoi cell for a sphere s_i is computed using an iterative lower-envelope approach restricted to a subset of spheres L_i ⊆ S. This helps to avoid defining the bisectors between all pairs of input spheres and to develop a distributed-memory parallel algorithm. We use the Delaunay graph of sample points from the input spheres to select the subset L_i for computing each Voronoi cell. In the second stage, the Voronoi cells obtained from the first stage are matched to update the subsets. If additional spheres are added to a subset L_i, the correctness of the computed vertices is verified against the bisectors of the spheres newly added to L_i. Results demonstrate the robustness and speed of the algorithm in handling large sets of spheres.
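Voronoi diagrams of non-intersecting spheres are defined by the additively weighted distance d(x, s_i) = |x - c_i| - r_i. A minimal sketch of the underlying nearest-sphere predicate, with synthetic spheres, is:

```python
import numpy as np

# Three non-intersecting spheres with varying radii (synthetic data).
centers = np.array([[0.0, 0.0, 0.0],
                    [4.0, 0.0, 0.0],
                    [0.0, 5.0, 0.0]])
radii = np.array([1.0, 0.5, 1.5])

def nearest_sphere(x):
    """Index of the sphere minimizing the additively weighted
    distance d(x, s_i) = |x - c_i| - r_i; the Voronoi cell of s_i is
    the region where s_i wins this comparison."""
    d = np.linalg.norm(centers - x, axis=1) - radii
    return int(np.argmin(d))

print(nearest_sphere(np.array([1.5, 0.0, 0.0])))
```

Because the radii enter the distance, the bisector between two spheres is a sheet of a hyperboloid rather than a plane, which is why the paper restricts bisector construction to the small subsets L_i.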
Seismic interferometry is a technique for extracting deterministic signals (i.e., ambient-noise Green's functions) from recordings of ambient-noise wavefields through cross-correlation and other related signal processing techniques. The extracted ambient-noise Green's functions can be used in ambient noise tomography for constructing seismic structure models of the Earth's interior. The amount of calculations involved in the seismic interferometry procedure can be significant, especially for ambient noise datasets collected by large seismic sensor arrays (i.e., "large-N" data). We present an efficient parallel algorithm, named pSIN (parallel Seismic INterferometry), for solving seismic interferometry problems on conventional distributed-memory computer clusters. The design of the algorithm is based on a two-dimensional partition of the ambient-noise data recorded by a seismic sensor array. We pay special attention to the balance of the computational load, inter-process communication overhead and memory usage across all MPI processes and we minimize the total number of I/O operations. We have tested the algorithm using a real ambient-noise dataset and obtained a significant amount of savings in processing time. Scaling tests have shown excellent strong scalability from 80 cores to over 2000 cores. (C) 2016 Elsevier Ltd. All rights reserved.
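The core cross-correlation step can be sketched with two synthetic noise traces: the peak of their cross-correlation recovers the relative delay, which is the essence of extracting an empirical Green's function. The sampling rate and delay below are invented for the demo.

```python
import numpy as np

# Two stations record the same random wavefield, station B lagging
# station A by a fixed number of samples.
rng = np.random.default_rng(4)
fs = 100                       # samples per second (illustrative)
delay = 25                     # B lags A by 25 samples (0.25 s)
n = 20000

noise = rng.normal(size=n + delay)
a = noise[delay:]              # station A sees the wavefield first
b = noise[:n]                  # station B, delayed copy

# frequency-domain cross-correlation (circular; fine for this demo)
spec = np.fft.rfft(b) * np.conj(np.fft.rfft(a))
ccf = np.fft.irfft(spec, n)
lag = int(np.argmax(ccf))      # lag of B relative to A
print(f"peak at lag {lag} samples = {lag / fs:.2f} s")
```

A production code like pSIN repeats this for every station pair over long, windowed records, which is why the 2D data partition and I/O balance matter.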
An efficient L0-stable parallel algorithm is developed for the two-dimensional diffusion equation with non-local time-dependent boundary conditions. The algorithm is based on subdiagonal Padé approximation to the matrix exponentials arising from the use of the method of lines and may be implemented on a parallel architecture using two processors running concurrently, with each processor employing tridiagonal solvers at every time-step. The algorithm is tested on two model problems from the literature for which discontinuities between initial and boundary conditions exist. The CPU times together with the associated error estimates are compared.
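The "Padé approximation plus tridiagonal solver at every time-step" structure can be illustrated in 1D with the simplest L-stable subdiagonal Padé approximant to e^z, namely 1/(1 - z) (backward Euler). This is only a structural sketch: the paper's scheme is a higher-order L0-stable variant in 2D with non-local boundary conditions.

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system; a = sub-, b = main, c = super-diagonal."""
    n = len(b)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Method of lines for u_t = u_xx on (0,1) with homogeneous Dirichlet
# boundaries: semi-discretize to u' = A u, then replace exp(k A) by
# the (0,1) Pade approximant, giving (I - kA) u_new = u_old.
m, steps = 49, 200
h = 1.0 / (m + 1)
k = 1e-4                                   # time step
x = np.linspace(h, 1 - h, m)
u = np.sin(np.pi * x)                      # exact solution decays as exp(-pi^2 t)

r = k / h**2
sub = np.full(m, -r); main = np.full(m, 1 + 2 * r); sup = np.full(m, -r)
sub[0] = sup[-1] = 0.0
for _ in range(steps):                     # one tridiagonal solve per time-step
    u = thomas(sub, main, sup, u)

exact = np.exp(-np.pi**2 * k * steps) * np.sin(np.pi * x)
print(f"max error: {np.abs(u - exact).max():.2e}")
```

Higher-order subdiagonal Padé schemes factor the rational approximant into linear terms, and the paper assigns those independent tridiagonal solves to the two processors.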