检索结果-内蒙古大学图书馆

A spatial decomposition parallel algorithm for a concurrent atomistic-continuum simulator and its preliminary applications

引用

COMPUTATIONAL MATERIALS SCIENCE 2018年 144卷 1-10页

作者： Chen, Hao Xu, Shuozhi Li, Weixuan Ji, Rigelesaiyin Thanh Phan Xiong, Liming Iowa State Univ Dept Aerosp Engn Ames IA 50010 USA Univ Calif Santa Barbara Calif NanoSyst Inst Santa Barbara CA 93106 USA Univ Florida Dept Mech & Aerosp Engn Gainesville FL 32608 USA

This paper presents the development of a spatial decomposition parallel algorithm and its implementation into a concurrent atomistic-continuum (CAC) method simulator for multiscale modeling of dislocations in metallic materials. The scalability and parallel efficiency of the parallelized CAC are tested using up to 512 processors. With a modest computational resource, a single crystalline f.c.c. sample containing 10: 6 billion atoms is modeled using only 4;809;108 finite elements in a CAC model at a fraction of the cost of full molecular dynamics (MD). The simulation demonstrates a nearly ideal scalability of the newly parallelized CAC simulator. The parallel efficiency of the newly parallelized CAC is shown to be higher than 90% when using 512 processors in the high performance computing cluster at Iowa State University. This parallel efficiency is comparable to the state-of-the-art atomistic simulator. Moreover, the newly parallelized CAC simulator employing a uniform coarse mesh is capable of capturing important atomistic features of dislocations, including dislocation nucleation, migration, stacking faults as well as the formation of Lomer-Cottrell locks, in a billion-atom system. The spatial decomposition-based parallelization algorithm developed in this work is general and can be transferable to many other existing concurrent multiscale simulation tools. Published by Elsevier B.V.

关键词： Multiscale Concurrent atomic continuum simulation parallel algorithm Efficiency

来源：评论

学校读者我要写书评

暂无评论

A new constant-time parallel algorithm for merging

引用

JOURNAL OF SUPERCOMPUTING 2019年第2期75卷 968-983页

作者： Bahig, Hazem M. Ain Shams Univ Fac Sci Dept Math Div Comp Sci Cairo 11566 Egypt Hail Univ Comp Sci & Engn Coll Hail Saudi Arabia

Merging is the process of constructing a sorted array C from two sorted arrays A and B of lengths nA and nB, respectively, such that array C contains the elements of arrays A and B. The problem is a fundamental subroutine in many applications such as tree reconstruction, database design, and sorting. In this paper, we present a constant-time integer merging algorithm on a shared memory model with a concurrent read operation. No constant-time algorithm for integer merging on this model was designed previously. The algorithm, which is based on cross-rank and address strategy, works when the elements of the inputs belong to the integer domain. The presented algorithm has the following advantages:- (1) running in constant time;(2) stable;(3) simple;(4) optimal when the domain of integers is less than or equal to the size of the inputs and the number of processors is equal to size of the inputs;and (5) deterministic.

关键词： Integer merging parallel algorithm Shared memory Concurrent read Constant-time algorithm

来源：评论

学校读者我要写书评

暂无评论

A FAST AND SIMPLE RANDOMIZED parallel algorithm FOR MAXIMAL MATCHING

引用

INFORMATION PROCESSING LETTERS 1986年第2期22卷 77-80页

作者： ISRAELI, A ITAI, A TECHNION ISRAEL INST TECHNOL DEPT COMP SCIIL-32000 HAIFAISRAEL

A simple parallel randomized algorithm to find a maximal matching is developed. It is randomized in that it uses randomly chosen numbers. Its characteristics and efficiency do not depend upon the assumption that the input graph is random. A basic operation of the algorithm is a random choice operation (RCO) involving a nonzero boolean vector. The expected time complexity of the algorithm is proven, leading to a trivial implementation of the algorithm on an EREW-PRAM. It is then shown how to perform RCO in the expected time on a CRCW-PRAM, which yields the promised complexity. The expected time is independent of the input graph's structure; this improves the best known deterministic algorithm.

关键词： Maximal matching parallel algorithm randomized algorithm

来源：评论

学校读者我要写书评

暂无评论

A TIME-OPTIMAL parallel algorithm FOR 3-DIMENSIONAL CONVEX HULLS

引用

algorithmICA 1995年第2期14卷 169-182页

作者： AMATO, NM PREPARATA, FP BROWN UNIV DEPT COMP SCIPROVIDENCERI 02912

In this paper we present an O(1/alpha log n)-time parallel algorithm for computing the convex hull of n points in R(3). This algorithm uses O(n(1+alpha)) processors on a CREW PRAM, for any constant 0parallel algorithms proposed for this problem use time at least O(log(2) n). In addition, the algorithm presented here is the first parallel algorithm for the three-dimensional convex hull problem that is not based on the serial divide-and-conquer algorithm of Preparata and Hong, whose crucial operation is the merging of the convex hulls of two linearly separated point sets. The contributions of this paper are therefore (i) an O(log n)-time parallel algorithm for the three-dimensional convex hull problem, and (ii) a parallel algorithm for this problem that does not follow the traditional paradigm.

关键词： CONVEX HULL 3-DIMENSIONAL parallel algorithm PRAM TIME-OPTIMAL DETERMINISTIC

来源：评论

学校读者我要写书评

暂无评论

Hybrid swarm intelligent parallel algorithm research based on multi-core clusters

引用

MICROPROCESSORS AND MICROSYSTEMS 2016年第PartA期47卷 151-160页

作者： Li, Wenjing Bi, Yingzhou Zhu, Xiaofeng Yuan, Chang-an Zhang, Xiang-bo Guangxi Teachers Educ Univ Sci Comp & Intelligent Informat Proc GuangXi High Nanning 530023 Guangxi Peoples R China Guangxi Teachers Educ Univ Coll Comp & Informat Engn Nanning 530023 Guangxi Peoples R China Univ N Carolina Dept Radiol Chapel Hill NC USA Univ N Carolina BRIC Chapel Hill NC USA

In order to solve poor fine searching capacity of artificial fish swarm algorithm and artificial bee colony swarm algorithm in late state to result in insufficient local optimization, hybrid swarm intelligent parallel algorithm research based on multi-core clusters is proposed;Then, reverse learning mechanism is introduced in early stage of algorithm, initialized swarms are evenly distributed, and swarms are randomly divided into two groups to make interactive learning strategy accelerates rate of convergence, and basic artificial fish swarm algorithm and artificial bee colony swarm algorithm are used to make global searching. In late stage of algorithm, niches artificial fish swarm algorithm and Random Perturbation Artificial Bee Colony are used to make local fine searching to the solution obtained in early stage;On this basis, MPI+OpenMP+STM parallel programming model based on multi-core clusters is established for parallel design and analysis. Finally, stimulation experiment indicates optimizing efficiency of this algorithm is higher than single artificial fish swarm algorithm and artificial bee colony swarm algorithm. (C) 2016 Elsevier B.V. All rights reserved.

关键词： Artificial fish swarm algorithm Artificial bee colony swarm algorithm Optimization Hybrid swarm intelligent algorithm parallel algorithm

来源：评论

学校读者我要写书评

暂无评论

Highly efficient parallel algorithm for finite difference solution to Navier-Stoke's equation on a hypercube

引用

APPLIED MATHEMATICS AND COMPUTATION 2002年第2-3期130卷 311-316页

作者： Katti, CP Srivastava, DK Sivaloganathan, S Univ Waterloo Dept Appl Math Waterloo ON N2L 3G1 Canada Jawaharlal Nehru Univ Sch Comp & Syst Sci New Delhi 110067 India

It has been shown in [Nuclear Science and Engineering 93 (1986) 6799] that the finite difference discretization of Navier-Stoke's equation leads to the solution of N x N system written in the matrix form as My = B, where M is a quasi-tridiagonal having non-zero elements at the top right and bottom left corners. We present an efficient parallel algorithm on a p-processor hypercube implemented in two phases. In phase I a generalization of an algorithm due to Kowalik [High Speed Computation, Springer, New York] is developed which decomposes the above matrix system into smaller quasi-tridiagonal (p + 1) x (p + 1) subsystem, which is then solved in Phase II using an odd-even reduction method. (C) 2002 Elsevier Science Inc. All rights reserved.

关键词： parallel algorithm Navier-Stokes equation hypercube finite difference

来源：评论

学校读者我要写书评

暂无评论

An efficient parallel algorithm for the coupling of global climate models and regional climate models on a large-scale multi-core cluster

引用

JOURNAL OF SUPERCOMPUTING 2018年第8期74卷 3999-4018页

作者： Wang, Yuzhu Jiang, Jinrong Zhang, Junqiang He, Juanxiong Zhang, He Chi, Xuebin Yue, Tianxiang China Univ Geosci Sch Informat Engn Beijing 100083 Peoples R China China Univ Geosci Hubei Key Lab Intelligent Geoinformat Proc Wuhan 430074 Hubei Peoples R China Chinese Acad Sci Comp Network Informat Ctr Beijing 100190 Peoples R China State Key Lab Math Engn & Adv Comp Wuxi 214125 Peoples R China China Univ Geosci Sch Comp Sci Wuhan 430074 Hubei Peoples R China Chinese Acad Sci Inst Atmospher Phys Beijing 100029 Peoples R China Chinese Acad Sci Inst Geog Sci & Nat Resources Res Beijing 100101 Peoples R China

High-performance computing for climate models has always been an interesting research area. It is valuable to nest a regional climate model within a global climate model, but large-scale simulation of the nesting or coupling severely challenges to the development of efficient parallel algorithms that fit well into multi-core clusters. This paper first presents research on the coupling of the Institute of Atmospheric Physics of Chinese Academy of Sciences Atmospheric General Circulation Model version 4.0 and the Weather Research and Forecasting model, then proposes an efficient parallel algorithm of the coupling. The algorithm includes initialization of input data, decomposition of computing grid and processes, parallel computing of component models, and data exchange by a coupler. By calling some subroutines of the Model Coupling Toolkit, the parallelization of the proposed algorithm is implemented. Experiments show that the parallel algorithm is very effective and scalable. The parallel efficiency of the algorithm on 1,024 CPU cores can reach up to 70%. Moreover, its parallel efficiency with respect to weak scalability is 72.56% on a multi-core cluster.

关键词： High-performance computing parallel algorithm Scalability Coupler Earth system model

来源：评论

学校读者我要写书评

暂无评论

A parallel algorithm of Image Mean Filtering Based on OpenCL

引用

IEEE ACCESS 2021年 9卷 65001-65016页

作者： Xiao, Han Guo, Baoyun Zhang, Hongyan Li, Cailin Zhengzhou Normal Univ Sch Informat Sci & Technol Zhengzhou 450044 Peoples R China Shandong Univ Technol Sch Civil & Architectural Engn Zibo 255000 Peoples R China

The image will be contaminated by noise during the imaging process, which severely degrades the image quality. It is necessary to filter the collected image. With the increasing amount of image data, the traditional single-processor or multiprocessor computing equipment has been unable to meet the requirements of real-time data processing. In this paper, the computational model of weighted mean filtering and the characteristics of high performance computer architecture are studied. An efficient hierarchical image weighted mean filtering parallel algorithm for Open Computing Language (OpenCL) is designed and implemented, which can fully express the parallelism of the computing model. The parallel algorithm takes full account of the characteristics of image discrete convolution computing and the multi-layer logic architecture of high performance computer, deeply excavates the parallelism of the computing platform and computing model, and realizes the efficient task mapping from computing model to computing resources. The model is implemented in parallel with the two levels of work-group and work-item. The experimental results show that compared with the serial algorithm based on CPU, the parallel algorithm based on Open Multi-Processing (OpenMP) and the parallel algorithm based on Compute Unified Device Architecture (CUDA), the parallel algorithm of weighted mean filtering achieves 20.88 times, 18.52 times and 1.26 times acceleration ratio on the NVIDIA GPU computing platform based on OpenCL architecture, respectively. It realizes better computing performance and runs on different Graphic Processing Unit (GPU) computing platforms, and has good portability and scalability.

关键词： Graphics processing units Filtering parallel algorithms Filtering algorithms Computational modeling Acceleration Signal processing algorithms Weighted mean filtering Gaussian noise Graphic Processing Unit (GPU) Open Computing Language (OpenCL) parallel algorithm

来源：评论

学校读者我要写书评

暂无评论

AN EFFICIENT parallel algorithm FOR THE SOLUTION OF BLOCK TRIDIAGONAL LINEAR-SYSTEMS

引用

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS 1995年第3期E78D卷 256-262页

作者： NARITOMI, T ASO, H Tohoku Univ Sendai-shi Japan

A parallel overlapping preconditioner is applied to ICCG method and the effect of the parallel preconditioning on the convergence of the method is investigated by solving large scale block tridiagonal linear systems arising from the discretization of Poisson's equation. Compared with the original ICCG method, the parallel preconditioned ICCG method can solve the problems in high parallelism with slight increasing the number of iterations. Furthermore, the speedup and the efficiency are evaluated for the parallel preconditioned ICCG method by substituting the experimental results into formulae of complexity. For example, when a domain of simulation is discretized on a 250 x 250 rectangular grid and the preconditioner is divided into 249 smaller ones, its speedup is 146.3 with the efficiency 0.59.

关键词： parallel algorithm TRIDIAGONAL LINEAR SYSTEM parallel PRECONDITIONING CONJUGATE GRADIENT METHOD

来源：评论

学校读者我要写书评

暂无评论

Convergence of a new parallel algorithm for the Navier-Stokes equations

引用

NONLINEAR ANALYSIS-REAL WORLD APPLICATIONS 2009年第1期10卷 23-41页

作者： Feng, Xinlong He, Yinnian Xi An Jiao Tong Univ Fac Sci Xian 710049 Shanxi Peoples R China Xinjiang Univ Coll Math & Syst Sci Urumqi 830046 Peoples R China

This paper deals with the convergence and stability of a new parallel algorithm and the error estimates for a particular case of the new parallel algorithm, which is used to solve the incompressible nonstationary Navier-Stokes equations. The theoretical results show that the scheme is (at least) conditionally stable and convergent. (c) 2007 Elsevier Ltd. All rights reserved.

关键词： Navier-Stokes equations parallel algorithm fractional step methods error estimation

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：