Search results - Inner Mongolia University Library

A PARALLEL AUXILIARY GRID ALGEBRAIC MULTIGRID METHOD FOR GRAPHIC PROCESSING UNITS

SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2013, Vol. 35, No. 3, pp. C263-C283

Authors: Wang, Lu; Hu, Xiaozhe; Cohen, Jonathan; Xu, Jinchao. Affiliations: Penn State Univ, Dept Math, University Pk, PA 16802, USA; Penn State Univ, Ctr Computat Math & Applicat, University Pk, PA 16802, USA; NVIDIA Res, Ann Arbor, MI 48105, USA

In this paper, we develop a new parallel auxiliary grid algebraic multigrid (AMG) method to leverage the power of graphic processing units (GPUs). In the construction of the hierarchical coarse grid, we use a simple and fixed coarsening procedure based on a region quadtree generated from an auxiliary grid. This allows us to explicitly control the sparsity patterns and operator complexities of the AMG solver. This feature provides (nearly) optimal load balancing and predictable communication patterns on shape regular grids, which makes our new algorithm suitable for parallel computing, especially on GPUs. We also design a parallel smoother based on the special coloring of the quadtree to accelerate the convergence rate and improve the parallel performance of this solver. Based on the CUDA toolkit [NVIDIA CUDA Programming Guide, NVIDIA Corp., 2010], we implemented our new parallel auxiliary grid AMG method on GPUs and the numerical results of this implementation demonstrate the efficiency of our new method for (nearly) isotropic problems. The results achieve an average speedup of over 4 on quasi-uniform grids and 2 on shape regular grids when compared to the AMG implementation in CUSP [M. Garland and N. Bell, CUSP: Generic parallel algorithms for Sparse Matrix and Graph Computations, http://***/(2010)].

Keywords: algebraic multigrid method; aggregation; nonlinear AMLI-cycle; auxiliary grid; GPU; parallel algorithm
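
The abstract above mentions a parallel smoother built on a coloring of the quadtree. The paper's coloring is not reproduced here; as a rough illustration of why coloring helps, the sketch below uses the classic two-color (red-black) Gauss-Seidel smoother for the 5-point Poisson stencil on a uniform square grid. The function names and the model problem are assumptions made for illustration only.

```python
def red_black_gauss_seidel(u, f, h, sweeps=1):
    """Smooth -Laplace(u) = f on a uniform square grid (illustrative stand-in).

    u holds the current iterate including fixed Dirichlet boundary values,
    f holds the right-hand side, h is the mesh width.
    """
    n = len(u)
    for _ in range(sweeps):
        # Points of one color only read neighbors of the other color, so every
        # update within a color is independent and could run in parallel.
        for color in (0, 1):
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 == color:
                        u[i][j] = 0.25 * (
                            u[i - 1][j] + u[i + 1][j]
                            + u[i][j - 1] + u[i][j + 1]
                            + h * h * f[i][j]
                        )
    return u


def residual_norm(u, f, h):
    """Maximum absolute residual of the 5-point stencil over interior points."""
    n = len(u)
    worst = 0.0
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            stencil = (4.0 * u[i][j] - u[i - 1][j] - u[i + 1][j]
                       - u[i][j - 1] - u[i][j + 1]) / (h * h)
            worst = max(worst, abs(f[i][j] - stencil))
    return worst
```

On a GPU, each color would typically map to one kernel launch; the serial loops here only show the data dependence pattern.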

A Study on Parallel Computation for 3D Magneto-Telluric Modeling Using the Staggered-Grid Finite Difference Method

Chinese Journal of Geophysics, 2013, Vol. 56, No. 3, pp. 287-295

Authors: LI Yan; HU Xiang-Yun; YANG Wen-Cai; WEI Wen-Bo; FANG Hui; HAN Bo; PENG Rong-Hua. Affiliations: Institute of Geophysics and Geomatics, China University of Geosciences, Wuhan 430074, China; China Aero Geophysical Survey and Remote Sensing Center for Land and Resources, Beijing 100083, China; Chinese Academy of Geological Sciences, Beijing 100037, China; School of Geophysics and Information Technology, China University of Geosciences, Beijing 100083, China

Computation time and memory requirements are two common problems for magnetotelluric (MT) modeling of three-dimensional conductivity structure. We develop a new parallel processing scheme that can efficiently improve the computational speed of 3D MT modeling. The scheme of 3D MT modeling based on the staggered-grid finite difference method is implemented in the frequency domain, and the calculation process of the EM field for each frequency is independent. Therefore, considering the naturally parallelizable character, the whole computation task of all frequencies can be divided into many minor calculation tasks for single or multiple frequencies, which will be assigned to different computing nodes and calculated in a parallel manner. In this work, by adopting master-slave parallel mode and parallel computation with frequencies scheme, we have implemented the parallel computation of 3D MT modeling using MPI on the Dawn TC5000A high-performance parallel platform. Furthermore, we tested our parallel algorithm of 3D MT modeling using two 3D theoretical models and analyzed the calculation efficiency on a multiple-nodes computer, and the results show that the parallel algorithm is effective and efficient, which lays a solid foundation for subsequent three-dimensional parallel MT inversion.

Keywords: Magnetotelluric; 3D forward modeling; Staggered-grid finite difference; parallel algorithm; MPI
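
The frequency-parallel idea in the abstract above (each frequency is an independent task handed to a worker) can be sketched as follows. This is a minimal illustration, not the authors' MPI code: a thread pool stands in for MPI ranks, and `solve_one_frequency` is a hypothetical placeholder that returns the half-space skin depth instead of running a 3D staggered-grid solve.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def solve_one_frequency(freq_hz, resistivity_ohm_m=100.0):
    """Hypothetical placeholder for a single-frequency forward solve.

    Returns the approximate skin depth in meters, 503 * sqrt(rho / f),
    for a uniform half-space. A real solver would go here.
    """
    return 503.0 * math.sqrt(resistivity_ohm_m / freq_hz)


def run_all_frequencies(freqs, workers=4):
    """Master side: hand each frequency to a worker and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with freqs.
        results = list(pool.map(solve_one_frequency, freqs))
    return dict(zip(freqs, results))
```

Because the tasks share no data, the same structure carries over to a master-slave MPI layout in which the master sends frequencies and gathers fields.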

Data assimilation for large-scale computational models

54th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference

Authors: Khalil, Mohammad; Subber, Waad; Sarkar, Abhijit. Affiliation: Department of Civil and Environmental Engineering, Mackenzie Building, Carleton University, Colonel By Drive, Ottawa, ON K1S 5B6, Canada

Research on the Parallel Algorithm for Self-similar Network Traffic Simulation

2009 2nd IEEE International Conference on Computer Science and Information Technology (ICCSIT 2009)

Authors: ZHANG Huachuan, Institute of Machine Intelligence, Nankai University, Tianjin, China; XU Jing, TIAN Jie, Institute of Machine Intelligence, Nankai University, Tianjin, China

As web applications are used worldwide, system performance, especially reliability, becomes more *** performance testing tools such as QA Load and LoadRunner will generate the stress data with the fixed *** in the real time, network traffic is *** focus on generating test data to simulate network traffic accurately for web application reliability *** statistical results of network traffic show that the property of the self-similarity is ubiquitous in web *** generating self-similar network traffic is *** nowadays, there is a bottleneck in generating network traffic by single *** need a parallel method to solve this *** this paper we propose a distributed system based on a parallel algorithm to generate self-similar traffic using the Fractional Gaussian Noise (FGN) *** experiment results show that the network traffic generated by the distributed system has self-similar property.

Keywords: parallel algorithm; Distributed System; Generating Network Traffic; Self-similar Model
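
For readers unfamiliar with fractional Gaussian noise, the sketch below draws a short FGN sample from its known autocovariance using a plain Cholesky factorization. It is a serial, O(n^3) textbook construction chosen for clarity, not the distributed generator the abstract describes; all function names are assumptions.

```python
import math
import random


def fgn_autocovariance(k, hurst):
    """Autocovariance of unit-variance FGN at lag k for Hurst parameter hurst."""
    k = abs(k)
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_fgn(n, hurst, seed=0):
    """Draw n correlated Gaussian values with the FGN covariance structure."""
    rng = random.Random(seed)
    cov = [[fgn_autocovariance(i - j, hurst) for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Multiplying white noise by the Cholesky factor imposes the covariance.
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
```

A Hurst parameter above 0.5 gives positively correlated increments, which is the long-range dependence associated with self-similar traffic; at 0.5 the lags decorrelate and the sample is ordinary white noise.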

PARALLEL MULTI-PROPOSAL AND MULTI-CHAIN MCMC FOR CALCULATING P-VALUE OF GENOME-WIDE ASSOCIATION STUDY

PARALLEL PROCESSING LETTERS, 2013, Vol. 23, No. 3

Authors: Zhao, Di; Ni, Shenghua. Affiliations: Ohio State Univ, Ctr Cognit & Brain Sci, Dept Psychol, Columbus, OH 43210, USA; Vanderbilt Univ, Inst Med & Publ Hlth, Nashville, TN 37235, USA

In this paper, by the novel idea of integrating multiple-proposal algorithm and multiple-chain algorithm by parallel computing, we develop a highly efficient sampler for approximating statistical distributions: parallel Multi-proposal and Multi-chain Markov Chain Monte Carlo (pMPMC3), and we illustrate the high performance of this sampler by calculating P-value (odds ratio significance) for Genome Wide Association Study (GWAS). Computational results show that, with the convergence condition that the standard deviation of the P-value is less than 10^-3, pMPMC3 with 4 proposals and 4 chains obtains a convergent P-value within 10^6 iterations, while the conventional Monte Carlo simulation method does not obtain convergent P-values even in 10^7 iterations. We also test pMPMC3 by changing the number of chains, the number of proposals and the size of the dataset on a cluster with a maximum of 600 processes; the algorithm scales well.

Keywords: Multi-proposal; Multi-chain; Markov Chain Monte Carlo; parallel algorithm; Genome-Wide Association Studies
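
The multi-chain half of the idea in the abstract above can be illustrated with a small sketch: several independent random-walk Metropolis chains estimate the same tail probability, and the spread across chains serves as a convergence check, in the spirit of the standard-deviation criterion the abstract mentions. This is not pMPMC3 itself (there is no multi-proposal step and the chains run serially here); the target density, step size, and function names are assumptions.

```python
import math
import random


def metropolis_chain(log_density, x0, steps, step_size, seed):
    """One random-walk Metropolis chain; returns the visited states."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_ratio = log_density(proposal) - log_density(x)
        # Accept with probability min(1, density ratio).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


def multi_chain_estimate(indicator, log_density, n_chains=4,
                         steps=5000, burn_in=500):
    """Estimate P(indicator) per chain; return the mean and cross-chain std."""
    estimates = []
    for chain_id in range(n_chains):
        # Chains share nothing, so each could run on its own process.
        kept = metropolis_chain(log_density, 0.0, steps, 1.0,
                                seed=chain_id)[burn_in:]
        estimates.append(sum(1 for s in kept if indicator(s)) / len(kept))
    mean = sum(estimates) / n_chains
    var = sum((e - mean) ** 2 for e in estimates) / (n_chains - 1)
    return mean, math.sqrt(var)
```

A driver could keep extending the chains until the returned standard deviation falls below a chosen threshold.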

Dynamic Pick-up and Delivery Vehicle Routing Problem with Ready-time and Deadline

32nd Chinese Control Conference

Authors: Quan XiongWen; Xu Ya. Affiliations: College of Information Technical Sciences, Nankai University; Department of Logistics Management, TEDA College, Nankai University

ISBN: (print) 9781479900305

In this paper, a dynamic delivery and pick-up vehicle routing problem (DVRP) with ready-time and deadline of customer goods is *** using the rolling horizon approach, the DVRP is modeled and *** each decision epoch, the open vehicle routing problem with multiple depots is *** on the adaptive memory programming, a master-slave parallel tabu search algorithm is developed, and an insertion procedure is also suggested for the real-time urgent orders. Computational experiments reveal that the parallel tabu search algorithm is of high practical value for solving the dynamic vehicle routing problem.

Keywords: Dynamic vehicle routing; Rolling horizon; parallel algorithm; Tabu search

Design and Implementation of Texture Mapping in Parallel

2013 3rd International Conference on Electric and Electronics (EEIC 2013)

Authors: Wen Xu; Jungang Han. Affiliation: College of Computer Science and Technology, Xi'an University of Posts and Telecommunications

Texture mapping is an important part of the realistic graphics rendering process. In this paper we parallelize the algorithm for traditional texture mapping to improve the running speed of the program. The experimenta...

Keywords: computer graphics; parallel algorithm; texture mapping

Start-up flow in a three-dimensional lid-driven cavity by means of a massively parallel direction splitting algorithm

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, 2012, Vol. 68, No. 7, pp. 856-871

Authors: Guermond, J. L.; Minev, P. D. Affiliations: Texas A&M Univ, Dept Math, College Stn, TX 77843, USA; Univ Alberta, Dept Math & Stat Sci, Edmonton, AB T6G 2G1, Canada

The purpose of this paper is to validate a new highly parallelizable direction splitting algorithm. The parallelization capabilities of this algorithm are illustrated by providing a highly accurate solution for the start-up flow in a three-dimensional impulsively started lid-driven cavity of aspect ratio 1x1x2 at Reynolds numbers 1000 and 5000. The computations are done in parallel (up to 1024 processors) on adapted grids of up to 2 billion nodes in three space dimensions. Velocity profiles are given at dimensionless times t=4, 8, and 12; at least four digits are expected to be correct at Re=1000. Copyright (C) 2011 John Wiley & Sons, Ltd.

Keywords: lid-driven cavity; direction splitting; unsteady flow; incompressible flow; parallel algorithm; MAC stencil; three dimensional

A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS

SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2013, Vol. 35, No. 5, pp. C473-C494

Authors: Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T. Affiliation: Univ Utah, Sci Comp & Imaging Inst, Salt Lake City, UT 84112, USA

Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers.

Keywords: Hamilton-Jacobi equation; eikonal equation; tetrahedral mesh; parallel algorithm; shared memory multiple-processor computer system; graphics processing unit

Parallelization of a Bio-inspired Computational Model for the Simulation of 3-D Multicellular Tissue Growth

Procedia Computer Science, 2013, Vol. 20, pp. 391-398

Author: Belgacem Ben Youssef. Affiliation: Department of Computer Engineering, CCIS, King Saud University, Riyadh 11543, Saudi Arabia

The use of parallelism may overcome some of the constraints imposed by single processor computing systems. Besides offering faster solutions, applications that are parallelized can solve bigger or more complex problems. For instance, simulations can be run at finer resolutions while physical phenomena can be potentially modeled more realistically. We describe in this paper the development of a bio-inspired parallel algorithm used in the three-dimensional simulation of multicellular tissue growth. We report on the different components of the model where cellular automata are used to model different types of cell populations that execute persistent random walks on the computational grid, collide, and proliferate until they reach confluence. We also discuss the main issues encountered in the parallelization of the model and its implementation on a parallel machine.

Keywords: Bio-inspired simulation model; parallel algorithm; tissue growth; cellular automata
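
The model components named in the abstract above (cells that walk on a grid, collide, and proliferate until confluence) can be sketched with a small serial cellular automaton. This is a simplified illustration under stated assumptions, not the paper's model: the walk is a plain random walk without directional persistence, there is a single cell type, and the grid size, division period, and function name are all hypothetical.

```python
import random


def simulate_growth(size=5, n_seed_cells=3, division_period=4,
                    max_steps=10_000, seed=0):
    """Grow cells on a size^3 grid until every site is occupied.

    Returns (steps_taken, occupied_sites).
    """
    rng = random.Random(seed)
    directions = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    cells = {}  # grid position -> steps since the cell last divided
    while len(cells) < n_seed_cells:
        cells[tuple(rng.randrange(size) for _ in range(3))] = 0
    total_sites = size ** 3
    for step in range(1, max_steps + 1):
        order = list(cells)  # snapshot: daughters born this step wait a turn
        rng.shuffle(order)
        for pos in order:
            age = cells[pos] + 1
            move = rng.choice(directions)
            target = tuple(p + d for p, d in zip(pos, move))
            inside = all(0 <= c < size for c in target)
            if inside and target not in cells:
                if age >= division_period:
                    # Divide: mother stays, daughter takes the free site.
                    cells[pos] = 0
                    cells[target] = 0
                else:
                    # Migrate into the free neighboring site.
                    del cells[pos]
                    cells[target] = age
            else:
                # Collision with a wall or another cell: stay put.
                cells[pos] = age
        if len(cells) == total_sites:
            return step, len(cells)
    return max_steps, len(cells)
```

A parallel version would typically split the grid into subdomains and exchange boundary layers between steps, which is where the parallelization issues mentioned in the abstract arise.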
