Search Results - Inner Mongolia University Library

Parallelization in the time dimension of four-dimensional variational data assimilation

QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, 2017, Vol. 143, No. 703, pp. 1136-1147

Authors: Fisher, Michael; Guerol, Selime (European Ctr Medium Range Weather Forecasts, Reading, England)

The current evolution of computer architectures towards increasing parallelism requires a corresponding evolution towards more parallel data assimilation algorithms. In this article, we consider parallelization of weak-constraint four-dimensional variational data assimilation (4D-Var) in the time dimension. We categorize algorithms according to whether or not they admit such parallelization and introduce a new, highly parallel weak-constraint 4D-Var algorithm based on a saddle-point representation of the underlying optimization problem. The potential benefits of the new saddle-point formulation are illustrated with a simple two-level quasi-geostrophic model.

Keywords: 4D-Var; data assimilation; parallel algorithms; saddle-point methods

Data Flow Algorithms for Processors with Vector Extensions

JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2017, Vol. 87, No. 1, pp. 21-31

Authors: Barford, Lee; Bhattacharyya, Shuvra S.; Liu, Yanzhou (Keysight Technol Inc, Keysight Labs, 561 Keystone Ave, Unit 434, Reno, NV 89503, USA; Univ Maryland, College Pk, MD 20742, USA; Tampere Univ Technol, Tampere, Finland)

Full use of the parallel computation capabilities of present and expected CPUs and GPUs requires use of vector extensions. Yet many actors in data flow systems for digital signal processing have internal state (or, equivalently, an edge that loops from the actor back to itself) that imposes serial dependencies between actor invocations, making vectorization across actor invocations impossible. Ideally, issues of inter-thread coordination required by serial data dependencies should be handled by code written by parallel programming experts that is separate from code specifying signal processing operations. The purpose of this paper is to present one approach for doing so in the case of actors that maintain state. We propose a methodology for using the parallel scan (also known as prefix sum) pattern to create algorithms for multiple simultaneous invocations of such an actor that result in vectorizable code. Two examples of applying this methodology are given: (1) infinite impulse response (IIR) filters and (2) finite state machines (FSMs). The correctness and performance of the resulting IIR filters and one class of FSMs are studied.

Keywords: Digital signal processing; Data flow computing; Vector processors; parallel algorithms; Graphics processing units
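
The scan pattern named in the abstract above can be illustrated on the simplest stateful actor, a first-order IIR filter y[n] = a*y[n-1] + x[n]. The sketch below is not the authors' code: the function names (`iir_serial`, `combine`, `iir_scan`) and the choice of a Hillis-Steele style inclusive scan are assumptions made for illustration. Each sample is treated as an affine map y -> a*y + x[n]; composing affine maps is associative, so all prefixes can be built in about log2(N) rounds, and every update within a round is independent of the others, which is what makes the rounds vectorizable.

```python
def iir_serial(x, a, y0=0.0):
    """Reference: y[n] = a*y[n-1] + x[n], one sample at a time."""
    y, prev = [], y0
    for xn in x:
        prev = a * prev + xn
        y.append(prev)
    return y


def combine(earlier, later):
    """Compose two affine maps y -> A*y + B; `earlier` is applied first."""
    a1, b1 = earlier
    a2, b2 = later
    return (a2 * a1, a2 * b1 + b2)


def iir_scan(x, a, y0=0.0):
    """Same filter via an inclusive scan over affine maps.

    After the round with a given `shift`, maps[i] is the composition of
    the last 2*shift per-sample maps ending at i (or all of them, if fewer).
    """
    maps = [(a, xn) for xn in x]          # one affine map per input sample
    shift = 1
    while shift < len(maps):
        # All elements of this round are computed from the previous round
        # only, so the loop body has no serial dependency.
        maps = [
            combine(maps[i - shift], maps[i]) if i >= shift else maps[i]
            for i in range(len(maps))
        ]
        shift *= 2
    # Apply each accumulated map to the initial state.
    return [A * y0 + B for A, B in maps]
```

Calling `iir_scan(x, a)` and `iir_serial(x, a)` on the same input should agree up to floating-point rounding. The scan version does more arithmetic in total (on the order of N log N operations) and trades that for the absence of a sample-to-sample dependency.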

High-order numerical schemes based on difference potentials for 2D elliptic problems with material interfaces

APPLIED NUMERICAL MATHEMATICS, 2017, Vol. 111, pp. 64-91

Authors: Albright, Jason; Epshteyn, Yekaterina; Medvinsky, Michael; Xia, Qing (Univ Utah, Dept Math, Salt Lake City, UT 84112, USA; North Carolina State Univ, Raleigh, NC 27695, USA)

Numerical approximations and computational modeling of problems from Biology and Materials Science often deal with partial differential equations with varying coefficients and domains with irregular geometry. The challenge here is to design an efficient and accurate numerical method that can resolve properties of solutions in different domains/subdomains, while handling the arbitrary geometries of the domains. In this work, we consider 2D elliptic models with material interfaces and develop efficient high-order accurate methods based on Difference Potentials for such problems. (C) 2016 IMACS. Published by Elsevier B.V. All rights reserved.

Keywords: Boundary value problems; Piecewise-constant coefficients; High-order accuracy; Difference potentials; Boundary projections; Interface problems; Non-matching grids; Mixed-order; parallel algorithms; Application to the simulation of the biological cell electropermeabilization model

Highly scalable implementation of an implicit matrix-free solver for gas dynamics on GPU-accelerated clusters

JOURNAL OF SUPERCOMPUTING, 2017, Vol. 73, No. 2, pp. 631-638

Authors: Menshov, Igor; Pavlukhin, Pavel (Keldysh Inst Appl Math, Moscow 125047, Russia; Res & Dev Inst Kvant, Moscow 125438, Russia)

A numerical approach for solving gas dynamics on Cartesian grids is considered which employs an implicit time marching scheme with the matrix-free Lower-Upper Symmetric Gauss-Seidel (LU-SGS) method for solving discrete equations. Boundary conditions are treated with an embedded-boundary method. The method has two attractive features: (1) algorithmic uniformity of calculations and (2) structured memory accesses that fit massively parallel architectures with GPU accelerators well. We propose a novel CUDA+MPI computational algorithm scalable up to hundreds of GPUs and give an in-depth analysis of its implementation (interoperability issues, library tuning).

Keywords: CFD; LU-SGS; parallel algorithms; CUDA; MPI

Primal-Dual Parallel Algorithm for Optimal Content Delivery in Cloud CDNs

8th IEEE International Conference on Computational Intelligence and Computing Research, ICCIC 2017

Authors: Mahesh, Gadiraju; R Maheswara Rao, V.V.; Shankar, R Shiva; G Sirisha, Gn V (Dept. of C.S.E., S.R.K.R. Engineering College, Bhimavaram, A.P., India)

ISBN: (print) 9781509066209

Content delivery networks have been providing content delivery services for the last two decades using their own infrastructure. Nowadays, content delivery networks have the better option of using storage cloud sites as edge servers. The problems of replicating the content required by the users on optimal sites in the cloud and assigning the sites to users are considered in this work. Given a set of current user requests and cloud sites potential to the user, the combined problem of finding the optimal sites for content placement and content dissemination is a set-cover problem. Previous works solved this problem using a greedy algorithm. A primal-dual parallel algorithm for optimal content delivery in cloud content delivery networks is proposed in this work. The proposed algorithm is an efficient parallel algorithm that requires only local information. The primal-dual algorithm takes less time than the greedy algorithm, as the experimental results demonstrate. © 2017 IEEE.

Keywords: parallel algorithms
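
For readers unfamiliar with the primal-dual approach to set cover mentioned in the abstract above, the following is a minimal sketch of the textbook sequential primal-dual scheme for weighted set cover. It is not the parallel algorithm of the paper, whose details the abstract does not give. The function name `primal_dual_set_cover`, the mapping of cloud sites to sets and user requests to elements, and the example data are all assumptions made for illustration.

```python
def primal_dual_set_cover(universe, sets, cost):
    """Textbook primal-dual heuristic for weighted set cover.

    universe: iterable of elements (here: hypothetical user requests)
    sets:     dict mapping a set name (here: a hypothetical cloud site)
              to the set of elements it can serve
    cost:     dict mapping a set name to a positive cost
    Returns the list of chosen set names, in the order they became tight.
    """
    slack = dict(cost)      # cost minus the dual weight already charged
    chosen, covered = [], set()
    for e in universe:
        if e in covered:
            continue
        containing = [s for s in sets if e in sets[s]]
        if not containing:
            raise ValueError(f"element {e!r} cannot be covered")
        # Raise the dual variable of e until some constraint becomes tight.
        delta = min(slack[s] for s in containing)
        for s in containing:
            slack[s] -= delta
        # A set whose slack reached zero is "paid for"; add it to the cover.
        tight = min(containing, key=lambda s: slack[s])
        chosen.append(tight)
        covered |= sets[tight]
    return chosen


if __name__ == "__main__":
    requests = [1, 2, 3, 4, 5]
    sites = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
    site_cost = {"A": 3, "B": 1, "C": 1, "D": 2}
    print(primal_dual_set_cover(requests, sites, site_cost))
```

Each step looks only at the sets containing one uncovered element, which is consistent with the abstract's remark that only local information is required. How the paper distributes this work across processors is not stated in the abstract.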

GPU-based HEVC intra-prediction module

JOURNAL OF SUPERCOMPUTING, 2017, Vol. 73, No. 1, pp. 455-468

Authors: Galiano, V.; Migallon, H.; Herranz, V.; Pinol, P.; Lopez-Granado, O.; Malumbres, M. P. (Miguel Hernandez Univ, Phys & Comp Architecture Dept, Elche 03202, Spain; Miguel Hernandez Univ, Ctr Operat Res, Elche 03202, Spain)

The HEVC video coding standard requires nearly 70% more time than H.264/AVC to encode a video sequence. Manycore architectures can considerably help to reduce the coding time. In this paper, we propose the use of GPUs to perform the intra-picture prediction without any R/D loss. We have evaluated our proposal and compared the results with the ones obtained when running on a CPU. The results show that a time reduction of up to 85% can be obtained without any R/D loss.

Keywords: parallel algorithms; Video coding; HEVC; GPUs; Performance; Intra-prediction; Manycore

New results on an improved parallel EM algorithm for estimating generalized latent variable models

81st annual meeting of the Psychometric Society, 2016

Authors: von Davier, Matthias (National Board of Medical Examiners, 3750 Market Street, Philadelphia, PA 19104-3102, United States)

ISBN: (print) 9783319562933

The second generation of a parallel algorithm for generalized latent variable models, including MIRT models and extensions, on the basis of the general diagnostic model (GDM) is presented. This new development further improves the performance of the parallel-E parallel-M algorithm presented in an earlier report by means of additional computational improvements that produce even larger gains in performance. The additional gain achieved by this second-generation parallel algorithm reaches a factor of 20 for several of the examples reported with a sixfold gain based on the first generation. The estimation of a multidimensional IRT model for large-scale data may show a larger reduction in runtime compared to a multiple-group model which has a structure that is more conducive to parallel processing of the E-step. Multiple population models can be arranged such that the parallelism directly exploits the ability to estimate multiple latent variable distributions separately in independent threads of the algorithm. © Springer International Publishing AG 2017.

Keywords: parallel algorithms

About one parallel algorithm of solving non-local contact problem for parabolic equations

11th International Conference on Computer Science and Information Technologies, CSIT 2017

Authors: Davitashvili, Tinatin; Meladze, Hamlet; Skhirtladze, Nugzar (Faculty of Exact and Natural Sciences, IV. Javakhishvili Tbilisi State University, Tbilisi, Georgia; N. Muskhelishvili Institute of Computational Mathematics, GTU; St. Andrew the First Called Georgian University, Tbilisi, Georgia; Caucasus University, Tbilisi, Georgia)

ISBN: (print) 9781538628317; 9781538628300

In the present work, the initial-boundary problem with a non-local contact condition for the heat (diffusion) equation is considered. For the stated problem, the existence and uniqueness of the solution are proved. The constructed iteration process allows one to reduce the solution of the initial non-classical problem to the solution of a sequence of classical Cauchy-Dirichlet problems. The convergence of the proposed iterative process is proved, and the speed of convergence is estimated. The algorithm is suitable for parallel implementation. A specific problem is considered as an example and solved numerically. © 2017 IEEE.

Keywords: parallel algorithms

Novel Graphics Processing Unit-Based Parallel Algorithms for Understanding Species Diversity in Forests

Simulation Multiconference

Authors: Michael Keenan; Ivan Komarov; Roshan M. D'Souza; Rick Riolo (Complex Systems Simulation Lab, University of Wisconsin-Milwaukee; Center for the Study of Complex Systems, University of Michigan)

ISBN: (print) 9781618397881

The mechanisms which lead to high tree species diversity in forests are not yet fully understood. One of the leading theories is that the natural enemies' interaction can give rise to a survival advantage for rare tree species over more common species. One way of exploring such observations is through the use of individual-based modeling. An individual-based model (IBM) is a bottom-up simulation where the bulk dynamics emerge from the interaction of individual constituents. Due to their emergent nature, IBMs are population sensitive where achieving a high degree of accuracy is synonymous with matching system population sizes. Consequently, such models may run into the millions of individuals and become computationally intensive. Here the computing power of graphics processing units (GPUs) is used to overcome this computation limitation. The algorithms developed here for GPUs allow this model to be scaled into the millions of individuals and run on standard desktop computers. This effectively puts supercomputing power at the fingertips of researchers, students, and forest management services alike. The parallel implementation developed here was compared against a serial implementation running on the central processing unit. The results show a significant performance gain for the parallel implementation while maintaining statistical accuracy. This shows that realistically sized models can be efficiently executed on inexpensive mass-market desktop computer hardware.

Keywords: Computational Ecology; Forest Diversity; parallel algorithms; Forest; Species diversity; biodiversity; desktop computers; Graphics processing

PARALLEL ALGORITHMS FOR FLUID-STRUCTURE INTERACTION PROBLEMS IN HAEMODYNAMICS

SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2011, Vol. 33, No. 4, pp. 1598-1622

Authors: Crosetto, Paolo; Deparis, Simone; Fourestey, Gilles; Quarteroni, Alfio (Ecole Polytech Fed Lausanne, IACS, Chair Modelling & Sci Comp CMCS, CH-1015 Lausanne, Switzerland; Politecn Milan, MOX, Dipartimento Matemat F Brioschi, I-20133 Milan, Italy)

The increasing computational load required by most applications and the limits in hardware performance affecting scientific computing have contributed in recent decades to the development of parallel software and architectures. In fluid-structure interaction (FSI) for haemodynamic applications, parallelization and scalability are key issues (see [L. Formaggia, A. Quarteroni, and A. Veneziani, eds., Cardiovascular Mathematics: Modeling and Simulation of the Circulatory System, Modeling, Simulation and Applications 1, Springer, Milan, 2009]). In this work we introduce a class of parallel preconditioners for the FSI problem obtained by exploiting the block-structure of the linear system. We stress the possibility of extending the approach to a general linear system with a block-structure, then we provide a bound on the condition number of the preconditioned system in terms of the conditioning of the preconditioned diagonal blocks, and finally we show that the construction and evaluation of the devised preconditioner is modular. The preconditioners are tested on a benchmark three-dimensional (3D) geometry discretized in both a coarse and a fine mesh, as well as on two physiological aorta geometries. The simulations that we have performed show an advantage in using the block preconditioners introduced and confirm our theoretical results.

Keywords: blood-flow models; fluid-structure interaction; finite elements; preconditioners; parallel algorithms
