Search results - Inner Mongolia University Library

A matrix-free approach to efficient affine-linear image registration on CPU and GPU

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2017, Vol. 13, No. 1, pp. 205-225

Authors: Ruehaak, Jan; Koenig, Lars; Tramnitzke, Florian; Koestler, Harald; Modersitzki, Jan

Affiliations: Fraunhofer MEVIS, Maria Goeppert Str 3, D-23562 Lubeck, Germany; Univ Erlangen Nurnberg, Lehrstuhl Syst Simulat, Cauerstr 11, D-91058 Erlangen, Germany; Univ Lubeck, Inst Math & Image Comp, Maria Goeppert Str 3, D-23562 Lubeck, Germany

This paper presents a generic approach to highly efficient image registration in two and three dimensions. Both monomodal and multimodal registration problems are considered. We focus on the important class of affine-linear transformations in a derivative-based optimization framework. Our main contribution is an explicit formulation of the objective function gradient and Hessian approximation that allows for very efficient, parallel derivative calculation with virtually no memory requirements. The flexible parallelism of our concept allows for direct implementation on various hardware platforms. Derivative calculations are fully matrix free and operate directly on the input data, thereby reducing the auxiliary space requirements. The proposed approach is implemented on multicore CPU and GPU. Our GPU code outperforms a conventional matrix-based CPU implementation by more than two orders of magnitude, thus enabling usage in real-time scenarios. The computational properties of our approach are extensively evaluated, thereby demonstrating the performance gain for a variety of real-life medical applications.

Keywords: Image registration; Computational efficiency; parallel algorithms; GPU programming; Real-time processing

High-order numerical schemes based on difference potentials for 2D elliptic problems with material interfaces

APPLIED NUMERICAL MATHEMATICS, 2017, Vol. 111, pp. 64-91

Authors: Albright, Jason; Epshteyn, Yekaterina; Medvinsky, Michael; Xia, Qing

Affiliations: Univ Utah, Dept Math, Salt Lake City, UT 84112, USA; North Carolina State Univ, Raleigh, NC 27695, USA

Numerical approximations and computational modeling of problems from Biology and Materials Science often deal with partial differential equations with varying coefficients and domains with irregular geometry. The challenge here is to design an efficient and accurate numerical method that can resolve properties of solutions in different domains/subdomains, while handling the arbitrary geometries of the domains. In this work, we consider 2D elliptic models with material interfaces and develop efficient high-order accurate methods based on Difference Potentials for such problems. (C) 2016 IMACS. Published by Elsevier B.V. All rights reserved.

Keywords: Boundary value problems; Piecewise-constant coefficients; High-order accuracy; Difference potentials; Boundary projections; Interface problems; Non-matching grids; Mixed-order; parallel algorithms; Application to the simulation of the biological cell electropermeabilization model

Global and Local Partitioning of the Charge Transferred in the Parr-Pearson Model

JOURNAL OF PHYSICAL CHEMISTRY A, 2017, Vol. 121, No. 20, pp. 4019-4029

Authors: Ulises Orozco-Valencia, Angel; Gazquez, Jose L.; Vela, Alberto

Affiliations: Ctr Invest & Estudios Avanzados, Dept Quim, Av Inst Politecn Nacl 2508, Ciudad De Mexico 07360, Mexico; Univ Autonoma Metropolitana Iztapalapa, Dept Quim, Av San Rafael Atlixco 186, Ciudad De Mexico 09340, Mexico

Through a simple proposal, the charge transfer obtained from the cornerstone theory of Parr and Pearson is partitioned, for each reactant, in two channels: an electrophilic, through which the species accepts electrons, and the other, a nucleophilic, where the species donates electrons. It is shown that this global model allows us to determine unambiguously the charge-transfer mechanism prevailing in a given reaction. The partitioning is extended to include local effects through the Fukui functions of the reactants. This local model is applied to several emblematic reactions in organic and inorganic chemistry, and we show that besides improving the correlations obtained with the global model it provides valuable information concerning the atoms in the reactants playing the most important roles in the reaction and thus improving our understanding of the reaction under study.

Keywords: parallel algorithms; CHARGE transfer; ELECTROPHILIC addition reactions; INORGANIC chemistry; ORGANIC chemistry

Data Flow Algorithms for Processors with Vector Extensions

JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2017, Vol. 87, No. 1, pp. 21-31

Authors: Barford, Lee; Bhattacharyya, Shuvra S.; Liu, Yanzhou

Affiliations: Keysight Technol Inc, Keysight Labs, 561 Keystone Ave Unit 434, Reno, NV 89503, USA; Univ Maryland, College Pk, MD 20742, USA; Tampere Univ Technol, Tampere, Finland

Full use of the parallel computation capabilities of present and expected CPUs and GPUs requires use of vector extensions. Yet many actors in data flow systems for digital signal processing have internal state (or, equivalently, an edge that loops from the actor back to itself) that impose serial dependencies between actor invocations that make vectorizing across actor invocations impossible. Ideally, issues of inter-thread coordination required by serial data dependencies should be handled by code written by parallel programming experts that is separate from code specifying signal processing operations. The purpose of this paper is to present one approach for so doing in the case of actors that maintain state. We propose a methodology for using the parallel scan (also known as prefix sum) pattern to create algorithms for multiple simultaneous invocations of such an actor that results in vectorizable code. Two examples of applying this methodology are given: (1) infinite impulse response filters and (2) finite state machines. The correctness and performance of the resulting IIR filters and one class of FSMs are studied.

Keywords: Digital signal processing; Data flow computing; Vector processors; parallel algorithms; Graphics processing units
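
The parallel scan idea named in the abstract above can be illustrated with a small sketch. It is not the authors' implementation; it assumes the simplest stateful actor, a first-order IIR filter y[n] = a*y[n-1] + x[n], and the function names `iir_serial` and `iir_scan` are hypothetical. Each sample is treated as an affine map y -> A*y + B. Composing such maps is associative, so all prefix compositions can be built in about log2(N) passes, and within a pass every element is updated independently of the others, which is the property that makes the pattern a candidate for vector hardware.

```python
def iir_serial(a, x, y0=0.0):
    """Reference filter y[n] = a*y[n-1] + x[n], one sample at a time."""
    out = []
    prev = y0
    for xn in x:
        prev = a * prev + xn
        out.append(prev)
    return out


def iir_scan(a, x, y0=0.0):
    """Same filter via an inclusive Hillis-Steele scan over affine maps.

    Element n starts as the map y -> A[n]*y + B[n] with A[n] = a, B[n] = x[n].
    After the scan, (A[n], B[n]) is the composition of maps 0..n, so
    y[n] = A[n]*y0 + B[n].
    """
    A = [float(a)] * len(x)
    B = [float(v) for v in x]
    shift = 1
    while shift < len(B):
        # Compose element n (applied later) with element n - shift (applied
        # earlier). Each index reads only values from the previous pass, so
        # all updates in one pass are independent of each other.
        new_B = B[:shift] + [A[n] * B[n - shift] + B[n] for n in range(shift, len(B))]
        new_A = A[:shift] + [A[n] * A[n - shift] for n in range(shift, len(A))]
        A, B = new_A, new_B
        shift *= 2
    return [An * y0 + Bn for An, Bn in zip(A, B)]


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(iir_serial(0.5, samples))
    print(iir_scan(0.5, samples))
```

This variant performs more arithmetic in total than the serial loop (on the order of N log N instead of N), which is the usual trade made by scan-based formulations in exchange for removing the serial dependency.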

Highly scalable implementation of an implicit matrix-free solver for gas dynamics on GPU-accelerated clusters

JOURNAL OF SUPERCOMPUTING, 2017, Vol. 73, No. 2, pp. 631-638

Authors: Menshov, Igor; Pavlukhin, Pavel

Affiliations: Keldysh Inst Appl Math, Moscow 125047, Russia; Res & Dev Inst Kvant, Moscow 125438, Russia

A numerical approach for solving gas dynamics on Cartesian grids is considered which employs an implicit time marching scheme with the matrix-free Lower-Upper Symmetric Gauss-Seidel (LU-SGS) method for solving discrete equations. Boundary conditions are treated with an embedded-boundary method. The method has two attractive features: (1) algorithmic uniformity of calculations and (2) structured memory accesses that fit massively parallel architectures with GPU accelerators well. We propose a novel CUDA+MPI computational algorithm scalable up to hundreds of GPUs and give in-depth analysis of its implementation (interoperability issues, libraries tuning).

Keywords: CFD; LU-SGS; parallel algorithms; CUDA; MPI

Primal-Dual Parallel Algorithm for Optimal Content Delivery in Cloud CDNs

8th IEEE International Conference on Computational Intelligence and Computing Research, ICCIC 2017

Authors: Mahesh, Gadiraju R Maheswara Rao, V.V. Shankar, R Shiva G Sirisha, Gn V

Affiliation: Dept. of C.S.E., S.R.K.R. Engineering College, Bhimavaram, A.P., India

ISBN: (print) 9781509066209

Content delivery networks have been providing content delivery services for the last two decades using their own infrastructure. Nowadays, content delivery networks have the better option of using storage cloud sites as edge servers. The problems of replicating the content required by the users on optimal sites in the cloud and assigning the sites to users are considered in this work. Given a set of current user requests and cloud sites potential to the user, the combined problem of finding the optimal sites for content placement and content dissemination is a set-cover problem. Previous works solved this problem by using a greedy algorithm. A primal-dual parallel algorithm for optimal content delivery in cloud content delivery networks is proposed in this work. The proposed algorithm is an efficient parallel algorithm that requires only local information. The primal-dual algorithm takes less time than the greedy algorithm, and the experimental results demonstrate this. © 2017 IEEE.

Keywords: parallel algorithms

New results on an improved parallel EM algorithm for estimating generalized latent variable models

81st annual meeting of the Psychometric Society, 2016

Author: von Davier, Matthias

Affiliation: National Board of Medical Examiners, 3750 Market Street, Philadelphia, PA 19104-3102, United States

ISBN: (print) 9783319562933

The second generation of a parallel algorithm for generalized latent variable models, including MIRT models and extensions, on the basis of the general diagnostic model (GDM) is presented. This new development further improves the performance of the parallel-E parallel-M algorithm presented in an earlier report by means of additional computational improvements that produce even larger gains in performance. The additional gain achieved by this second-generation parallel algorithm reaches factor 20 for several of the examples reported with a sixfold gain based on the first generation. The estimation of a multidimensional IRT model for large-scale data may show a larger reduction in runtime compared to a multiple-group model which has a structure that is more conducive to parallel processing of the E-step. Multiple population models can be arranged such that the parallelism directly exploits the ability to estimate multiple latent variable distributions separately in independent threads of the algorithm. © Springer International Publishing AG 2017.

Keywords: parallel algorithms

GPU-based HEVC intra-prediction module

JOURNAL OF SUPERCOMPUTING, 2017, Vol. 73, No. 1, pp. 455-468

Authors: Galiano, V.; Migallon, H.; Herranz, V.; Pinol, P.; Lopez-Granado, O.; Malumbres, M. P.

Affiliations: Miguel Hernandez Univ, Phys & Comp Architecture Dept, Elche 03202, Spain; Miguel Hernandez Univ, Ctr Operat Res, Elche 03202, Spain

The HEVC video coding standard requires nearly 70 % more time than H.264/AVC to encode a video sequence. Manycore architectures can considerably help to reduce the coding time. In this paper, we propose the use of GPUs to perform the intra-picture prediction without any R/D loss. We have evaluated our proposal and compared the results with the ones obtained when running on a CPU. The results show that a time reduction of up to 85 % can be obtained without any R/D loss.

Keywords: parallel algorithms; Video coding; HEVC; GPUs; Performance; Intra-prediction; Manycore

About one parallel algorithm of solving non-local contact problem for parabolic equations

11th International Conference on Computer Science and Information Technologies, CSIT 2017

Authors: Davitashvili, Tinatin; Meladze, Hamlet; Skhirtladze, Nugzar

Affiliations: Faculty of Exact and Natural Sciences IV. Javakhishvili Tbilisi State University Tbilisi Georgia N. Muskhelishvili Institute of Computational Mathematics GTU St. Andrew the First Called Georgian University Tbilisi Georgia Caucasus University Tbilisi Georgia

ISBN: (print) 9781538628317; 9781538628300

In the present work, the initial-boundary problem with a non-local contact condition for the heat (diffusion) equation is considered. For the stated problem, the existence and uniqueness of the solution are proved. The constructed iteration process allows one to reduce the solution of the initial non-classical problem to the solution of a sequence of classical Cauchy-Dirichlet problems. The convergence of the proposed iterative process is proved; the speed of convergence is estimated. The algorithm is suitable for parallel implementation. The specific problem is considered as an example and solved numerically. © 2017 IEEE.

Keywords: parallel algorithms

Novel Graphics Processing Unit-Based Parallel Algorithms for Understanding Species Diversity in Forests

Simulation Multiconference

Authors: Michael Keenan; Ivan Komarov; Roshan M. D'Souza; Rick Riolo

Affiliations: Complex Systems Simulation Lab, University of Wisconsin-Milwaukee; Center for the Study of Complex Systems, University of Michigan

ISBN: (print) 9781618397881

The mechanisms which lead to high tree species diversity in forests are not yet fully understood. One of the leading theories is that the natural enemies' interaction can give rise to a survival advantage for rare tree species over more common species. One way of exploring such observations is through the use of individual-based modeling. An individual-based model (IBM) is a bottom-up simulation where the bulk dynamics emerge from the interaction of individual constituents. Due to their emergent nature, IBMs are population sensitive where achieving a high degree of accuracy is synonymous with matching system population sizes. Consequently, such models may run into the millions of individuals and become computationally intensive. Here the computing power of graphics processing units (GPUs) is used to overcome this computation limitation. The algorithms developed here for GPUs allow this model to be scaled into the millions of individuals and run on standard desktop computers. This effectively puts supercomputing power at the fingertips of researchers, students, and forest management services alike. The parallel implementation developed here was compared against a serial implementation running on the central processing unit. The results show a significant performance gain for the parallel implementation while maintaining statistical accuracy. This shows that realistically sized models can be efficiently executed on inexpensive mass-market desktop computer hardware.

Keywords: Computational Ecology; Forest Diversity; parallel algorithms; Forest; Species diversity; biodiversity; desktop computers; Graphics processing
