Search Results - Inner Mongolia University Library

2016 International Congress on Advances in Nuclear Power Plants, ICAPP 2016

Authors: Zhu, Chenglin Li, Shuo Yan, Yuhang Yu, Hui Chen, Yixue State Nuclear Power Software Development Center SPIC National Energy Key Laboratory of Nuclear Power Software Changping District Beijing 102209 China

ISBN: (print) 9781510825949

The shared-memory programming model, represented by OpenMP, has developed rapidly alongside multi-core technology. The method of characteristics (MOC) for solving the neutron transport equation converges slowly in the lattice calculations of nuclear design code systems. However, MOC is very well suited to parallel calculation. In this paper, OpenMP parallel programming is applied in a new neutron transport lattice physics code, COSLATC, which is an essential component of the COSINE (Core and System Integrated Engine for design and analysis) software package. By analyzing the OpenMP programming model and studying the design of the fork-join parallel programming model, energy-group parallel calculation is adopted in the MOC module. After studying the cost model of OpenMP parallel programming and analyzing the factors that affect the performance of OpenMP parallel algorithms, this paper proposes a series of OpenMP programming optimization methods, including the expansion and merging of parallel regions and optimal loop scheduling, among others. Moreover, to address the unordered migration of threads under operating system scheduling, the rationale of the thread affinity technique in the OpenMP standard is analyzed, and the implementation scheme of its interface in the computer component is designed as well. The numerical results show that the results of the parallel method agree well with the original serial results. Better speedup and parallel efficiency are achieved with the OpenMP programming optimization methods and the thread affinity technique.
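The energy-group parallelism described in this abstract follows the fork-join pattern: a master thread forks one task per energy group, each task sweeps its own group independently, and the results are joined before the calculation continues. The sketch below illustrates only that pattern, in Python with `concurrent.futures`; it is not COSLATC code. The function `sweep_group`, its toy attenuation formula, and the cross-section values are hypothetical stand-ins for a real MOC sweep.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def sweep_group(sigma_t, n_segments=1000, ds=0.01):
    """Toy stand-in for one energy group's sweep (hypothetical):
    attenuate a unit flux along n_segments of length ds and integrate it."""
    psi = 1.0
    total = 0.0
    for _ in range(n_segments):
        psi *= math.exp(-sigma_t * ds)
        total += psi * ds
    return total


def solve_serial(sigmas):
    """Reference: sweep the energy groups one after another."""
    return [sweep_group(s) for s in sigmas]


def solve_fork_join(sigmas, n_threads=4):
    """Fork one task per energy group, then join.
    map() returns results in input order, so groups stay aligned."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(sweep_group, sigmas))


def speedup_and_efficiency(t_serial, t_parallel, n_threads):
    """Speedup S = T_serial / T_parallel; parallel efficiency E = S / n_threads."""
    speedup = t_serial / t_parallel
    return speedup, speedup / n_threads


# Hypothetical total cross sections, one per energy group.
sigmas = [0.2, 0.5, 1.0, 2.0]
serial = solve_serial(sigmas)
parallel = solve_fork_join(sigmas)
```

Because each group is computed by the same function on the same input, `parallel` matches `serial` exactly, which mirrors the abstract's check of parallel against serial results. In CPython the global interpreter lock prevents CPU-bound threads from running simultaneously, so this shows the structure rather than the speedup that OpenMP threads would deliver.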

Keywords: parallel programming

NASA Technical Reports Server (NTRS) 20000108751: MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems

2016

NASA Technical Reports Server (NTRS) 20000108751: MLP: A Parallel Programming Alternative to MPI for New Shared Memory Parallel Systems by NASA Technical Reports Server (NTRS); NASA Technical Reports Server (NTRS); pu...

Keywords: (ntrs) 20000108751: alternative compilers computational fluid dynamics counting cray computers memory memory (computers) messages multiprocessing (computers) nasa technical reports server (ntrs) parallel parallel programming silicon taft, james r.

NASA Technical Reports Server (NTRS) 20020063612: F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

2016

NASA Technical Reports Server (NTRS) 20020063612: F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming by NASA Technical Reports Server (NTRS); NASA Technical Reports Ser...

Keywords: (ntrs) 20020063612: algorithms architecture (computers) cabling: communication cables computer graphics deriving dinucci, david c. f-nets formalism mathematical models nasa technical reports server (ntrs) neural nets parallel programming semantics software engineering turing machines

NASA Technical Reports Server (NTRS) 20020041930: Tolerant (Parallel) Programming

2017

NASA Technical Reports Server (NTRS) 20020041930: Tolerant (Parallel) Programming by NASA Technical Reports Server (NTRS); published by

Keywords: (ntrs) (parallel) 20020041930: algorithms computation construction data processing dinucci, david c. folding messages nasa technical reports server (ntrs) nets parallel programming program verification (computers) programming programming languages reports software development tools subroutines

OpenMP as a high-level specification language for parallelism: And its use in evaluating parallel programming systems

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, vol. 9903 LNCS, pp. 141-155

Authors: Grossman, Max Shirako, Jun Sarkar, Vivek Department of Computer Science Rice University Houston United States

ISBN: (print) 9783319455495

While OpenMP is the de facto standard of shared memory parallel programming models, a number of alternative programming models and runtime systems have arisen in recent years. Fairly evaluating these programming systems can be challenging and can require significant manual effort on the part of researchers. However, it is important to facilitate these comparisons as a way of advancing both the available OpenMP runtimes and the research being done with these novel programming systems. In this paper we present the OpenMP-to-X framework, an open source tool for mapping OpenMP constructs and APIs to other parallel programming systems. We apply OpenMP-to-X to the HClib parallel programming library, and use it to enable a fair and objective comparison of performance and programmability among HClib, GNU OpenMP, and Intel OpenMP. We use this investigation to expose performance bottlenecks in both the Intel OpenMP and HClib runtimes, to motivate improvements to the HClib programming model and runtime, and to propose potential extensions to the OpenMP standard. Our performance analysis shows that, across a wide range of benchmarks, HClib demonstrates significantly less volatility in its performance with a median standard deviation of 1.03% in execution times and outperforms the two OpenMP implementations on 15 out of 24 benchmarks. © Springer International Publishing Switzerland 2016.
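The volatility figure quoted in this abstract is a median of per-benchmark standard deviations expressed as a percentage of execution time. The sketch below shows one way such a statistic could be computed. It assumes the percentage is the sample standard deviation divided by the mean, which the abstract does not state; the function names and the timings are made up for illustration.

```python
from statistics import mean, median, stdev


def relative_stdev_percent(times):
    """Sample standard deviation as a percentage of the mean (assumed definition)."""
    return 100.0 * stdev(times) / mean(times)


def median_volatility(runs_by_benchmark):
    """Median of the per-benchmark relative standard deviations."""
    return median(relative_stdev_percent(t) for t in runs_by_benchmark.values())


# Hypothetical execution times in seconds, three runs per benchmark.
runs = {
    "bench_a": [10.0, 10.0, 10.0],  # no run-to-run variation
    "bench_b": [9.0, 10.0, 11.0],   # standard deviation is 10% of the mean
    "bench_c": [8.0, 10.0, 12.0],   # standard deviation is 20% of the mean
}
volatility = median_volatility(runs)
```

Taking the median rather than the mean keeps a single noisy benchmark from dominating the summary.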

Keywords: parallel programming

Parallel Programming in Actor-Based Applications via OpenCL

16th Annual Middleware Conference

Authors: Harvey, Paul Hentschel, Kristian Sventek, Joseph Univ Glasgow Sch Comp Sci Glasgow Lanark Scotland Univ Oregon Dept Comp & Informat Sci Eugene OR 97403 USA

ISBN: (print) 9781450336185

GPU and multicore hardware architectures are commonly used in many different application areas to accelerate problem solutions relative to single CPU architectures. The typical approach to accessing these hardware architectures requires embedding logic into the programming language used to construct the application; the two primary forms of embedding are calls to API routines to access the concurrent functionality, or pragmas providing concurrency hints to a language compiler such that particular blocks of code are targeted to the concurrent functionality. The former approach is verbose and semantically bankrupt, while the success of the latter approach is restricted to simple, static uses of the functionality. This paper presents an extension to an existing actor-based programming model and runtime to support executing applications on parallel hardware architectures. Besides the glove-like fit of a kernel to the actor abstraction, quantitative code analysis shows that actor-based kernels are always significantly simpler than API-based coding, and generally simpler than pragma-based coding. Structuring applications in this manner enables the runtime to automate initialisation of and interaction with these parallel hardware platforms. Performance measurements show that the overheads of actor-based kernels are commensurate with API-based kernels, and range from equivalent to vastly improved relative to pragma-based annotations, both for sample and real-world applications.

Keywords: parallel programming actors performance OpenCL middleware

Parallel programming of model-based geostatistics for improved reservoir characterization

17th Annual Conference of the International Association for Mathematical Geosciences, IAMG 2015

Authors: Al-Mudhafar, W. Hakim, S. Craft and Hawkins Department of Petroleum Engineering Louisiana State University United States Institute of Drilling Technology and Fluid Mining Freiberg University of Mining and Technology Germany

ISBN: (print) 9783000503375

To overcome the restriction of unbiased predictors in kriging interpolation, Bayesian kriging integrates a prior distribution of variogram parameters, such as coefficients, data variance, range, and nugget, to be adopted as a qualified guess in the spatial estimation. The observation uncertainty is represented as a posterior distribution and a predictive parameter distribution, avoiding unrealistic small regions within the observations, to attain optimal unbiased linear interpolation through the Bayesian kriging algorithm. Prior to estimating the predictive spatial distributions, the procedure includes multiple computations of an empirical variogram for the petrophysical properties, given the posterior distribution of the variogram parameters, to create many equiprobable stochastic reservoir images. Based on the statistical evaluation, these realizations are ranked to select three quartiles (P10, P50, and P90).
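Two steps named in this abstract can be sketched briefly: computing an empirical variogram, and ranking realizations to pick P10, P50, and P90. The Python below is a minimal illustration under stated assumptions, not the authors' implementation. It uses the classical (Matheron) estimator on 1D sample locations; the function names, the lag tolerance, the nearest-rank selection rule, and all data are hypothetical. The Bayesian treatment of the variogram parameters and the parallelization are not shown.

```python
def empirical_variogram(xs, zs, lags, tol):
    """Classical estimator: gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)),
    taken over the N(h) sample pairs whose separation is within tol of lag h."""
    gamma = {}
    for h in lags:
        sq_diffs = [
            (zs[i] - zs[j]) ** 2
            for i in range(len(xs))
            for j in range(i + 1, len(xs))
            if abs(abs(xs[i] - xs[j]) - h) <= tol
        ]
        # None marks a lag with no sample pairs.
        gamma[h] = sum(sq_diffs) / (2 * len(sq_diffs)) if sq_diffs else None
    return gamma


def pick_quantile_realizations(scores, probs=(0.10, 0.50, 0.90)):
    """Rank realizations by a summary score (ascending) and return, for each
    probability, the index of the realization at the nearest rank."""
    order = sorted(range(len(scores)), key=lambda k: scores[k])
    last = len(order) - 1
    return {p: order[min(last, max(0, round(p * last)))] for p in probs}


# Hypothetical 1D samples: locations and a petrophysical property.
xs = [0.0, 1.0, 2.0, 3.0]
zs = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_variogram(xs, zs, lags=[1.0, 2.0], tol=0.1)

# Hypothetical summary scores, one per stochastic realization.
scores = [5.0, 1.0, 3.0, 2.0, 4.0]
picks = pick_quantile_realizations(scores)
```

Here the scores are ranked in ascending order, so P10 is the low case. Some reservoir workflows use the opposite convention, so the ordering should be checked against the study's own definition.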

Keywords: parallel programming

OpenH: A Novel Programming Model and API for Developing Portable Parallel Programs on Heterogeneous Hybrid Servers

IEEE Access, 2024, vol. 12, pp. 23666-23694

Authors: Farrelly, Simon Manumachu, Ravi Reddy Lastovetsky, Alexey Univ Coll Dublin Sch Comp Sci Dublin 4 Ireland

Heterogeneous nodes composed of a multicore CPU and accelerators are today's norm in high-performance computing (HPC) platforms due to their superior performance and energy efficiency. Tools such as OpenCL and hybrid combinations such as OpenMP plus OpenACC are used for developing portable parallel programs for such nodes. However, these tools have some drawbacks, including a lack of compiler support for nested parallelism, performance portability, automatic heterogeneous workload distribution, user-friendly thread placement, and processor affinity essential to the portable performance of hybrid programs executing on such nodes. In this paper, we propose OpenH, a novel programming model and library API for developing portable parallel programs on heterogeneous hybrid servers composed of a multicore CPU and one or more different types of accelerators. OpenH integrates Pthreads, OpenMP, and OpenACC seamlessly to facilitate the development of hybrid parallel programs. An OpenH hybrid parallel program starts as a single main thread, creating a group of Pthreads called hosting Pthreads. A hosting Pthread then leads the execution of a software component of the program, either an OpenMP multithreaded component running on the CPU cores or an OpenACC (or OpenMP) component running on one of the accelerators of the server. The OpenH library provides API functions that allow programmers to get the configuration of the executing environment and bind the hosting Pthreads (and hence the execution of components) of the program to the CPU cores of the hybrid server to get the best performance. We illustrate the OpenH programming model and library API using two hybrid parallel applications based on matrix multiplication and 2D fast Fourier transform for the most general case of a hybrid hyperthreaded server comprising $p$ computing devices. Finally, we demonstrate the practical performance and energy consumption of OpenH for the hybrid parallel matrix multiplication application on a

Keywords: parallel computing parallel programming heterogeneous platform hybrid platform accelerators OpenMP OpenACC Pthreads

An intelligent system for segmenting lung image using parallel programming

International Conference on Data Mining and Advanced Computing (SAPIENCE)

Authors: Shiju Thomas M.Y. Sherin Babu Dept. of Computer Science Rajagiri College of Social Sciences (Autonomous) Cochin Kerala India Dept. of Computer Science Cochin University of Science and Technology Cochin Kerala India

ISBN: (print) 9781467385954

Computed tomography is used nowadays for analyzing problems in the human body, and it plays a very important role in diagnosing defects in patients. Computed tomography only became feasible with the development of computer signal processing capabilities. Technology has improved to capture the inner parts of the human body from 2D to 3D and also from 3D to 4D. A tomographic image is a cross-sectional image, or slice, through the body. A radiologist has to analyze the slices one by one to detect any defect, which takes a long time when the number of slices is large. This paper presents a system that predicts the affected areas of human lungs from slices obtained from a CT scan machine, using parallel image processing and enhancement algorithms, to assist radiologists in making their final decisions. The proposed model was tested on the human lung for the detection of cancer. The scanned images are stored in the Digital Imaging and Communications in Medicine (DICOM) format.

Keywords: Lungs Image segmentation Computed tomography DICOM Cancer parallel programming

Evaluating OpenMP 4.0's Effectiveness as a Heterogeneous Parallel Programming Model

IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW)

Authors: Matt Martineau Simon McIntosh-Smith Wayne Gaudin HPC Group University of Bristol Bristol United Kingdom Atomic Weapons Establishment Aldermaston United Kingdom

ISBN: (print) 9781509036837

Although the OpenMP 4.0 standard has been available since 2013, support for GPUs has been absent up until very recently, with only a handful of experimental compilers available. In this work we evaluate the performance of Cray's new NVIDIA GPU targeting implementation of OpenMP 4.0, with the mini-apps TeaLeaf, CloverLeaf and BUDE. We successfully port each of the applications, using a simple and consistent design throughout, and achieve performance on an NVIDIA K20X that is comparable to Cray's OpenACC in all cases. BUDE, a compute bound code, required 2.2x the runtime of an equivalently optimised CUDA code, which we believe is caused by an inflated frequency of control flow operations and less efficient arithmetic optimisation. Impressively, both TeaLeaf and CloverLeaf, memory bandwidth bound codes, only required 1.3x the runtime of hand-optimised CUDA implementations. Overall, we find that OpenMP 4.0 is a highly usable open standard capable of performant heterogeneous execution, making it a promising option for scientific application developers.

Keywords: Standards Graphics processing units Performance evaluation Complexity theory parallel processing parallel programming Runtime
