This work presents the main steps towards a parallel version of the PIC (Particle-In-Cell) code XPDP1 (X Plasma Device Planar 1-Dimensional), which uses a Monte Carlo procedure to treat collisions among the particles of different species of neutral and ionized pure gases such as argon, oxygen and others. The graphical interface of XPDP1 was removed, and the code was parallelized with a hybrid approach: message passing for distributed memory (using MPI) and shared memory (using OpenMP). Efficiency and speedup tests were carried out on a homogeneous hybrid cluster, and the results show speedups of approximately ten for 32 cores on 4 servers, which makes the code usable on problems that are infeasible with the serial version.
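The Monte Carlo collision (MCC) step at the heart of such a code can be illustrated with a minimal sketch. This is not XPDP1's actual model: the gas density, cross section, time step, and the simplified 1D elastic-scattering rule below are all illustrative assumptions; only the collision-probability formula P = 1 - exp(-n·σ·|v|·Δt) is the standard MCC ingredient.

```python
import math
import random

def mcc_step(velocities, n_gas, sigma, dt, rng):
    """Monte Carlo collision step: each particle collides with the
    background gas with probability P = 1 - exp(-n * sigma * |v| * dt)."""
    out = []
    for v in velocities:
        p_coll = 1.0 - math.exp(-n_gas * sigma * abs(v) * dt)
        if rng.random() < p_coll:
            # Toy elastic collision with a cold, heavy neutral: in this
            # simplified 1D model the particle loses a fixed fraction of
            # its speed and scatters with a random sign.
            v = abs(v) * rng.choice([-1.0, 1.0]) * 0.9
        out.append(v)
    return out

rng = random.Random(42)
vels = [1.0e5 * (rng.random() - 0.5) for _ in range(1000)]
new_vels = mcc_step(vels, n_gas=3.0e21, sigma=1.0e-19, dt=1.0e-9, rng=rng)
```

In a hybrid parallelization of this loop, the particle list would be partitioned across MPI ranks by spatial domain, and the per-particle loop threaded with OpenMP, since each particle's collision test is independent.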
The abundance of semantically related information has resulted in semantic heterogeneity. Ontology matching is among the techniques used to resolve semantic heterogeneity; however, being computationally intensive, ontology matching can be a time-consuming process. Medium- to large-scale ontologies can take from hours up to days of computation time, depending on the computational resources used and the complexity of the matching algorithms. This delay in producing results makes ontology matching unsuitable for semantic-web-based interactive and semi-real-time systems. This paper presents SPHeRe, a performance-oriented initiative that improves ontology matching performance by exploiting parallelism over a multicore cloud platform. Parallelism has largely been overlooked by ontology matching systems. SPHeRe seizes this opportunity and provides a solution by: (i) creating and caching serialized subsets of candidate ontologies with single-step parallel loading; (ii) producing lightweight, matcher-based, redundancy-free subsets that yield smaller memory footprints and faster load times; and (iii) implementing data-parallel distribution over subsets of candidate ontologies, exploiting the multicore distributed hardware of the cloud platform for parallel ontology matching and execution. Performance evaluation of SPHeRe on a tri-node (12-core) private cloud infrastructure has shown up to 3 times faster ontology load time with up to 8 times smaller memory footprint than the Web Ontology Language (OWL) frameworks Jena and OWLAPI. Furthermore, by utilizing computation resources efficiently, SPHeRe provides the best scalability in contrast with other ontology matching systems, i.e., GOMMA, LogMap, AROMA, and AgrMaker. On a private cloud instance with 8 cores, SPHeRe outperforms the most performance-efficient ontology matching system, GOMMA, by 40% in scalability and 4 times in performance.
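The data-parallel idea in point (iii) can be sketched as follows. This is not SPHeRe's implementation: the string-similarity matcher, the 0.8 threshold, and the chunking scheme are illustrative assumptions; the point is only that the candidate-pair space partitions cleanly across workers.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher

def match_subset(pairs, threshold=0.8):
    # One matcher instance working on its own subset of candidate
    # concept pairs; a stand-in for SPHeRe's real matchers.
    return [(a, b) for a, b in pairs
            if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold]

def parallel_match(src_concepts, tgt_concepts, workers=4):
    # Data parallelism: split the cross product of concepts into
    # subsets and match each subset independently.
    pairs = [(a, b) for a in src_concepts for b in tgt_concepts]
    chunk = max(1, len(pairs) // workers)
    subsets = [pairs[i:i + chunk] for i in range(0, len(pairs), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(match_subset, subsets)
    return [m for sub in results for m in sub]

src = ["Author", "Paper", "Conference"]
tgt = ["author", "Article", "ConferenceEvent"]
matches = parallel_match(src, tgt)
```

On a real multicore cloud node, each subset would go to a separate process or VM core rather than a thread, but the partitioning logic is the same.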
This paper investigates Th-232/U-233 fuel cycles in a VVER-1000 reactor through computer simulation. The 3D core geometry of the VVER-1000 system was modeled using the Serpent Monte Carlo code, version 1.1.19. The Serpent code, parallelized via the Message Passing Interface (MPI), was run on a 12-core workstation with 48 GB of RAM. A Th-232/U-235/U-238 oxide mixture was considered as fuel in the core; as the mass fraction of Th-232 was increased through 0.05, 0.1, 0.2, 0.3, and 0.4, the mass fraction of U-238 was decreased by the same amount. The calculations were made for a thermal power of 3000 MW. For the burnup analyses, the core was assumed to deplete from the initial fresh core up to a burnup of 16 MWd/kgU without refuelling. In the burnup calculations, a burnup interval of 360 effective full power days (EFPDs) was defined. As functions of burnup, the mass changes of Th-232, U-233, U-238, Np-237, Pu-239, Am-241, and Cm-244 were evaluated, and the flux and criticality of the system were also calculated.
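The breeding mechanism that such burnup calculations track, Th-232 capturing a neutron and (via Pa-233 beta decay) becoming U-233, can be illustrated with a minimal one-group depletion sketch. This is a simplification of what Serpent solves (the real code tracks hundreds of nuclides with spectrum-dependent cross sections); the flux value and the 7.4 b thermal capture cross section are rough illustrative assumptions, and intermediate Th-233 is lumped into Pa-233.

```python
import math

PHI = 3.0e14          # assumed one-group flux [n/cm^2/s]
SIGMA_TH = 7.4e-24    # assumed Th-232 capture cross section [cm^2] (~7.4 b)
LAMBDA_PA = math.log(2) / (26.97 * 86400.0)  # Pa-233 half-life ~27 days

def deplete(n_th, n_pa, n_u3, days, dt=3600.0):
    """Explicit Euler integration of the Th-232 -> Pa-233 -> U-233 chain."""
    t = 0.0
    while t < days * 86400.0:
        r_cap = SIGMA_TH * PHI * n_th   # Th-232 neutron captures per second
        r_dec = LAMBDA_PA * n_pa        # Pa-233 beta decays per second
        n_th -= r_cap * dt
        n_pa += (r_cap - r_dec) * dt
        n_u3 += r_dec * dt
        t += dt
    return n_th, n_pa, n_u3

# One 360-EFPD interval starting from pure Th-232 (atoms, arbitrary scale).
th, pa, u3 = deplete(1.0e24, 0.0, 0.0, days=360)
```

The same qualitative behavior appears in the paper's results: Th-232 mass falls monotonically while U-233 builds up toward an equilibrium set by the flux level.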
Devices that form a wireless sensor network (WSN) are usually deployed remotely, in large numbers, in a sensing field. WSNs have enabled numerous applications in which location awareness is usually required; therefore, numerous localization systems have been proposed to assign geographic coordinates to each node in a network. In this paper, we describe and evaluate WSNLS (Wireless Sensor Network Localization System), an integrated software framework that provides tools for localizing network nodes and an environment for tuning and testing various localization schemes. Simulation experiments can be performed on parallel and multi-core computers or computer clusters. The main component of the WSNLS framework is its library of solvers for calculating the geographic coordinates of the nodes in a network. Our original solution implemented in WSNLS is a localization system that combines simple triangle geometry with stochastic optimization to determine the position of nodes with unknown locations in the sensing field. We describe and discuss the performance of our system in terms of location-estimation accuracy and computation time. Numerical results presented in the paper confirm that our hybrid scheme gives accurate location estimates of network nodes in reasonable computing time, and that the WSNLS framework can be successfully used for efficient tuning and verification of different localization techniques.
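The combination of range geometry with stochastic optimization can be sketched as below. This is not the WSNLS solver itself: the shrinking-step random search, the anchor layout, and all parameter values are illustrative assumptions; the shared idea is minimizing the squared mismatch between measured and implied anchor distances.

```python
import math
import random

def residual(p, anchors, dists):
    # Sum of squared differences between measured ranges and the ranges
    # implied by candidate position p.
    return sum((math.dist(p, a) - d) ** 2 for a, d in zip(anchors, dists))

def localize(anchors, dists, rng, iters=2000, step=5.0):
    # Stochastic local search: start at the anchor centroid, perturb
    # the estimate, keep improvements, shrink the step as we converge.
    best = [sum(a[0] for a in anchors) / len(anchors),
            sum(a[1] for a in anchors) / len(anchors)]
    best_r = residual(best, anchors, dists)
    for i in range(iters):
        s = step * (1.0 - i / iters) + 1e-3
        cand = [best[0] + rng.uniform(-s, s), best[1] + rng.uniform(-s, s)]
        r = residual(cand, anchors, dists)
        if r < best_r:
            best, best_r = cand, r
    return best

rng = random.Random(7)
anchors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
true_pos = (37.0, 62.0)
dists = [math.dist(true_pos, a) for a in anchors]  # noiseless ranges
est = localize(anchors, dists, rng)
```

With noisy range measurements, the residual no longer reaches zero and the stochastic search serves as a robust least-squares solver, which is exactly the situation where pure triangle geometry alone becomes unreliable.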
The locally one-dimensional finite-difference time-domain (LOD-FDTD) method is a promising implicit technique for solving Maxwell's equations in numerical electromagnetics. This paper describes an efficient Message Passing Interface (MPI) parallel implementation of the LOD-FDTD method for Debye-dispersive media. Its computational efficiency is demonstrated to be superior to that of the parallel ADI-FDTD method. We demonstrate the effectiveness of the proposed parallel algorithm by simulating a bio-electromagnetic problem: deep brain stimulation (DBS) in the human body.
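The computational kernel that makes LOD-FDTD implicit is a tridiagonal solve along each grid line in each sub-step, and parallelizing these solves is the crux of the MPI implementation. A minimal sketch of that kernel, the Thomas algorithm, is below; the test system and its coefficients are illustrative assumptions unrelated to the paper's Debye-dispersive update.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system Ax = d, where a is the sub-diagonal
    (a[0] unused), b the main diagonal, and c the super-diagonal
    (c[-1] unused). Each LOD-FDTD half-step reduces to one such solve
    per grid line."""
    n = len(b)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                 # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Illustrative (-1, 2, -1) stencil system with a known solution.
n = 5
a = [0.0] + [-1.0] * (n - 1)
b = [2.0] * n
c = [-1.0] * (n - 1) + [0.0]
x_true = [1.0, 2.0, 3.0, 2.0, 1.0]
d = [(a[i] * x_true[i - 1] if i > 0 else 0.0) + b[i] * x_true[i] +
     (c[i] * x_true[i + 1] if i < n - 1 else 0.0) for i in range(n)]
x = thomas(a, b, c, d)
```

Because each grid line's system is independent of its neighbors, lines can be distributed across MPI ranks, with halo exchanges needed only between the two directional sub-steps.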
Current generations of NUMA node clusters feature multicore or manycore processors. Programming such architectures efficiently is a challenge, because numerous hardware characteristics have to be taken into account, especially the memory hierarchy. One appealing idea for improving the performance of parallel applications is to decrease their communication costs by matching the communication pattern to the underlying hardware architecture. In this paper, we detail the algorithm and techniques proposed to achieve this result: first, we gather both the communication-pattern information and the hardware details. Then, we compute a relevant reordering of the application's process ranks. Finally, those new ranks are used to reduce the communication costs of the application.
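The reordering step can be made concrete with a toy sketch. Real tools use hierarchical heuristics because the problem is NP-hard; the brute-force search, the traffic matrix, and the intra/inter-socket distances of 1 and 10 below are all illustrative assumptions.

```python
from itertools import permutations

def comm_cost(perm, traffic, dist):
    # Total cost: traffic between ranks i and j, weighted by the
    # hardware distance between the cores they are placed on.
    n = len(perm)
    return sum(traffic[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))

def best_reorder(traffic, dist):
    # Brute force over all placements; O(n!), so only for tiny n.
    n = len(traffic)
    return min(permutations(range(n)),
               key=lambda p: comm_cost(p, traffic, dist))

# 4 ranks: heavy traffic within pairs (0,1) and (2,3). Assumed hardware:
# cores 0 and 2 share a socket (distance 1), as do cores 1 and 3;
# cross-socket distance is 10.
traffic = [[0, 8, 0, 1], [8, 0, 1, 0], [0, 1, 0, 8], [1, 0, 8, 0]]
dist    = [[0, 10, 1, 10], [10, 0, 10, 1], [1, 10, 0, 10], [10, 1, 10, 0]]
perm = best_reorder(traffic, dist)
```

The optimal permutation places each heavily communicating rank pair on same-socket cores, so most traffic travels over the cheap intra-socket links, which is exactly the effect rank reordering aims for.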
To let applications with dynamic data sharing among threads benefit from GPU acceleration, we propose a novel software transactional memory system for GPU architectures (GPU-STM). The major challenges are ensuring good scalability with respect to the massive multithreading of GPUs and preventing livelocks caused by the SIMT execution paradigm of GPUs. To this end, we propose (1) a hierarchical validation technique and (2) an encounter-time lock-sorting mechanism to deal with these two challenges, respectively. Evaluation shows that GPU-STM outperforms coarse-grained locks on GPUs by up to 20x.
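The principle behind lock sorting, acquiring locks in a globally consistent order so that no cyclic wait can form, is independent of the GPU setting and can be sketched on CPU threads. This is not GPU-STM's mechanism itself (which sorts locks encountered during a transaction on SIMT hardware); the bank-account scenario and the `id`-based ordering are illustrative assumptions.

```python
import threading

class Account:
    _next_id = 0
    def __init__(self, balance):
        self.id = Account._next_id   # global order used for lock sorting
        Account._next_id += 1
        self.lock = threading.Lock()
        self.balance = balance

def transfer(src, dst, amount):
    # Sort the locks before acquiring them: every thread takes them in
    # the same global order, so no cyclic wait (deadlock) can arise
    # even when two threads transfer in opposite directions.
    first, second = sorted((src, dst), key=lambda acc: acc.id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

x, y = Account(100), Account(100)
threads = [threading.Thread(target=transfer, args=(x, y, 1)) for _ in range(50)]
threads += [threading.Thread(target=transfer, args=(y, x, 1)) for _ in range(50)]
for t in threads: t.start()
for t in threads: t.join()
```

Without the `sorted` call, a thread doing x-to-y and another doing y-to-x could each grab its first lock and wait forever for the other's; in a SIMT warp the analogous situation manifests as livelock, which is why GPU-STM sorts locks at encounter time.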
The electromagnetic transient (EMT) simulation of a large-scale power system consumes so much computational power that parallel programming techniques are urgently needed in this area; realistic-sized power systems include thousands of buses, generators, and transmission lines. Massive-thread computing is one of the key developments that can increase EMT computational capabilities substantially when the processing unit has enough hardware cores. Compared to the traditional CPU, the graphics processing unit (GPU) has many more cores with distributed memory, which can offer higher data throughput. This paper proposes a massive-thread EMT program (MT-EMTP) and develops massive-thread parallel modules for linear passive elements, the universal line model, and the universal machine model for offline EMT simulation. An efficient node-mapping structure is proposed to transform the original power-system admittance matrix into a block-node diagonal sparse format that exploits the massive-thread parallel GPU architecture. The developed MT-EMTP program has been tested on large-scale power systems of up to 2458 three-phase buses with detailed component modeling. The simulation results and execution times are compared with the mainstream commercial software EMTP-RV to show the improvement in performance at equivalent accuracy.
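Why a block-diagonal layout suits a GPU can be seen from the matrix-vector product it enables: every block is independent, so each can be handled by its own thread block with no synchronization between blocks. The sketch below is an illustrative serial version, not the paper's node-mapping structure or its CUDA kernels.

```python
def block_diag_matvec(blocks, x):
    """Multiply a block-diagonal matrix (a list of dense square blocks)
    by vector x. Each block touches only its own slice of x, so on a
    GPU every block maps naturally to an independent thread block."""
    y = []
    offset = 0
    for blk in blocks:
        n = len(blk)
        seg = x[offset:offset + n]
        y.extend(sum(blk[r][c] * seg[c] for c in range(n)) for r in range(n))
        offset += n
    return y

# Illustrative system: a 2x2 block and a 3x3 block on the diagonal,
# standing in for decoupled subnetworks of an admittance matrix.
blocks = [[[2.0, 1.0],
           [0.0, 3.0]],
          [[1.0, 0.0, 1.0],
           [0.0, 2.0, 0.0],
           [1.0, 0.0, 1.0]]]
x = [1.0, 2.0, 1.0, 1.0, 1.0]
y = block_diag_matvec(blocks, x)
```

In EMT simulation, transmission lines with nonzero travel time naturally decouple the network into such blocks, which is what the node-mapping transformation exploits.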
Transactional contention management policies show considerable variation in relative performance as workload characteristics change. Consequently, incorporating fixed-policy Transactional Memory (TM) in general-purpose computing systems is suboptimal by design and renders such systems susceptible to pathologies. Of particular concern are Hardware TM (HTM) systems, where traditional designs have policies hardwired in silicon. Adaptive HTMs hold promise, but pose major challenges in terms of design and verification costs. In this paper, we present the ZEBRA HTM design, which lays down a simple yet high-performance approach to implementing adaptive contention management in hardware. Prior work in this area has associated contention with transactional code blocks. However, we discover that by associating contention with the data (cache blocks) accessed by transactional code, rather than with the code block itself, we achieve a neat match in granularity with the cache coherence protocol. This leads to a design that is very simple and yet able to track closely, or exceed, the performance of the best-performing policy for a given workload. ZEBRA therefore brings together the inherent benefits of traditional eager HTMs (parallel commits) and lazy HTMs (good optimistic concurrency without deadlock-avoidance mechanisms), combining them into a low-complexity design.
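The core idea, keying contention state on cache blocks rather than code, can be sketched in a few lines. This is a software caricature of a hardware mechanism: the counter table, the threshold of 3, and the eager/lazy labels are illustrative assumptions, while in ZEBRA the per-line state would live alongside the coherence metadata.

```python
CACHE_LINE = 64  # assumed cache-line size in bytes

class ContentionTable:
    """Per-cache-line conflict counters drive the commit policy:
    lines that keep conflicting are handled lazily (optimistic,
    late conflict detection), while conflict-free lines commit
    eagerly (in place, enabling parallel commits)."""
    def __init__(self, threshold=3):
        self.counts = {}
        self.threshold = threshold

    def record_conflict(self, addr):
        line = addr // CACHE_LINE
        self.counts[line] = self.counts.get(line, 0) + 1

    def policy(self, addr):
        line = addr // CACHE_LINE
        return "lazy" if self.counts.get(line, 0) >= self.threshold else "eager"

tbl = ContentionTable()
for _ in range(3):
    tbl.record_conflict(0x1004)   # repeated conflicts on one hot line
```

Because the key is the cache-line index, the granularity matches the coherence protocol exactly: the same transaction can treat its hot shared lines lazily and its private lines eagerly, which per-code-block policies cannot express.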