Search results - Inner Mongolia University Library

IEEE Conference on High Performance Extreme Computing (HPEC)

Authors: Chansup Byun Jeremy Kepner William Arcand David Bestor Bill Bergeron Vijay Gadepally Matthew Hubbell Peter Michaleas Julie Mullen Andrew Prout Antonio Rosa Charles Yee Albert Reuther MIT Lincoln Laboratory Lexington MA U.S.A

ISBN: (print) 9781509035267

The map-reduce parallel programming model has become extremely popular in the big data community. Many big data workloads can benefit from the enhanced performance offered by supercomputers. LLMapReduce provides the familiar map-reduce parallel programming model to big data users running on a supercomputer. LLMapReduce dramatically simplifies map-reduce programming by providing simple parallel programming capability in one line of code. LLMapReduce supports all programming languages and many schedulers. LLMapReduce can work with any application without the need to modify the application. Furthermore, LLMapReduce can overcome scaling limits in the map-reduce parallel programming model via options that allow the user to switch to the more efficient single-program-multiple-data (SPMD) parallel programming model. These features allow users to reduce the computational overhead by more than 10x compared to standard map-reduce for certain applications. LLMapReduce is widely used by hundreds of users at MIT. Currently LLMapReduce works with several schedulers such as SLURM, Grid Engine and LSF.

Keywords: parallel programming Ecosystems Big data Supercomputers Databases Arrays Computational modeling
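The contrast the abstract above draws between map-reduce and SPMD can be illustrated with a minimal sketch. This is not LLMapReduce's interface; the function names (`count_words`, `merge_counts`, `run_map_reduce`, `spmd_worker`, `run_spmd`) and the word-count task are assumptions chosen for illustration, and the workers are simulated serially rather than launched by a scheduler. In the map-reduce style one mapper call is made per input, whereas in the SPMD style each worker starts once and loops over its own share of the inputs, which is one plausible reading of where the overhead reduction comes from.

```python
from functools import reduce


def count_words(text):
    """Map step: count the words in one input chunk."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def merge_counts(a, b):
    """Reduce step: merge two partial word counts into a new dict."""
    merged = dict(a)
    for word, n in b.items():
        merged[word] = merged.get(word, 0) + n
    return merged


def run_map_reduce(chunks):
    """Map-reduce style: one mapper invocation per input chunk."""
    mapped = [count_words(chunk) for chunk in chunks]
    return reduce(merge_counts, mapped, {})


def spmd_worker(rank, num_workers, chunks):
    """SPMD style: every worker runs this same function once and
    picks its own inputs from its rank (cyclic distribution)."""
    local = {}
    for chunk in chunks[rank::num_workers]:
        local = merge_counts(local, count_words(chunk))
    return local


def run_spmd(chunks, num_workers):
    """Simulate the workers serially; a real run would start them in parallel."""
    partials = [spmd_worker(r, num_workers, chunks) for r in range(num_workers)]
    return reduce(merge_counts, partials, {})


chunks = ["a b a", "b c", "a"]
print(run_map_reduce(chunks))
print(run_spmd(chunks, num_workers=2))
```

Both styles produce the same counts; they differ only in how many times the per-input work is launched.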


Applying parallel design patterns to embarassingly parallel problem

Colossal Data Analysis and Networking (CDAN)

Authors: Nilesh Maltare Chetan Chudasama Department of Information Technology MBICT India

ISBN: (print) 9781509006700

This paper presents an experiment on mapping algorithmic structure patterns to implementation patterns. The selection of implementation patterns and data structures must take into account the parallel platform for which they are developed, as these choices also affect program performance. The experimental results support the need for adaptive patterns in parallel programming, so that software can run on different parallel environments.

Keywords: Software parallel processing Algorithm design and analysis parallel programming Data structures Concurrent computing Software algorithms


Towards automatic parallelization of sequential programs and efficient use of resources in HPC centers

International Conference on High Performance Computing & Simulation (HPCS)

Authors: Javier Corral-García Jose-Luis González-Sánchez Miguel A. Pérez-Toledano COMPUTAEX/Cenitx - Extremadura Supercomputing Technological Innovation and Research Center Computer Science Department University of Extremadura Spain

High-Performance Computing (HPC) is becoming increasingly required by scientists of all branches in order to achieve their desired research results. However, carrying out their research in an HPC center can be a difficult task when they are new to parallel programming. These users need support in the parallelization and optimization of their codes, in order to obtain reliable results as well as make efficient use of the available resources. For this purpose, a novel code analyzer for automatic parallelization of sequential codes is presented, focused on resource management of a supercomputing center, where efficient scheduling decisions and energy saving become key challenges. Thus, this paper aims to introduce the analyzer so as to demonstrate the importance of using it, especially in terms of efficiency, when running parallel codes in HPC centers.

Keywords: Schedules Message systems parallel programming Energy consumption Optimization Computers


A learner-centered computational experience in nanotechnology for undergraduate STEM students

IEEE Integrated STEM Education Conference (ISEC)

Authors: Abu Asaduzzaman Ramazan Asmatulu Wichita State University

According to recent studies, the current state of Science, Technology, Engineering, and Mathematics (STEM) education in the U.S. has not been impressive. In this paper, we introduce an interdisciplinary learner-centered computational experience in nanotechnology for undergraduate STEM students. Three important tasks associated with this work are applying power-aware data-regrouping based parallel computation to analyze nanoscale materials; updating and/or developing “hands-on computational experience in nanotechnology” courses; and assessing students' learning experience and interest in high performance computing (HPC) simulation for nanotechnology. The proposed activities have potential to improve motivation, engagement, and learning of STEM students, enhancing the Engaged Student Learning environment. The tasks described in this work incorporate many-core computing, nanomanufacturing, and energy savings, and are aimed at advancing HPC with fundamental understanding of nanostructured fiber behavior, which in turn will allow the use of effective materials for renewable energy conversion. Activities to address industry-oriented real-world problems will attract new students to STEM education, as the job market in related fields is growing.

Keywords: Education Nanoscale devices programming profession parallel processing parallel programming


A Efficient Algorithm for Molecular Dynamics Simulation on Hybrid CPU-GPU Computing Platforms

International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery

Authors: Dapu Li Wei Ai Yu Ye Jie Liang College of Information Science and Engineering Hunan University Changsha Hunan 410082 China

ISBN: (print) 9781509040940

In this article, an efficient parallel algorithm for a hybrid CPU-GPU platform is proposed to enable large-scale molecular dynamics (MD) simulations of the metal solidification process. The program implementing the parallel algorithm on the hybrid CPU-GPU platform performs better than the program based on previous algorithms running on a CPU cluster platform, and its total execution time is markedly lower. In particular, because of the modified load-balancing method, the neighbor list update time is approximately zero. The parallel program based on the CUDA+OpenMP model shows a speedup of a factor of 6 in 16-core calculations compared to the parallel program based on the MPI+OpenMP model, and the optimal computational efficiency is achieved in a simulation system of 10,000,000 aluminum atoms. Finally, a comparison of the theoretical and experimental results shows good agreement, which verifies the correctness of the algorithm.

Keywords: Graphics processing units Computational modeling parallel algorithms Mathematical model Runtime Load modeling Heuristic algorithms efficient algorithm parallel programming Molecular Dynamics Simulation Platform Simulation Systems execution time


Maximally permissive deadlock avoidance for resource allocation systems with R/W-locks

DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS, 2015, vol. 25, no. 1-2, pp. 31-63

Authors: Nazeem, Ahmed Reveliotis, Spyros United Airlines Chicago IL USA Georgia Inst Technol Sch Ind Syst Engn Atlanta GA 30332 USA

This paper extends the existing theory on maximally permissive liveness-enforcing supervision of resource allocation systems (RAS) so that it can handle RAS with reader / writer (R/W-) locks. A key challenge that is posed by this new RAS class stems from the fact that the underlying state space is not necessarily finite. We effectively address this obstacle by taking advantage of special structure that exists in the set of inadmissible states and enables a finite representation of this set through its minimal elements.

Keywords: Reader/writer locks Deadlock parallel programming Supervisory control Discrete event systems Right-closed sets


The Suzaku Pattern Programming Framework

IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW)

Authors: Barry Wilkinson Clayton Ferner University of North Carolina Charlotte Charlotte NC USA University of North Carolina Wilmington Wilmington NC USA

ISBN: (print) 9781509036837

Suzaku is a pattern programming framework that enables programmers to create pattern-based parallel MPI programs without writing the MPI message-passing code implicit in the patterns. The purpose of this framework is to simplify message-passing programming and create better structured programs based upon established parallel design patterns. The focus for developing Suzaku is on teaching parallel programming. This paper covers the main features of Suzaku and describes our experiences using it in parallel programming classes.

Keywords: parallel programming Master-slave Arrays Message systems Distributed computing Computers


Transactional memory support in the IBM POWER8 processor

IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2015, vol. 59, no. 1, pp. 8:1-8:14

Authors: Le, H. Q. Guthrie, G. L. Williams, D. E. Michael, M. M. Frey, B. G. Starke, W. J. May, C. Odaira, R. Nakaike, T. IBM Syst & Technol Grp Austin TX 78758 USA IBM Res Div Thomas J Watson Res Ctr Yorktown Hts NY 10598 USA IBM Res Tokyo IBM Res Div Koto Ku Tokyo 1350061 Japan

With multi-core processors, parallel programming has taken on greater importance. Traditional parallel programming techniques based on critical sections controlled by locking have several well-known drawbacks. To allow for more efficient parallel programming with higher performance, the IBM POWER8 (TM) processor implements a hardware transactional memory facility. Transactional memory allows groups of load and store operations to execute and commit as a single atomic unit without the use of traditional locks, thereby improving performance and simplifying the parallel programming model. The POWER8 transactional memory facility provides a robust capability to execute transactions that can survive interrupts. It also allows non-speculative accesses within transactions, which facilitates debugging and thread-level speculation. Unique challenges caused by implementing transactional memory on top of the Power ISA (Instruction Set Architecture) weakly consistent memory model are addressed. We detail the Power ISA transactional memory architecture, the POWER8 implementation of this architecture, and two practical uses of this architecture, Transactional Lock Elision (TLE) and Thread-Level Speculation (TLS), and provide performance results for these uses.

Keywords: Instruction sets Memory architecture Multicore processing parallel programming Program processors


Incremental closeness centrality in distributed memory

PARALLEL COMPUTING, 2015, vol. 47, pp. 3-18

Authors: Sariyuece, Ahmet Erdem Saule, Erik Kaya, Kamer Catalyuerek, Uemit V. Ohio State Univ Dept Biomed Informat Columbus OH 43210 USA Ohio State Univ Dept Comp Sci & Engn Columbus OH 43210 USA Ohio State Univ Dept Elect & Comp Engn Columbus OH 43210 USA Univ North Carolina Charlotte Dept Comp Sci Charlotte NC USA Sabanci Univ Dept Comp Sci & Engn Istanbul Turkey

Networks are commonly used to model traffic patterns, social interactions, or web pages. The vertices in a network do not possess the same characteristics: some vertices are naturally more connected and some vertices can be more important. Closeness centrality (CC) is a global metric that quantifies how important a given vertex is in the network. When the network is dynamic and keeps changing, the relative importance of the vertices also changes. The best known algorithm to compute the CC scores makes it impractical to recompute them from scratch after each modification. In this paper, we propose STREAMER, a distributed memory framework for incrementally maintaining the closeness centrality scores of a network upon changes. It leverages pipelined, replicated parallelism, and SpMM-based BFSs, and it takes NUMA effects into account. It makes maintaining the closeness centrality values of real-life networks with millions of interactions significantly faster and obtains almost linear speedups on a 64-node cluster with 8 threads per node. (C) 2015 Elsevier B.V. All rights reserved.

Keywords: Closeness centrality Incremental centrality BPS parallel programming Cluster computing
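To make the metric in the abstract above concrete, here is a minimal sketch of the from-scratch computation that incremental approaches aim to avoid. It is not the STREAMER framework: the function names (`bfs_distances`, `closeness_centrality`), the adjacency-dict representation, and the choice of definition (the reciprocal of the summed shortest-path distances to reachable vertices, on an unweighted graph) are assumptions made for illustration, and other normalizations of closeness centrality exist.

```python
from collections import deque


def bfs_distances(adj, source):
    """Breadth-first search: hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj):
    """One BFS per vertex; score is 1 / (sum of distances), or 0.0 for an isolated vertex."""
    scores = {}
    for v in adj:
        total = sum(bfs_distances(adj, v).values())
        scores[v] = 1.0 / total if total > 0 else 0.0
    return scores


# A path graph 0 - 1 - 2: the middle vertex is closest to the others.
path = {0: [1], 1: [0, 2], 2: [1]}
print(closeness_centrality(path))
```

Because every vertex needs its own BFS, a single edge insertion or deletion would force a full recomputation under this naive scheme, which is the cost the abstract describes as impractical.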


Student research poster: A scalable general purpose system for large-scale graph processing

International Conference on Parallel Architecture and Compilation Techniques (PACT)

Authors: Jiawen Sun Hans Vandierendonck Dimitrios S. Nikolopoulos Queen's University Belfast Belfast Queen's University Belfast University Road UK BT7 1LR

Graph analytics is an important and computationally demanding class of data analytics. It is essential to balance scalability, ease-of-use and high performance in large scale graph analytics. As such, it is necessary ...

ISBN: (print) 9781509053087

Keywords: Polymers parallel processing Optimization Scalability parallel programming Sun Roads

