Search Results - Inner Mongolia University Library

41st International Conference on Parallel Processing, ICPP 2012

Authors: Pilla, Laércio L.; Ribeiro, Christiane Pousa; Cordeiro, Daniel; Mei, Chao; Bhatele, Abhinav; Navaux, Philippe O.A.; Broquedis, François; Méhaut, Jean-François; Kale, Laxmikant V. Affiliations: Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, Brazil; LIG Laboratory, CEA/INRIA, Grenoble University, Grenoble, France; Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, United States; Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, United States

ISBN: (print) 9780769547961

Multi-core compute nodes with non-uniform memory access (NUMA) are now a common architecture in the assembly of large-scale parallel machines. On these machines, in addition to the network communication costs, the memory access costs within a compute node are also asymmetric. Ignoring this can lead to an increase in the data movement costs. Therefore, to fully exploit the potential of these nodes and reduce data access costs, it becomes crucial to have a complete view of the machine topology (i.e. the compute node topology and the interconnection network among the nodes). Furthermore, the parallel application behavior has an important role in determining how to utilize the machine efficiently. In this paper, we propose a hierarchical load balancing approach to improve the performance of applications on parallel multi-core systems. We introduce NucoLB, a topology-aware load balancer that focuses on redistributing work while reducing communication costs among and within compute nodes. NucoLB takes the asymmetric memory access costs present on NUMA multi-core compute nodes, the interconnection network overheads, and the application communication patterns into account in its balancing decisions. We have implemented NucoLB using the Charm++ parallel runtime system and evaluated its performance. Results show that our load balancer improves performance up to 20% when compared to state-of-the-art load balancers on three different NUMA parallel machines. © 2012 IEEE.

Keywords: Topology

Heterogeneous Task Scheduling for Accelerated OpenMP

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Thomas R.W. Scogland; Barry Rountree; Wu-chun Feng; Bronis R. de Supinski. Affiliations: Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA; Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA

Heterogeneous systems with CPUs and computational accelerators such as GPUs, FPGAs or the upcoming Intel MIC are becoming mainstream. In these systems, peak performance includes the performance of not just the CPUs but also all available accelerators. In spite of this fact, the majority of programming models for heterogeneous computing focus on only one of these. With the development of Accelerated OpenMP for GPUs, both from PGI and Cray, we have a clear path to extend traditional OpenMP applications incrementally to use GPUs. The extensions are geared toward switching from CPU parallelism to GPU parallelism. However, they do not preserve the former while adding the latter. Thus computational potential is wasted since either the CPU cores or the GPU cores are left idle. Our goal is to create a runtime system that can intelligently divide an accelerated OpenMP region across all available resources automatically. This paper presents our proof-of-concept runtime system for dynamic task scheduling across CPUs and GPUs. Further, we motivate the addition of this system into the proposed OpenMP for Accelerators standard. Finally, we show that this option can produce as much as a two-fold performance improvement over using either the CPU or GPU alone.

Keywords: Graphics processing unit; Dynamic scheduling; Integrated circuits; Acceleration; Schedules; Programming; Runtime

Ecotoxicology data federation with SADI semantic web services

5th International Workshop on Semantic Web Applications and Tools for Life Sciences, SWAT4LS 2012

Authors: Riazanov, Alexandre; Hindle, Matthew M.; Goudreau, E. Scott; Martyniuk, Christopher J.; Baker, Christopher J. O. Affiliations: Department of Computer Science and Applied Statistics, University of New Brunswick, Saint John, NB, Canada; Canadian Rivers Institute, Department of Biology, University of New Brunswick, Saint John, NB, Canada; Canada Research in Molecular Ecology, Edinburgh University, United Kingdom; SynthSys, Edinburgh University, United Kingdom; School of Informatics, Edinburgh University, United Kingdom; IPSNP Computing Inc., Canada

Biologists and biotechnologists need to draw information from numerous distributed and heterogeneous resources, such as online biomedical databases, nomenclatures and specialised bioinformatics tools. These tasks can benefit significantly from semantic data federation with SADI Semantic Web services where multiple resources exposed through SADI services are accessed as a single virtual SPARQL-queriable database. We provide evidence in support of this premise by creating and testing a kit of public SADI services for a number of bioinformatics databases and programs, and by demonstrating how it can be used to serve real information needs of ecotoxicology researchers, by using the services to answer some model queries.

Keywords: Query processing

DI-MMAP: A High Performance Memory-Map Runtime for Data-Intensive Applications

SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)

Authors: Brian Van Essen; Henry Hsieh; Sasha Ames; Maya Gokhale. Affiliations: Lawrence Livermore National Laboratory, Livermore, CA, US; Department of Computer Science, University of California, Los Angeles, USA; Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA

ISBN: (print) 9781467362184

We present DI-MMAP, a high-performance runtime that memory-maps large external data sets into an application's address space and shows significantly better performance than the Linux mmap system call. Our implementation is particularly effective when used with high performance locally attached Flash arrays on highly concurrent, latency-tolerant data-intensive HPC applications. We describe the kernel module and show performance results on a benchmark test suite and on a new bioinformatics metagenomic classification application. For the complex metagenomics classification application, DI-MMAP performs up to 4.88× better than standard Linux mmap.

Keywords: Runtime; Linux; Benchmark testing; Standards; Databases; Random access memory; Instruction sets

Abstract: Exploring Performance Data with Boxfish

SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)

Authors: Katherine E. Isaacs; Aaditya G. Landge; Todd Gamblin; Peer-Timo Bremer; Valerio Pascucci; Bernd Hamann. Affiliations: Institute for Data Analysis and Visualization, Department of Computer Science, University of California, Davis, CA, USA; Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT, USA; Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA

The growth in size and complexity of scaling applications and the systems on which they run poses challenges in analyzing and improving their overall performance. With metrics coming from thousands or millions of processes, visualization techniques are necessary to make sense of the increasing amount of data. To aid the process of exploration and understanding, we announce the initial release of Boxfish, an extensible tool for manipulating and visualizing data pertaining to application behavior. Combining and visually presenting data and knowledge from multiple domains, such as the application's communication patterns and the hardware's network configuration and routing policies, can yield the insight necessary to discover the underlying causes of observed behavior. Boxfish allows users to query, filter and project data across these domains to create interactive, linked visualizations.

Lecture Notes in Computational Science and Engineering: Preface

Lecture Notes in Computational Science and Engineering, 2012, vol. 87 LNCSE, pp. v-vii

Authors: Forth, Shaun; Hovland, Paul; Phipps, Eric; Utke, Jean; Walther, Andrea. Affiliations: Applied Mathematics and Scientific Computing, Cranfield University, Shrivenham, Swindon, United Kingdom; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, United States; Sandia National Laboratory, Albuquerque, NM, United States; Department of Mathematics, University of Paderborn, Paderborn, Germany

Modeling the Performance of an Algebraic Multigrid Cycle Using Hybrid MPI/OpenMP

International Conference on Parallel Processing (ICPP)

Authors: Hormozd Gahvari; William Gropp; Kirk E. Jordan; Martin Schulz; Ulrike Meier Yang. Affiliations: Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA; Computational Science Center, IBM Thomas J. Watson Research Center, Cambridge, MA, USA; Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA

The rise of multicore cluster architectures has led to intense interest in using a combination of MPI and OpenMP to more effectively program these machines. We present a performance model for hybrid implementation of the solve cycle of algebraic multigrid (AMG), a popular iterative solver for large sparse linear systems and a key component of many scientific simulations. We validate the model on two leading parallel platforms, and discuss implications for applications programmed in a hybrid model on future machines.

Keywords: Bandwidth; Multicore processing; Message systems; Interpolation; Computational modeling; Instruction sets

Performance Modeling of Algebraic Multigrid on Blue Gene/Q: Lessons Learned

SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)

Authors: Hormozd Gahvari; William Gropp; Kirk E. Jordan; Martin Schulz; Ulrike Meier Yang. Affiliations: Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA; IBM Thomas J. Watson Research Center, Cambridge, MA, USA; Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA

ISBN: (print) 9781467362184

The IBM Blue Gene/Q represents a large step in the evolution of massively parallel machines. It features 16-core compute nodes, with additional parallelism in the form of four simultaneous hardware threads per core, connected together by a five-dimensional torus network. Machines are being built with core counts in the hundreds of thousands, with the largest, Sequoia, featuring over 1.5 million cores. In this paper, we develop a performance model for the solve cycle of algebraic multigrid on Blue Gene/Q to help us understand the issues this popular linear solver for large, sparse linear systems faces on this architecture. We validate the model on a Blue Gene/Q at IBM, and conclude with a discussion of the implications of our results.

Keywords: PERFORMANCE MODELING core interacting boson model IBM parallel machines Algebra color blue lessons learned

Faculty development in the EU ERAMIS project

IEEE Education Engineering (EDUCON)

Authors: Agathe Merceron; Jean-Michel Adam; Sergio Luján-Mora; Marek Miłosz; Arto Toppinen. Affiliations: Media Informatics Department, Beuth University of Applied Sciences, Berlin, Germany; Laboratory of Informatics of Grenoble, Pierre Mendès-France University, Grenoble, France; Department of Software and Computing Systems, University of Alicante, Alicante, Spain; Institute of Computer Science, Lublin University of Technology, Lublin, Poland; School of Engineering and Technology, Savonia University of Applied Sciences, Savonia, Finland

The aim of the ERAMIS project is to create a Master's degree “Computer as a Second Competence” in 9 beneficiary universities of Kazakhstan, Kyrgyzstan and Russia. This contribution presents how faculty development is ...

Keywords: Educational institutions; Training; Computer science; Europe; Informatics; Materials

Lessons learned from academic teachers training in TEMPUS ERAMIS project

International Conference on Interactive Collaborative Learning (ICL)

Authors: Marek Miłosz; Jean-Michel Adam; Sergio Luján-Mora; Agathe Merceron. Affiliations: Institute of Computer Science, Lublin University of Technology, Lublin, Poland; Laboratory of Informatics of Grenoble (LIG), Pierre Mendès-France University, Grenoble, France; Department of Software and Computing Systems, University of Alicante, Alicante, Spain; Media Informatics Department, Beuth University of Applied Sciences, Berlin, Germany

ISBN: (print) 9781467324250

The aim of the ERAMIS project is to set up a network of Master's degrees “Informatics as a Second Competence” among 9 beneficiary universities of Kazakhstan, Kyrgyzstan and Russia, and 5 European universities. Training the beneficiary academic teachers is one of the important tasks in this project. This contribution presents the implementation of the training and the lessons learned from those activities, especially from the trainers' point of view. This experience can be useful for similar kinds of training.

Keywords: Educational institutions; Training; Europe; Organizations; Computer science; Laboratories
