Search results - Inner Mongolia University Library

2nd Int. Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, PMBS'11, Held as Part of the 24th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC'11

Authors: Mudalige, G.R.; Giles, M.B.; Bertolli, C.; Kelly, P.H.J.

Affiliations: Oxford E-Research Centre, University of Oxford, United Kingdom; Dept. of Computing, Imperial College London, United Kingdom

ISBN: (print) 9781450311021

OP2 is an "active" library framework for the development and solution of unstructured mesh based applications. It aims to decouple the scientific specification of an application from its parallel implementation to achieve code longevity and near-optimal performance through re-targeting the back-end to different multi-core/many-core hardware. This paper presents a summary of a predictive performance analysis and benchmarking study of OP2 on heterogeneous cluster systems. In this work, an industrially representative CFD application written using the OP2 framework is benchmarked on unstructured meshes of 1.5M and 26M edges. Benchmark systems include a large-scale Cray XE6 system and an Intel Westmere/InfiniBand cluster. Performance modeling is then used to predict the application's performance on an NVIDIA Tesla C2070 based GPU cluster, enabling the comparison of OP2's performance capabilities on emerging distributed memory heterogeneous systems. Results illustrate the performance benefits that can be gained through many-core solutions both on single-node and heterogeneous configurations in comparison to traditional homogeneous cluster systems for this class of applications.

Keywords: Graphics processing unit

Realistic simulations of strongly correlated systems

Workshop on Computational Methods in Science and Engineering, SimLabs@KIT 2010

Authors: Dolfen, Andreas; Koch, Erik

Affiliations: German Research School for Simulation Sciences, FZ-Jülich, RWTH Aachen University, 52425 Jülich, Germany

ISBN: (print) 9783866446939

The physics of strongly correlated materials poses one of the most challenging problems in the condensed-matter sciences. Standard approximations applicable to wide classes of materials, such as the local density approximation, fail due to the importance of the Coulomb repulsion between localized electrons. Instead, we resort to non-perturbative many-body methods. The calculations are, however, only feasible for rather small model systems. The full Hamiltonian of a real material is approximated by a model Hamiltonian comprising only the most important electronic degrees of freedom, while the effect of all other electrons is included in an average way by renormalizing the parameters. Realistic calculations of strongly correlated materials need to include sufficiently many of these electronic degrees of freedom. The new generation of massively parallel supercomputers allows for these realistic calculations. However, exploiting their computational power requires newly devised algorithms. As a solver we use the Lanczos method, which needs the full many-body state of the correlated system. It is thus limited by the available main memory. The foremost problem for a distributed-memory implementation is that multiplying the many-body state by the Hamiltonian leads to highly non-local memory access patterns. Our solution to this problem relies on the efficient implementation of MPI collective communication on these systems. We show that the new algorithm scales extremely well on JUGENE, Jülich's Blue Gene/P. The concept underlying this massively parallel implementation is not restricted to correlated electrons but can also be used in simulating quantum spin systems. Moreover, it can also be extended to exploit further levels of parallelization as provided, for instance, by non-conventional processing units such as the Cell Broadband Engine.

Keywords: Hamiltonians
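
The Lanczos method named in the abstract above can be illustrated with a minimal single-process sketch. It assumes a small dense NumPy matrix as a stand-in Hamiltonian; the function names `chain_hamiltonian` and `lanczos_lowest_eigenvalue` and the 10-site tight-binding chain are hypothetical choices for illustration and are not taken from the paper, whose implementation is distributed over MPI.

```python
import numpy as np


def chain_hamiltonian(num_sites):
    """Toy symmetric matrix: nearest-neighbour hopping of strength -1 on an open chain."""
    h = np.zeros((num_sites, num_sites))
    for i in range(num_sites - 1):
        h[i, i + 1] = h[i + 1, i] = -1.0
    return h


def lanczos_lowest_eigenvalue(apply_h, dim, num_steps, seed=0):
    """Estimate the lowest eigenvalue of a symmetric operator with plain Lanczos.

    apply_h maps a full state vector to H @ vector. Only a few full-length
    vectors are held at once, which is why the size of the state vector
    bounds the problem size.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    v_prev = np.zeros(dim)
    alphas, betas = [], []
    beta = 0.0
    for _ in range(num_steps):
        w = apply_h(v) - beta * v_prev   # three-term recurrence
        alpha = float(np.dot(v, w))
        w -= alpha * v
        alphas.append(alpha)
        beta = float(np.linalg.norm(w))
        if beta < 1e-12:                 # Krylov space exhausted
            break
        betas.append(beta)
        v_prev, v = v, w / beta
    k = len(alphas)
    off_diagonal = betas[:k - 1]
    # Small tridiagonal matrix whose lowest eigenvalue approximates that of H.
    t = np.diag(alphas) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)
    return float(np.linalg.eigvalsh(t)[0])


h = chain_hamiltonian(10)
estimate = lanczos_lowest_eigenvalue(lambda state: h @ state, dim=10, num_steps=10)
exact = float(np.linalg.eigvalsh(h)[0])
```

In this toy setting `estimate` can be compared against `exact` from a dense diagonalization. In a distributed setting, the call to `apply_h` is the step the abstract identifies as producing non-local memory access.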

Granular security for a science gateway in structural bioinformatics

3rd International Workshop on Science Gateways for Life Sciences 2011, IWSG-Life 2011

Authors: Gesing, Sandra; Grunzke, Richard; Balaskó, Ákos; Birkenheuer, Georg; Blunk, Dirk; Breuers, Sebastian; Brinkmann, André; Fels, Gregor; Herres-Pawlis, Sonja; Kacsuk, Peter; Kozlovszky, Miklos; Krüger, Jens; Packschies, Lars; Schäfer, Patrick; Schuller, Bernd; Schuster, Johannes; Steinke, Thomas; Szikszay Fabri, Anna; Wewior, Martin; Müller-Pfefferkorn, Ralph; Kohlbacher, Oliver

Affiliations: Zentrum für Bioinformatik, Eberhard-Karls-Universität Tübingen, Germany; Zentrum für Informationsdienste und Hochleistungsrechnen, Technische Universität Dresden, Germany; MTA SZTAKI Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary; Paderborn Center for Parallel Computing, Universität Paderborn, Germany; Department für Chemie, Universität zu Köln, Germany; Department Chemie, Universität Paderborn, Germany; Fakultät Chemie, TU Dortmund, Germany; Regionales Rechenzentrum, Universität zu Köln, Germany; Konrad-Zuse-Institut für Informationstechnik Berlin, Germany; Forschungszentrum Jülich, Germany

Structural bioinformatics is concerned with computational methods for the analysis and modeling of three-dimensional molecular structures. There is a plethora of computational tools available to work with structural data on a large scale. Using these tools on distributed computing infrastructures (DCI), however, is often hampered by a lack of suitable interfaces. The MoSGrid (Molecular Simulation Grid) science gateway provides an intuitive user interface to several widely used tools in structural bioinformatics. It ensures the confidentiality, integrity and availability of data via a granular security concept which covers all layers of the infrastructure. The concept applies SAML (Security Assertion Markup Language) and allows trust delegation from the user interface layer across the high-level middleware layer and the grid middleware layer down to the HPC facilities. SAML assertions had to be integrated into the MoSGrid infrastructure in several places: the workflow-enabled grid portal WS-PGRADE, the gUSE (grid User Support Environment) DCI services, and the cloud file system XtreemFS. The security infrastructure presented here allows single sign-on and thus lowers the hurdle for users to utilize large HPC infrastructures for structural bioinformatics. Copyright © 2011 for the individual papers by the papers' authors.

Keywords: User interfaces

Proceedings of the 16th International Workshop on High-Level Parallel Programming Models and Supportive Environments

IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, 2011, pp. 1139-1140

Authors: Hoefler, Torsten; Eigenmann, Rudolf; Gerndt, Michael; Müller, Frank; Rasmussen, Craig; Schulz, Martin; Alam, Sadaf; Balaji, Pavan; Barrett, Richard; Bode, Brett; Bosilca, George; Bronevetsky, Greg; Danalis, Anthony; De Supinski, Bronis; Ding, Chen; Fahringer, Thomas; Ishikawa, Yutaka; Knüpfer, Andreas; Mohr, Bernd; Scholz, Sven-Bodo; Skjellum, Tony; Snir, Marc; Tillier, Fabian; Träff, Jesper Larsson; Willcock, Jeremiah; Wolf, Felix

Affiliations: Blue Waters Directorate, NCSA, University of Illinois, Urbana-Champaign, United States; Purdue University, United States; Technische Universität München, Germany; North Carolina State University, United States; Los Alamos National Laboratory, United States; Lawrence Livermore National Laboratory, United States; Swiss National Supercomputing Centre, Switzerland; Argonne National Laboratory, United States; Sandia National Laboratories, United States; National Center for Supercomputing Applications, United States; University of Tennessee, Knoxville, United States; University of Rochester, United States; University of Innsbruck, Austria; University of Tokyo, Japan; Technische Universität Dresden, Germany; Forschungszentrum Jülich, Germany; University of Hertfordshire, United Kingdom; University of Alabama, Birmingham, United States; University of Illinois, Urbana-Champaign, United States; Microsoft, United States; University of Vienna, Austria; Indiana University, United States; German Research School for Simulation Sciences, Germany

Reversible parallel discrete-event execution of large-scale epidemic outbreak models

24th Annual Workshop on Principles of Advanced and Distributed Simulation, PADS 2010

Authors: Perumalla, Kalyan S.; Seal, Sudip K.

Affiliations: Oak Ridge National Laboratory, Oak Ridge, TN, United States

ISBN: (print) 9781424472918

The spatial scale, runtime speed and behavioral detail of epidemic outbreak simulations together require the use of large-scale parallel processing. In this paper, an optimistic parallel discrete event execution of a reaction-diffusion simulation model of epidemic outbreaks is presented, with an implementation over the μsik simulator. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the simulation are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundred million individuals in the largest case) are exercised. © 2010 IEEE.

Keywords: Population statistics
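
The combination of reverse computation and incremental state saving described in the abstract above can be sketched in a few lines. This is a toy illustration only: the class `InfectionCell`, its events and its fields are hypothetical and do not reproduce the paper's model or the μsik API. It assumes that constructive updates are undone by inverse arithmetic, while destructive updates log the overwritten value.

```python
class InfectionCell:
    """Toy state for one spatial cell: susceptible and infected counts."""

    def __init__(self, susceptible, infected):
        self.susceptible = susceptible
        self.infected = infected
        self.saved = []  # incremental state-saving log for destructive updates

    def infect(self, count):
        """Forward event: constructive update, invertible by arithmetic alone."""
        self.susceptible -= count
        self.infected += count

    def undo_infect(self, count):
        """Reverse event: applies the inverse arithmetic, no saved state needed."""
        self.susceptible += count
        self.infected -= count

    def quarantine(self):
        """Forward event: destructive update, so the old value is logged first."""
        self.saved.append(self.infected)
        self.infected = 0

    def undo_quarantine(self):
        """Reverse event: restores the overwritten value from the log."""
        self.infected = self.saved.pop()


cell = InfectionCell(susceptible=100, infected=5)
cell.infect(10)
cell.quarantine()
state_after_events = (cell.susceptible, cell.infected)

# A rollback undoes the events in reverse order of execution.
cell.undo_quarantine()
cell.undo_infect(10)
state_after_rollback = (cell.susceptible, cell.infected)
```

Only the destructive `quarantine` event consumes memory for rollback, which is the motivation for keeping state saving to a small amount.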

Exploring multi-grained parallelism in compute-intensive DEVS simulations

24th Annual Workshop on Principles of Advanced and Distributed Simulation, PADS 2010

Authors: Liu, Qi; Wainer, Gabriel

Affiliations: Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada

ISBN: (print) 9781424472918

We propose a computing technique for efficient parallel simulation of compute-intensive DEVS models on the IBM Cell processor, combining multi-grained parallelism and various optimizations to speed up the event execution. Unlike most existing parallelization strategies, our approach explicitly exploits the massive fine-grained event-level parallelism inherent in the simulation process, while most of the logical processes are virtualized, making the achievable parallelism more deterministic and predictable. Together, the parallelization and optimization strategies produced promising experimental results, accelerating the simulation of a 3D environmental model by a factor of up to 33.06. The proposed methods can also be applied to other multicore and shared-memory architectures. © 2010 IEEE.

Keywords: Parallel processing systems

2010 IEEE Workshop on Principles of Advanced and Distributed Simulation, PADS 2010

24th Annual Workshop on Principles of Advanced and Distributed Simulation, PADS 2010

ISBN: (print) 9781424472918

The proceedings contain 18 papers. The topics discussed include: new approaches to protein functional inference and ligand screening with application to the human kinome; federate fault tolerance in HLA-based simulation; continuous matching algorithm for interest management in distributed virtual environments; a methodology to predict the performance of distributed simulations; optimizing a business process model by using simulation; selecting simulation algorithm portfolios by genetic algorithms; on validation of semantic composability in data-driven simulation; integrative models of the hepatitis C virus infection: modeling wicked problems; functional level hardware simulation with pull-model data flow; flow: a stream processing system simulator; reversible parallel discrete-event execution of large-scale epidemic outbreak models; validation of radio channel models using an anechoic chamber; and explicit spatial scattering for load balancing in conservatively synchronized parallel discrete-event simulations.

P-GAS: parallelizing a cycle-accurate event-driven many-core processor simulator using parallel discrete event simulation

24th Annual Workshop on Principles of Advanced and Distributed Simulation, PADS 2010

Authors: Lv, Huiwei; Cheng, Yuan; Bai, Lu; Chen, Mingyu; Fan, Dongrui; Sun, Ninghui

Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China; Graduate School, Chinese Academy of Sciences, China

ISBN: (print) 9781424472918

Multi-core processors are commonly available now, but most traditional computer architectural simulators still use single-thread execution. In this paper we use parallel discrete event simulation (PDES) to speed up a cycle-accurate event-driven many-core processor simulator. Evaluation against the sequential version shows that the parallelized one achieves an average speedup of 10.9× (up to 13.6×) running SPLASH-2 kernels on a 16-core host machine, with cycle counter differences of less than 0.1%. Moreover, super-linear speedups are achieved between running 1 thread and 8 threads due to the reduced overhead of inserting events into the queue and the increased cache size in parallel processing. We conclude that PDES could be an attractive option for achieving fast cycle-accurate many-core processor simulations. © 2010 IEEE.

Keywords: Simulators

Parallel particle-based reaction diffusion: A GPU implementation

9th International Workshop on Parallel and Distributed Methods in Verification, PDMC 2010 - Joint with the 2nd International Workshop on High-Performance Computational Systems Biology, HiBi 2010

Authors: Dematté, Lorenzo

Affiliations: Microsoft Research University of Trento Center for Computational and Systems Biology, Trento, Italy

ISBN: (print) 9780769542652

Space is a very important aspect in the simulation of biochemical models; recently, the need for simulation algorithms able to cope with space has become more and more compelling. Complex and large models of biochemical systems need to deal with the movement of single molecules and particles, taking into consideration localised fluctuations, transportation phenomena and diffusion. A common drawback of spatial models lies in their complexity: models could become very large, and their simulation could be time consuming, especially if we want to capture the system's behaviour in a reliable way using stochastic methods in conjunction with a high spatial resolution. In order to deliver on the promise of systems biology to understand a system as a whole, we need to move from sequential to parallel simulation algorithms. In this paper we analyse Smoldyn, a widely used algorithm for stochastic simulation of chemical reactions with spatial resolution and single molecule detail, and we propose an alternative, innovative implementation that exploits the parallelism of GPUs. The implementation offers good speedups (up to 130x) and real time, high quality graphics output at almost no performance penalty. © 2010 IEEE.

Keywords: Graphics processing unit

Explicit spatial scattering for load balancing in conservatively synchronized parallel discrete-event simulations

24th Annual Workshop on Principles of Advanced and Distributed Simulation, PADS 2010

Authors: Thulasidasan, Sunil; Kasiviswanathan, Shiva Prasad; Eidenbenz, Stephan; Romero, Phillip

Affiliations: Los Alamos National Laboratory, United States

ISBN: (print) 9781424472918

We re-examine the problem of load balancing in conservatively synchronized parallel, discrete-event simulations executed on high-performance computing clusters, focusing on simulations where computational and messaging load tend to be spatially clustered. Such domains are frequently characterized by the presence of geographic "hot-spots" - regions that generate significantly more simulation events than others. Examples of such domains include simulation of urban regions, transportation networks and networks where interaction between entities is often constrained by physical proximity. Noting that in conservatively synchronized parallel simulations, the speed of execution of the simulation is determined by the slowest (i.e., most heavily loaded) simulation process, we study different partitioning strategies for achieving equitable processor-load distribution in domains with spatially clustered load. In particular, we study the effectiveness of partitioning via spatial scattering to achieve optimal load balance. In this partitioning technique, nearby entities are explicitly assigned to different processors, thereby scattering the load across the cluster. This is motivated by two observations, namely, (i) since load is spatially clustered, spatial scattering should, intuitively, spread the load across the compute cluster, and (ii) in parallel simulations, equitable distribution of CPU load is a greater determinant of execution speed than message passing overhead. Through large-scale simulation experiments - both of abstracted and real simulation models - we observe that scatter partitioning, even with its greatly increased messaging overhead, significantly outperforms more conventional spatial partitioning techniques that seek to reduce messaging overhead. Further, even if hot-spots change over the course of the simulation, if the underlying feature of spatial clustering is retained, load continues to be balanced with spatial scattering, leading us to the observation that

Keywords: Message passing
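
The contrast between conventional block partitioning and scatter partitioning described in the abstract above can be made concrete with a small sketch. Everything here is an assumed toy setup, not the paper's experiment: an 8x8 grid of cells, a 4x4 hot-spot whose cells carry ten times the load, four processors, and hypothetical helper names. Messaging overhead, which the abstract notes is greatly increased under scattering, is deliberately not modelled.

```python
GRID_SIZE = 8
NUM_PROCS = 4

# Per-cell event load: a 4x4 hot-spot in one corner is ten times busier.
load = {
    (x, y): (10 if x < 4 and y < 4 else 1)
    for x in range(GRID_SIZE)
    for y in range(GRID_SIZE)
}


def block_assign(x, y):
    """Conventional spatial partitioning: each processor owns one 4x4 quadrant."""
    return (x // 4) + 2 * (y // 4)


def scatter_assign(x, y):
    """Scatter partitioning: neighbouring cells go to different processors."""
    return (x % 2) + 2 * (y % 2)


def max_processor_load(assign):
    """Load of the busiest processor, which bounds a conservative simulation's speed."""
    totals = [0] * NUM_PROCS
    for (x, y), cell_load in load.items():
        totals[assign(x, y)] += cell_load
    return max(totals)


block_max = max_processor_load(block_assign)
scatter_max = max_processor_load(scatter_assign)
```

With these assumed numbers the whole hot-spot lands on a single processor under block partitioning, while scattering gives every processor an equal share of hot and cold cells.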
