Search results - Inner Mongolia University Library

Strip Partitioning for Ant Colony Parallel and Distributed Discrete-event Simulation

Procedia Computer Science, 2015, vol. 51, pp. 483-492

Authors: Francisco Borges, Albert Gutierrez-Milla, Remo Suppi, Emilio Luque. Department of Computer Architecture & Operating Systems, Universitat Autonoma de Barcelona, Bellaterra, 08193 Barcelona, Spain

Data partitioning is one of the main problems in parallel and distributed simulation. Distribution of data over the architecture directly influences the efficiency of the simulation. The partitioning strategy becomes a complex problem because it depends on several factors. In an Individual-oriented Model, for example, the partitioning is related to interactions between the individual and the environment. Therefore, parallel and distributed simulation should allow the partitioning strategy to be interchanged dynamically, so that the most appropriate strategy can be chosen for a specific context. In this paper, we propose a strip partitioning strategy for a spatially dependent problem in Individual-oriented Model applications. This strategy avoids sharing resources and, as a result, decreases communication volume among the processes. In addition, we develop an objective function that calculates the best partitioning for a specific configuration and gives the computing cost of each partition, allowing for a computing balance through a mapping policy. The results obtained are supported by statistical analysis and experimentation with an Ant Colony application. As a main contribution, we developed a solution where the partitioning strategy can be chosen dynamically and always returns the lowest total execution time.

Keywords: Parallel and distributed simulation; Parallel discrete-event simulation; High performance distributed simulation; Strip Partitioning; Individual-oriented Model
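As a rough illustration of the strip partitioning idea in the abstract above, the sketch below splits a 2D grid into vertical strips, assigns each individual to the strip that contains its x coordinate, and reports a per-strip computing cost. The function names, the choice of vertical strips, and the cost measure (number of individuals per strip) are assumptions made for illustration only; the abstract does not give the paper's actual objective function.

```python
from collections import Counter

def strip_bounds(width, n_strips):
    """Return (start, end) column ranges, end exclusive, for n_strips vertical strips."""
    base, extra = divmod(width, n_strips)
    bounds, start = [], 0
    for i in range(n_strips):
        # The first `extra` strips get one additional column each.
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds

def strip_of(x, bounds):
    """Index of the strip whose column range contains x."""
    for i, (start, end) in enumerate(bounds):
        if start <= x < end:
            return i
    raise ValueError(f"x={x} is outside the grid")

def strip_costs(positions, bounds):
    """Hypothetical cost model: number of individuals located in each strip."""
    counts = Counter(strip_of(x, bounds) for x, _y in positions)
    return [counts.get(i, 0) for i in range(len(bounds))]

# Example: a grid 10 columns wide, split into 3 strips, with 4 individuals.
bounds = strip_bounds(width=10, n_strips=3)
ants = [(0, 5), (3, 1), (4, 4), (9, 9)]
print(bounds)
print(strip_costs(ants, bounds))
```

A mapping policy could then use the per-strip costs to decide which process receives which strip; that step is left out here because the abstract does not describe it.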

Improving Communication Patterns for Distributed Cluster-based Individual-oriented Fish School Simulations

Procedia Computer Science, 2013, vol. 18, pp. 702-711

Authors: Roberto Solar, Francisco Borges, Remo Suppi, Emilio Luque. Department of Computer Architecture & Operating Systems, Universitat Autonoma de Barcelona, Bellaterra, 08193 Barcelona, Spain

Parallel discrete event simulation (PDES) has been shown to be a useful paradigm for simulating complex and large-scale models. An individual-oriented approach allows modelers to capture complex emerging global behaviors generated by simple local interactions, such as those observed in self-organized systems. Simulations of this type are usually highly expensive in terms of computing and communications. On one hand, we can reduce the computing involved in individual interactions by developing a robust partitioning method. On the other hand, we have to be able to efficiently handle a huge number of individuals interacting with other individuals stored in the memory of remote processors. In this work we analyze and compare three communication strategies: synchronous and asynchronous message passing (via MPI) and bulk-synchronous parallel (BSP) for our distributed cluster-based individual-oriented fish school simulator. For this type of simulation, the main contributions of our work are: a) we show that distributed time-driven simulations do not always improve performance when using synchronous communication strategies, and b) we show that asynchronous communication strategies are more efficient. In addition, we have verified that the bulk-synchronous parallel method is scalable.

Keywords: Parallel distributed simulation; Individual-oriented models; Data clustering; Fish schooling; High performance distributed simulation

Optimal Run Length for Discrete-event Distributed Cluster-based Simulations

Procedia Computer Science, 2014, vol. 29, pp. 73-83

Authors: Francisco Borges, Albert Gutierrez-Milla, Remo Suppi, Emilio Luque. Department of Computer Architecture & Operating Systems, Universitat Autonoma de Barcelona, Bellaterra, 08193 Barcelona, Spain

In scientific simulations the results generated usually come from a stochastic process. New solutions aimed at improving these simulations have been proposed, but the problem is how to compare these solutions, since the results are not deterministic, and consequently how to guarantee that the output results are statistically trustworthy. In this work we apply a statistical approach in order to define the transient and steady states in discrete-event distributed simulation. We used linear regression and the batch method to find the optimal simulation size. The contributions of our work are: we have applied and adapted a simple statistical approach in order to define the optimal simulation length; we propose an approximation to the normal distribution instead of generating sufficiently large replications; and the method can be used in other kinds of non-terminating scientific simulations where the data either have a normal distribution or can be approximated by a normal distribution.

Keywords: Parallel and distributed simulation; Parallel discrete-event simulation; High performance distributed simulation; Output analysis; Run length; Transient state; Steady state
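The batch method mentioned in the abstract above is commonly applied along the following lines: discard the transient observations, split the remaining steady-state output into batches, and build a confidence interval from the batch means. The sketch below is a generic version of that textbook procedure, not the authors' implementation. The function name, the `warmup` parameter, and the use of the normal quantile `z = 1.96` for an approximate 95% interval are assumptions.

```python
import math
import statistics

def batch_means_ci(samples, n_batches, warmup=0, z=1.96):
    """Return (mean, half_width) of an approximate confidence interval.

    Observations before `warmup` (the assumed transient state) are discarded.
    The rest are split into equal-size batches, and the batch means are treated
    as approximately independent, normally distributed values.
    """
    steady = samples[warmup:]
    size = len(steady) // n_batches
    if n_batches < 2 or size < 1:
        raise ValueError("need at least 2 batches with 1 observation each")
    means = [
        statistics.fmean(steady[i * size:(i + 1) * size])
        for i in range(n_batches)
    ]
    grand_mean = statistics.fmean(means)
    # Standard error of the grand mean, estimated from the batch means.
    half_width = z * statistics.stdev(means) / math.sqrt(n_batches)
    return grand_mean, half_width

# Example: two transient observations followed by steady-state output.
output = [100, 100, 1, 3, 2, 4, 1, 3, 2, 4]
mean, half_width = batch_means_ci(output, n_batches=4, warmup=2)
print(mean, half_width)
```

A run can be considered long enough once the half-width falls below a chosen tolerance. With few batches, a Student's t quantile would normally replace the fixed `z` value.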

High performance distributed cluster-based individual-oriented fish school simulation

Procedia Computer Science, 2011, vol. 4, pp. 76-85

Authors: Roberto Solar, Remo Suppi, Emilio Luque. Department of Computer Architecture & Operating Systems, Universitat Autònoma de Barcelona, Bellaterra, 08193 Barcelona, Spain

Individual-oriented simulation allows us to represent the global behavior of a system through local interaction in discrete time steps. As we face close-to-reality models and large-scale workloads, we focus on turning from traditional approaches towards distributed simulation in order to obtain more accurate results in less time. One of the main problems in distributed simulation is how to distribute individuals efficiently across the distributed architecture. Individual-oriented systems can be implemented in a distributed fashion by using either a grid-based or a cluster-based approach. On one hand, grid-based approaches consist of assigning to each node a portion of the simulation space, together with the set of individuals currently residing in that area. On the other hand, cluster-based approaches consist of assigning to each node a fixed set of individuals. In this work we present a cluster-based method based on Voronoi diagrams and a covering radius criterion in order to avoid unnecessary interaction between individuals. We show experimentally that our proposal reduces communication and computing times, significantly increasing simulation efficiency.

Keywords: High performance simulation; individual-oriented models; distributed simulation; data clustering; nearest-neighbor
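The combination of Voronoi-style clustering and a covering radius criterion named in the abstract above can be sketched generically as follows. Each individual is assigned to its nearest centroid, and a whole cluster is skipped during a neighbor search when the triangle inequality shows it cannot contain anything within the search radius. This is a textbook form of the criterion, not the authors' code, and all function names and the 2D tuple representation are assumptions.

```python
import math

def assign_to_clusters(points, centroids):
    """Assign each point to its nearest centroid (its Voronoi cell)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def covering_radius(centroid, members):
    """Largest distance from the centroid to any member of its cluster."""
    return max((math.dist(centroid, m) for m in members), default=0.0)

def clusters_to_visit(query, radius, centroids, clusters):
    """Indices of clusters that may hold a neighbor within `radius` of `query`.

    By the triangle inequality, a cluster can be skipped when
    dist(query, centroid) - covering_radius > radius.
    """
    return [
        i for i, (c, members) in enumerate(zip(centroids, clusters))
        if members and math.dist(query, c) - covering_radius(c, members) <= radius
    ]

# Example: two clusters of fish; only the nearby cluster needs to be examined.
centroids = [(0, 0), (10, 0)]
fish = [(1, 0), (0, 1), (9, 0), (11, 0)]
clusters = assign_to_clusters(fish, centroids)
print(clusters_to_visit((2, 0), 1.5, centroids, clusters))
```

In a distributed setting, skipping a cluster in this way would avoid a message to the node that owns it, which is the kind of unnecessary interaction the abstract refers to.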

Target encoding for efficient indirect jump prediction

11th International Euro-Par Conference, Euro-Par 2005

Authors: Moure, Juan Carlos; Benitez, Domingo; Rexachs, Dolores Isabel; Luque, Emilio. Computer Architecture and Operating Systems Department, Universidad Autónoma de Barcelona, 08193 Barcelona, Spain; University of Las Palmas G.C., 35017 Las Palmas, Spain

Accurate indirect jump prediction is critical for some applications. Proposed methods are not efficient in terms of chip area. Our proposal evaluates a mechanism called target encoding that provides a better ratio between prediction accuracy and the number of bits devoted to the predictor. The idea is to encode full target addresses into shorter target identifiers, so that more entries can be stored with a fixed memory budget, and longer branch histories can be used to increase prediction accuracy. With a fixed area budget, the increase in accuracy for the proposed scheme ranges from 10% up to 90%. Alternatively, the new scheme provides the same accuracy while reducing predictor size by between 35% and 70%. © Springer-Verlag Berlin Heidelberg 2005.

Keywords: Encoding (symbols)
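The core idea in the abstract above, replacing full target addresses with shorter identifiers so that more predictor entries fit in the same number of bits, can be illustrated with a small software model. The class name, the first-come identifier allocation, and the bit widths in the example are hypothetical. The sketch also ignores tag bits, the storage cost of the decode table, and any replacement policy, all of which a hardware design would have to account for.

```python
class TargetEncoder:
    """Maps full target addresses to short identifiers of `id_bits` bits."""

    def __init__(self, id_bits):
        self.capacity = 1 << id_bits
        self.id_of = {}        # full address -> short identifier
        self.address_of = []   # short identifier -> full address

    def encode(self, address):
        """Return the identifier for `address`, allocating one if needed."""
        if address not in self.id_of:
            if len(self.address_of) >= self.capacity:
                raise OverflowError("no free target identifiers")
            self.id_of[address] = len(self.address_of)
            self.address_of.append(address)
        return self.id_of[address]

    def decode(self, ident):
        """Return the full address stored for identifier `ident`."""
        return self.address_of[ident]

def entries_in_budget(budget_bits, entry_bits):
    """How many predictor entries of `entry_bits` fit in `budget_bits`."""
    return budget_bits // entry_bits

# Example: an 8192-bit budget with 32-bit addresses versus 8-bit identifiers.
encoder = TargetEncoder(id_bits=8)
ident = encoder.encode(0x400123F0)
print(ident, hex(encoder.decode(ident)))
print(entries_in_budget(8192, 32), entries_in_budget(8192, 8))
```

Under these assumed widths, the same budget holds four times as many entries when identifiers are stored instead of full addresses.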

Search of performance inefficiencies in message passing applications with KappaPI 2 tool

8th International Workshop on Applied Parallel Computing, PARA 2006

Authors: Jorba, Josep; Margalef, Tomás; Luque, Emilio. Estudis d'Informatica Multimedia i Telecomunicacio, Rambla del Poblenou 156, ES-08018 Barcelona, Spain; Computer Architecture and Operating Systems Department, ES-08193 Bellaterra, Spain

ISBN: (print) 9783540757542

Performance is a crucial issue for parallel/distributed applications. One useful kind of tool in this context is the automatic performance analysis tool, which helps developers in some phases of the performance tuning process. KappaPI 2 is an automatic performance tool with an open, extensible knowledge base about typical inefficiencies in message passing applications. It is able to detect and analyze these inefficiencies and then make suggestions to the developer about how to improve the application's behavior. © Springer-Verlag Berlin Heidelberg 2007.

Keywords: Search engines

How to Determine the Topology of Hierarchical Tuning Networks for Dynamic Auto-tuning in Large-scale systems

Procedia Computer Science, 2013, vol. 18, pp. 1352-1361

Authors: Andrea Martínez, Anna Sikora, Eduardo César, Joan Sorribes. Computer Architecture and Operating Systems Department, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

Automatic analysis and tuning is a key strategy that helps to exploit the potential of high performance systems. However, for parallel applications with long running times, dynamic behaviour or highly data dependent performance patterns, it is necessary to make use of the strength of dynamic auto-tuning. An important factor in dynamic auto-tuning on a large scale is the number of additional resources required by the tuning system itself in order to reduce the impact on application performance. A tradeoff must be made between the loss of effectiveness of a tuning system that uses too few resources and the loss of efficiency of one that uses too many. Most automatic analysis or tuning systems do not provide assistance for defining how many additional resources are required. In this work, we address this problem by proposing a method focused on calculating the structure of hierarchical tuning networks. The topology will be composed of the minimum number of non-saturated resources. The experimental evaluation covers different use cases, each one showing that tuning networks built according to our proposal make efficient use of resources while providing a high quality analysis and tuning environment.

Keywords: performance tools; dynamic and automatic analysis; dynamic and automatic tuning; resource usage efficiency

Performance-aware Energy Saving Mechanism in Interconnection Networks for Parallel systems

Procedia Computer Science, 2014, vol. 29, pp. 134-144

Authors: Hai Nguyen, Daniel Franco, Emilio Luque. Computer Architecture & Operating Systems Department, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

The growing processing power of parallel computing systems requires interconnection networks with a higher level of complexity and higher performance, which therefore consume more energy. Link components contribute a substantial proportion of the total energy consumption of the networks. Many researchers have proposed approaches that judiciously change the link speed as a function of traffic in order to save energy when the traffic is light. However, reducing the link speed increases average packet latency and thus degrades network performance. This paper addresses that issue with a performance-aware energy saving mechanism. The simulation results show that the proposed mechanism outperforms the energy saving mechanisms in the literature.

Keywords: energy saving; performance-aware; interconnection network; distributed systems

Proximity Load Balancing for Distributed Cluster-based Individual-oriented Fish School Simulations

Procedia Computer Science, 2012, vol. 9, pp. 328-337

Authors: Roberto Solar, Remo Suppi, Emilio Luque. Department of Computer Architecture & Operating Systems, Universitat Autònoma de Barcelona, Bellaterra, 08193 Barcelona, Spain

Partitioning and load balancing are highly important issues in distributed individual-oriented simulation. Choosing how to distribute individuals over the distributed environment can be a crucial factor when executing the simulation. Partitioning of an individual-oriented system should be efficient in order to reduce the communication involved in interactions between individuals belonging to different logical processes. Furthermore, if the individual-oriented model exhibits mobility patterns, we should be able to maintain the load balance in order to preserve global application performance. In this work, we present a proximity load balancing strategy for a distributed cluster-based individual-oriented fish school simulator. On one hand, we implement a robust cluster-based partitioning method by means of a covering radius criterion and Voronoi diagrams. We use a proximity criterion to distribute individuals over the distributed architecture. On the other hand, we propose a proximity load balancing strategy in order to maintain application performance as the simulation progresses.

Keywords: High-performance simulation; individual-oriented models; distributed simulation; data clustering; nearest-neighbor; load balancing

PIOM-PX: A framework for modeling the I/O behavior of parallel scientific applications

32nd International Conference on High Performance Computing, ISC High Performance 2017

Authors: Gomez-Sanchez, Pilar; Mendez, Sandra; Rexachs, Dolores; Luque, Emilio. Computer Architecture and Operating Systems Department, Universitat Autónoma de Barcelona, Campus UAB, Edifici Q, Bellaterra, Barcelona 08193, Spain; Garching bei München 85748, Germany

ISBN: (print) 9783319676296

Current parallel scientific applications generate a huge amount of data that must be managed efficiently by HPC storage systems. However, I/O performance depends on the application's I/O behavior and the configuration of the underlying I/O system. To understand the I/O behavior in the software stack and its impact on the I/O operations defined in the application logic, we propose a design framework named PIOM-PX, which allows an I/O behavior model to be defined based on the I/O phases of HPC applications at the POSIX-IO level. We validate our framework using the IOR benchmark for four I/O patterns and we analyze the I/O behavior of NAS BT-IO. © Springer International Publishing AG 2017.

Keywords: Digital storage
