Search Results - Inner Mongolia University Library

Machine Learning to Design an Auto-tuning System for the Best Compressed Format Detection for Parallel Sparse Computations

PARALLEL PROCESSING LETTERS, 2021, Vol. 31, No. 4, pp. 1-37

Authors: Hamdi-Larbi, Olfa; Mehrez, Ichrak; Dufaud, Thomas. Affiliations: Univ Tunis El Manar, Fac Sci Tunis, URAPOP, Tunis 2092, Tunisia; Taibah Univ, MIS Dept, Coll Business, Madinah, Saudi Arabia; Univ Paris Saclay, Univ Versailles St Quentin, Lab Li Parad, F-78280 Guyancourt, France; CEA Saclay, Maison Simulat, F-91191 Gif Sur Yvette, France

Many applications in scientific computing process very large sparse matrices on parallel architectures. The work presented in this paper is part of a project whose general aim is to develop an auto-tuning system that selects the best matrix compression format in the context of high-performance computing. The target smart system can automatically select the best compression format for a given sparse matrix, a numerical method processing this matrix, a parallel programming model, and a target architecture. This paper describes the design and implementation of the proposed concept. We consider a case study consisting of a numerical method reduced to the sparse matrix-vector product (SpMV), several compression formats, the data-parallel programming model, and a distributed multi-core platform as the target architecture. This study allows us to extract a set of important novel metrics and parameters related to the considered programming model. These metrics are used as input to a machine-learning algorithm that predicts the best matrix compression format. An experimental study targeting a distributed multi-core platform and processing random and real-world matrices shows that our system can improve the accuracy of the machine learning by up to 7% on average.

Keywords: Design and optimization for HPC systems; smart system; grid and cluster computing; multi-core architectures and support; large scale scientific computing; data parallel model; sparse matrix computation
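
The abstract above reduces its numerical method to the sparse matrix-vector product (SpMV) over compressed storage formats. As a minimal illustration of what such a kernel looks like, the sketch below uses compressed sparse row (CSR) storage. CSR is an assumption chosen here because it is a common format; the abstract does not list which formats the paper evaluates, and the function names `dense_to_csr` and `spmv_csr` are hypothetical.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) into CSR arrays.

    Returns (values, col_idx, row_ptr): the nonzero values, their column
    indices, and the offset in `values` where each row starts.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, entry in enumerate(row):
            if entry != 0:
                values.append(entry)
                col_idx.append(j)
        # After each row, record how many nonzeros have been stored so far.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        # Only the stored nonzeros of row i contribute to y[i].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Small illustrative matrix (not taken from the paper).
A = [[4, 0, 0],
     [0, 0, 2],
     [1, 0, 3]]
values, col_idx, row_ptr = dense_to_csr(A)
y = spmv_csr(values, col_idx, row_ptr, [1, 2, 3])
print(values, col_idx, row_ptr, y)
```

Each output row depends only on its own slice of `values`, which is why row-wise storage of this kind is often paired with a data-parallel distribution of rows. Other formats change the memory access pattern, which is the sort of trade-off a format selector would have to weigh.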

Parallel stochastic simulations with rigorous distribution of pseudo-random numbers with DistMe: Application to life science simulations

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2012, Vol. 24, No. 7, pp. 723-738

Authors: Reuillon, Romain; Traore, Mamadou K.; Passerat-Palmbach, Jonathan; Hill, David R. C. Affiliations: Clermont Univ, LIMOS, F-63000 Clermont Ferrand, France; Univ Blaise Pascal, LIMOS, F-63000 Clermont Ferrand, France; CNRS, French Natl Ctr Res, UMR LIMOS 6158, F-63173 Aubiere, France; ISIMA, F-63173 Aubiere, France

This paper presents an open-source toolkit allowing a rigorous distribution of stochastic simulations. It is designed according to the state of the art in pseudo-random number partitioning techniques. It is based on a generic XML format for saving pseudo-random number generator states, in which each state contains adapted metadata. The toolkit, named DistMe, is usable by modelers who are not specialists in parallelizing stochastic simulations; it helps in distributing the replications and in generating experimental plans. It automatically writes ready-to-run scripts for various parallel platforms, encapsulating the burden of managing status files for different pseudo-random generators. Automating this task avoids many human mistakes. The toolkit has been designed following a model-driven engineering approach: the user builds a model of their simulation and the toolkit helps in distributing independent stochastic experiments. In this paper, the toolkit architecture is presented, and two examples in life science research domains are detailed. The preliminary design of the DistMe toolkit was achieved when dealing with the distribution of a nuclear medicine application using the largest European computing grid: the European Grid Initiative (EGI). Thanks to our alpha version of the software toolbox, the equivalent of 3 years of computing was achieved in a few days. Next, we present the second application in another domain to show the potential and genericity of the DistMe toolkit. A small experimental plan with 1024 distributed stochastic experiments was run on a local computing cluster to explore scenarios of an environmental application. For both applications, the proposed toolkit was able to automatically generate distribution scripts with independent pseudo-random number streams, and it also automatically parameterized the simulation input files to follow an experimental design. The automatic generation of scripts and input files is achieved, thanks to mod

Keywords: distributed stochastic simulation; parallel random number generator; grid and cluster computing; parameter exploration

RUBLX: A Ruby-based batch language for Xgrid

21st European Conference on Modelling and Simulation

Authors: Suzuki, Tetsuya; Hamano, Kiyoto. Affiliation: Shibaura Inst Technol, Dept Elect Informat Syst, Saitama City, Saitama 3378570, Japan

ISBN: (print) 9780955301827

We present a Ruby-based batch language for Xgrid and its processor. Xgrid is an environment for distributed and parallel computing on the Mac OS X operating system, and Ruby is a general-purpose object-oriented programming language. In the standard Xgrid environment, jobs in batch files are statically defined by an XML-based language, and submitted jobs are managed by their ID numbers. It is not easy for humans to read and write XML-based batch files or to manage jobs by ID numbers. In our approach, jobs in batch files can be dynamically defined by a Ruby-based language, and submitted jobs can be managed by their logical names. Semantic checks and consistency management are also performed at submission. Our approach makes Xgrid easier to use, both syntactically and semantically.

Keywords: grid and cluster computing; languages

A simulation study of scalable broadcast in high-performance regular networks

SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL, 2004, Vol. 80, No. 4-5, pp. 207-220

Authors: Al-Dubai, AY; Ould-Khaoua, M; Obaidat, MS. Affiliations: Univ Glasgow, Dept Comp Sci, Glasgow G12 8RZ, Lanark, Scotland; Monmouth Coll, Dept Comp Sci, Long Branch, NJ 07764, USA

Broadcast is an important communication operation required by many real-world applications in parallel, cluster, and grid computing environments. Broadcasting on regular networks has been widely investigated in the past. However, most existing algorithms handle broadcast in a sequential manner and do not scale well; as a consequence, many applications cannot be efficiently supported by them. To avoid this limitation, this article presents a new broadcast algorithm based on coded path routing. In addition to being simple, the proposed algorithm has been shown to perform the broadcast operation in a fixed number of message-passing steps, irrespective of the network size. An extensive simulation study has been conducted to evaluate the performance of the proposed algorithm under different traffic conditions. The analysis reveals that the new algorithm exhibits superior performance characteristics over those of the well-known recursive-doubling and extended-dominating-node algorithms.

Keywords: collective communication; regular networks; simulation; grid and cluster computing; performance analysis; multicast latency
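
The abstract above contrasts a fixed-step broadcast with the recursive-doubling baseline. To show why the baseline's step count grows with network size, the sketch below builds a recursive-doubling schedule in a simplified, topology-free model: nodes are numbered 0 to n-1 and any node can reach any other in one step. This is an illustrative assumption, not the paper's simulator or its coded-path algorithm, and the function name `recursive_doubling_schedule` is hypothetical.

```python
def recursive_doubling_schedule(n, root=0):
    """Return the broadcast schedule as a list of steps.

    Each step is a list of (sender, receiver) pairs. In every step each
    node that already holds the message forwards it to one new node, so
    the number of informed nodes at most doubles per step.
    """
    schedule = []
    distance = 1  # number of informed nodes before this step
    while distance < n:
        step = [
            ((root + r) % n, (root + r + distance) % n)
            for r in range(distance)
            if r + distance < n  # skip sends that would run past the last node
        ]
        schedule.append(step)
        distance *= 2
    return schedule


# Step counts grow logarithmically with the number of nodes.
for size in (8, 64, 1024):
    print(size, len(recursive_doubling_schedule(size)))
```

In this model the schedule needs one step per doubling of the informed set, so larger networks need more steps, whereas the abstract reports a step count that does not depend on network size for the proposed algorithm.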
