The authors present and evaluate an unplugged activity to introduce parallel computing concepts to undergraduate students. Students in five CS classrooms used a deck of playing cards in small groups to consider how pa...
Comprehending the performance bottlenecks at the core of the intricate hardware-software interactions exhibited by highly parallel programs on HPC clusters is crucial. This paper sheds light on the issue of automatic asynchronous MPI communication in memory-bound parallel programs on multicore clusters and how it can be facilitated. For instance, slowing down MPI processes by deliberate injection of delays can improve performance if certain conditions are met. This leads to the counter-intuitive conclusion that noise, independent of its source, is not always detrimental but can be leveraged for performance improvements. We employ phase-space graphs as a new tool to visualize parallel program dynamics. They are useful in spotting certain patterns in parallel execution that easily go unnoticed with traditional tracing tools. We investigate five different microbenchmarks and applications on different supercomputer platforms: an MPI-augmented STREAM Triad, two implementations of Lattice-Boltzmann fluid solvers (D3Q19 and SPEChpc D2Q37), and the LULESH and HPCG proxy applications. © 2023 Elsevier B.V. All rights reserved.
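To make the delay-injection idea concrete, here is a minimal sketch (our own illustration, not the paper's code) that perturbs one rank of a memory-bound, STREAM-Triad-like MPI loop. mpi4py is assumed, and the array size and the 50 ms delay are arbitrary placeholder values.

```python
# Minimal sketch: inject a one-off delay into a single MPI rank of a
# memory-bound loop and let non-blocking neighbor exchange carry the
# resulting idle wave (mpi4py assumed; sizes and delay are placeholders).
import time
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N = 10_000_000                          # per-rank working set (memory-bound)
a, b, c = np.zeros(N), np.ones(N), np.ones(N)
halo_out, halo_in = np.empty(1), np.empty(1)

for step in range(100):
    if step == 10 and rank == 0:
        time.sleep(0.05)                # deliberate delay on one rank only
    a[:] = b + 1.5 * c                  # STREAM-Triad-like kernel
    halo_out[0] = a[-1]
    # non-blocking ring exchange: ranks are free to drift apart (desynchronize)
    reqs = [comm.Isend(halo_out, dest=(rank + 1) % size),
            comm.Irecv(halo_in, source=(rank - 1) % size)]
    MPI.Request.Waitall(reqs)
    b[0] = halo_in[0]
```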
The performance of highly parallel applications on distributed-memory systems is influenced by many factors. Analytic performance modeling techniques aim to provide insight into performance limitations and are often the starting point of optimization efforts. However, coupling analytic models across the system hierarchy (socket, node, network) fails to encompass the intricate interplay between the program code and the hardware, especially when execution and communication bottlenecks are involved. In this paper we investigate the effect of bottleneck evasion and how it can lead to automatic overlap of communication overhead with computation. Bottleneck evasion leads to a gradual loss of the initial bulk-synchronous behavior of a parallel code so that its processes become desynchronized. This occurs most prominently in memory-bound programs, which is why we choose memory-bound benchmark and application codes, specifically an MPI-augmented STREAM Triad, sparse matrix-vector multiplication, and a collective-avoiding Chebyshev filter diagonalization code, to demonstrate the consequences of desynchronization on two different supercomputing platforms. We investigate the role of idle waves as possible triggers for desynchronization and show the impact of automatic asynchronous communication for a spectrum of code properties and parameters, such as saturation point, matrix structures, domain decomposition, and communication concurrency. Our findings reveal how eliminating synchronization points (such as collective communication or barriers) precipitates performance improvements that go beyond what can be expected by simply subtracting the overhead of the collective from the overall runtime.
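As a concrete picture of the communication/computation overlap the authors analyze, the following sketch (our own, with made-up random matrices and a simplified ring decomposition; mpi4py and SciPy assumed) posts the halo exchange first and performs the purely local part of a sparse matrix-vector multiplication while the messages are in flight.

```python
# Hypothetical sketch of overlapping halo exchange with the local part of a
# distributed SpMV (mpi4py + scipy assumed; matrices are random placeholders).
import numpy as np
from scipy.sparse import random as sparse_random
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 4096
A_local = sparse_random(n_local, n_local, density=0.001, format="csr")
A_halo  = sparse_random(n_local, n_local, density=0.0001, format="csr")
x_local = np.random.rand(n_local)
x_halo  = np.empty(n_local)

# Post the halo exchange first ...
reqs = [comm.Isend(x_local, dest=(rank + 1) % size),
        comm.Irecv(x_halo, source=(rank - 1) % size)]

# ... then do the purely local part of the SpMV while messages are in flight.
y = A_local @ x_local

# Finish the exchange and apply the halo contribution.
MPI.Request.Waitall(reqs)
y += A_halo @ x_halo
```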
Big Data is an extremely massive amount of heterogeneous and multisource data which often requires fast processing and real-time analysis. Solving big data analytics problems needs powerful platforms to handle this enormous mass of data and efficient machine learning algorithms to exploit the full potential of big data. Hidden Markov models are rich statistical models that are widely used in various fields, especially for modeling and analyzing time-varying data sequences. They owe their success to the existence of many efficient and reliable algorithms. In this paper, we present ParaDist-HMM, a parallel distributed implementation of hidden Markov models for modeling and solving big data analytics problems. We describe the development and implementation of the improved algorithms and propose a Spark-based approach, consisting of a parallel distributed big data architecture in a cloud computing environment, to put the proposed algorithms into practice. We evaluated the model on synthetic and real financial data in terms of running time, speedup, and prediction quality, the latter measured by accuracy and root mean square error. Experimental results demonstrate that the ParaDist-HMM algorithms outperform other implementations of hidden Markov models in terms of processing speed and accuracy, and therefore in efficiency and effectiveness.
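As a rough picture of how such a Spark-based layout can look (this is our own toy sketch, not the authors' ParaDist-HMM implementation), one can partition many observation sequences across an RDD and score each against a fixed model with the scaled forward algorithm:

```python
# Toy sketch: scoring many observation sequences against a fixed HMM in
# parallel with PySpark (model parameters and data are placeholders).
import numpy as np
from pyspark import SparkContext

def forward_log_likelihood(obs, start, trans, emit):
    """Standard scaled forward recursion; returns log P(obs | model)."""
    alpha = start * emit[:, obs[0]]
    log_like = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
        s = alpha.sum()
        log_like += np.log(s)
        alpha /= s
    return log_like

if __name__ == "__main__":
    sc = SparkContext(appName="hmm-scoring-sketch")
    # Toy 2-state model; real parameters would come from (distributed) training.
    start = np.array([0.6, 0.4])
    trans = np.array([[0.7, 0.3], [0.4, 0.6]])
    emit  = np.array([[0.9, 0.1], [0.2, 0.8]])
    sequences = [np.random.randint(0, 2, 500) for _ in range(10_000)]
    scores = (sc.parallelize(sequences, numSlices=64)
                .map(lambda obs: forward_log_likelihood(obs, start, trans, emit))
                .collect())
    sc.stop()
```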
ISBN (print): 9783030602451; 9783030602444
In this article, we use a Kullback-Leibler random sample partition data model to generate a set of disjoint data blocks, where each block is a good representation of the entire data set. Every random sample partition (RSP) block has a sample distribution function similar to that of the entire data set. To obtain a statistical measure between them, Kernel Density Estimation (KDE) with a dual-tree recursion data structure is first applied to quickly estimate the probability density of each block. Then, based on the Kullback-Leibler (KL) divergence measure, we obtain the statistical similarity between a randomly selected RSP data block and the other RSP data blocks. We rank the RSP data blocks according to their divergence values in descending order and choose the first ten for ensemble classification learning. The classification models are built in parallel for the selected RSP data blocks, and the final ensemble classification model is obtained with a weighted voting ensemble strategy. The experiments were conducted by building XGBoost models on those ten blocks in parallel and incrementally ensembling them according to their KL values. The testing classification results show that our method can increase the generalization capability of the ensemble classification model. It can also reduce the model-building time in a parallel computing environment by using less than 15% of the entire data set, which alleviates the memory constraints of big data analysis.
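A compact illustration of the selection-plus-ensemble idea is sketched below. It is our own toy code: it uses SciPy's gaussian_kde and a Monte-Carlo KL estimate instead of the paper's dual-tree KDE, and the commented usage keeps the ten lowest-divergence blocks purely for illustration.

```python
# Toy sketch of KL-based block selection and a weighted-voting ensemble
# (our illustration; the paper uses dual-tree KDE and its own ranking).
import numpy as np
from scipy.stats import gaussian_kde
import xgboost as xgb

def kl_estimate(block, reference, n_samples=2000):
    """Monte-Carlo estimate of KL(block || reference) from two KDEs."""
    p = gaussian_kde(block.T)            # gaussian_kde expects shape (d, n)
    q = gaussian_kde(reference.T)
    xs = p.resample(n_samples)
    return float(np.mean(np.log(p(xs) + 1e-12) - np.log(q(xs) + 1e-12)))

def weighted_vote(models, weights, X):
    """Weighted majority vote over per-block classifiers (binary labels assumed)."""
    votes = np.zeros((X.shape[0], 2))
    for m, w in zip(models, weights):
        preds = m.predict(X).astype(int)
        votes[np.arange(X.shape[0]), preds] += w
    return votes.argmax(axis=1)

# blocks: list of (X_block, y_block) RSP partitions; reference: a pooled sample.
# divergences = [kl_estimate(Xb, reference) for Xb, _ in blocks]
# chosen = np.argsort(divergences)[:10]          # ten blocks kept for the ensemble
# models = [xgb.XGBClassifier(n_estimators=100).fit(*blocks[i]) for i in chosen]
```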
The aim of this paper is to present a parallel distributed version of the Viterbi algorithm that combines the advantages of Spark, the big data framework, and hidden Markov models to solve the decoding problem for large-scale multidimensional data. The scope of the paper includes a review of hidden Markov models, a study of the decoding problem, a presentation of related work, and a discussion of previously proposed implementations. The main part of the paper consists of a description of the development and implementation of a parallel distributed Viterbi algorithm in a cloud computing environment, followed by a description of the evaluation experiments for the presented algorithm. The results show that the proposed algorithm is faster and highly scalable, with no deterioration in forecast accuracy.
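The per-sequence decoding step that such a distributed version parallelizes is the classical Viterbi recursion. A plain log-space NumPy sketch (ours, not the paper's Spark implementation) is shown below; in a Spark setting it would simply be mapped over an RDD of sequences, as in the earlier scoring sketch.

```python
# Log-space Viterbi decoding for one observation sequence (our illustration).
import numpy as np

def viterbi(obs, log_start, log_trans, log_emit):
    """Most likely hidden-state path and its log-probability."""
    T, n_states = len(obs), log_start.shape[0]
    delta = np.empty((T, n_states))              # best log-prob ending in each state
    psi   = np.empty((T, n_states), dtype=int)   # back-pointers
    delta[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # (from-state, to-state)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[:, obs[t]]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):               # backtrack
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[-1].max()
```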
ISBN (print): 9781728190747
Empirical Dynamic Modeling (EDM) is a nonlinear time series causal inference framework. The latest implementation of EDM, cppEDM, has only been used for small datasets due to computational cost. With the growth of data collection capabilities, there is a great need to identify causal relationships in large datasets. We present mpEDM, a parallel distributed implementation of EDM optimized for modern GPU-centric supercomputers. We improve the original algorithm to reduce redundant computation and optimize the implementation to fully utilize hardware resources such as GPUs and SIMD units. As a use case, we run mpEDM on the AI Bridging Cloud Infrastructure (ABCI) using datasets of an entire animal brain sampled at single-neuron resolution to identify dynamical causation patterns across the brain. mpEDM is 1,530x faster than cppEDM, and a dataset containing 101,729 neurons was analyzed in 199 seconds on 512 nodes. This is the largest EDM causal inference achieved to date.
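The computational core that dominates EDM at this scale is time-delay embedding followed by exhaustive nearest-neighbour searches. The NumPy sketch below (our illustration of that core, not mpEDM's GPU code) shows the two steps that the paper optimizes for GPUs and SIMD units.

```python
# Sketch of the EDM core: delay embedding + brute-force kNN (our illustration).
import numpy as np

def delay_embed(x, E, tau=1):
    """Return the E-dimensional time-delay embedding of a 1-D series."""
    n = len(x) - (E - 1) * tau
    return np.stack([x[i * tau : i * tau + n] for i in range(E)], axis=1)

def knn(library, targets, k):
    """For every target point, find its k nearest library points (brute force)."""
    d = np.linalg.norm(targets[:, None, :] - library[None, :, :], axis=2)
    return np.argsort(d, axis=1)[:, :k], np.sort(d, axis=1)[:, :k]

# x = ...  one neuron's activity trace (placeholder)
# emb = delay_embed(x, E=3)
# idx, dist = knn(emb, emb, k=4)   # neighbours feed simplex projection / cross mapping
```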
Because of their effectiveness and flexibility in finding useful solutions, Genetic Algorithms (GAs) are very popular search techniques for solving complex optimization problems in scientific and industrial fields. Parallel GAs (PGAs), and especially distributed ones, have usually been presented as the way to overcome the time-consuming nature of sequential GAs. When applying PGAs we can expect better performance, the reason being the exchange of knowledge during the parallel search process. The resulting distributed search differs from what sequential panmictic GAs do, and thus deserves additional study. This article presents a performance study of three different PGAs. Moreover, we investigate the effect of synchronizing communications on modern shared-memory multiprocessors. We consider the master-slave model along with synchronous and asynchronous distributed GAs (dGAs), presenting their different designs and expected similarities when running on a number of cores ranging from one to 32. The master-slave model showed competitive numerical effort versus the other dGAs and proved able to scale up well on multiprocessors. We describe how the speed-up and parallel performance of the dGAs change as the number of cores grows. Results for the island model show that synchronous and asynchronous dGAs have different numerical performance on a multiprocessor, with the asynchronous algorithm executing faster and thus being more attractive for time-demanding applications. Our results and statistical analyses help develop a novel body of knowledge on PGAs running on shared-memory multiprocessors (versus the overwhelming literature oriented to distributed-memory clusters), something useful for researchers, beginners, and end users of these techniques.
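For readers unfamiliar with the master-slave model, the sketch below (our own minimal Python illustration, not the authors' code) shows its defining property: a single population on the master, with only the fitness evaluations fanned out to worker processes; selection, variation, and replacement stay sequential.

```python
# Minimal master-slave GA sketch: fitness evaluation is parallelized, the rest
# runs on the master (toy objective, truncation selection, no crossover).
import numpy as np
from multiprocessing import Pool

def fitness(individual):
    # Placeholder objective: maximize closeness of every gene to 0.5.
    return -np.sum((individual - 0.5) ** 2)

def master_slave_ga(pop_size=64, genome_len=32, generations=100, workers=8):
    rng = np.random.default_rng(0)
    pop = rng.random((pop_size, genome_len))
    best = pop[0]
    with Pool(workers) as pool:
        for _ in range(generations):
            fits = np.array(pool.map(fitness, pop))            # parallel evaluation
            best = pop[fits.argmax()].copy()                    # elitism
            parents = pop[fits.argsort()[-pop_size // 2:]]      # truncation selection
            children = parents[rng.integers(0, len(parents), pop_size - 1)]
            mask = rng.random(children.shape) < 0.05            # mutation
            children[mask] = rng.random(int(mask.sum()))
            pop = np.vstack([best, children])
    return best

if __name__ == "__main__":   # guard needed for multiprocessing on some platforms
    print(master_slave_ga())
```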
ISBN (print): 9781728159751
In this work we present the experience of the course "Build your own supercomputer with Raspberry Pi", offered as a non-mandatory workshop with the purpose of bringing High Performance Computing (HPC) closer to bachelor students of Universitat Jaume I (UJI, Spain). The intention of the course is twofold: on the one hand, we aim to increase Computer Science and Engineering students' knowledge of the work performed by the HPC community; on the other hand, we aim to create a personalized experience for each student by fulfilling their curiosity about the topics presented and discussed in class. To evaluate the impact and learning, we analyze two surveys filled out by the students before and after the course, respectively, which capture their interest in and knowledge of HPC.
ISBN (print): 9783319911892
A model complex and algorithms for assessing the technical condition (TC) and reliability of the launch vehicle (LV) "Soyuz-2", with decision support (DS) for managing its life cycle (LC), are considered in this article. On the basis of an analysis of current problems and of the requirements for the efficiency, quality, and reliability of assessing the TC and reliability of the LV, it is concluded that the new intelligent information technology (IIT) presented in the article is necessary when designing automated systems for monitoring the condition of the LV "Soyuz-2" and supporting decisions on managing its LC. As a theoretical basis for this technology, a modification of the generalized computational model (GCM) is considered as a knowledge representation model that makes it possible to build simulation-analytical model-based complexes for monitoring the condition of, and managing, complex organizational and technical objects (COTO).