Parallel and distributed computing (PDC) concepts are now required topics for accredited undergraduate computer science programs. However, introducing PDC into the CS curriculum is challenging for several reasons, including instructors' lack of PDC knowledge and difficulty accessing PDC hardware. This paper addresses both challenges by presenting free, interactive, web-based PDC teaching modules that use inexpensive Raspberry Pi single-board computers (SBCs). Our materials include a free disk image that lets instructors build Raspberry Pi clusters in minutes and use our software in a variety of curricular contexts. Our multi-year assessment of these materials with students and faculty members indicates that: (i) our materials increased students' confidence regarding important PDC concepts and motivated them to study PDC further; and (ii) our materials increased faculty members' confidence and preparedness in teaching key PDC concepts at their own institutions.
A checkpoint is defined as a designated place in a program at which normal processing is interrupted specifically to preserve the status information necessary to allow processing to resume at a later time. Checkpointing is the process of saving that status information. This paper surveys the algorithms reported in the literature for checkpointing parallel and distributed systems. Most of the algorithms published for checkpointing in message-passing systems are based on the seminal article by Chandy and Lamport. A large number of articles in this area relax the assumptions made in that article or extend it to minimise the overheads of coordination and context saving. Checkpointing schemes for shared memory systems primarily extend cache coherence protocols to maintain a consistent memory, and all of them assume that main memory is safe for storing the context. More recently, algorithms have been published for distributed shared memory systems; these extend the cache coherence protocols used in shared memory systems but also include methods for storing the status of distributed memory in stable storage. Most of the algorithms assume no knowledge of the programs being executed. However, when developing parallel programs the user has to do a fair amount of work in distributing tasks, and this information can be used effectively to simplify checkpointing and rollback recovery.
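The definition of a checkpoint above can be illustrated with a minimal single-process sketch in Python. It is not one of the surveyed algorithms (it involves no message passing or coordination between processes); it only shows the basic idea of saving status information at designated points and resuming from the last saved point. The names `save_checkpoint`, `load_checkpoint`, and `running_sum`, the `interval` and `fail_at` parameters, and the use of a local file as "stable storage" are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Save state atomically: write a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    os.replace(tmp_path, path)  # atomic replacement of the previous checkpoint


def load_checkpoint(path):
    """Return the last saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def running_sum(n, path, interval=100, fail_at=None):
    """Sum 0..n-1, checkpointing every `interval` iterations.

    `fail_at` is a hypothetical hook that simulates a crash at iteration i.
    """
    state = load_checkpoint(path) or {"next_i": 0, "total": 0}
    for i in range(state["next_i"], n):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated failure")
        state["total"] += i
        state["next_i"] = i + 1
        if state["next_i"] % interval == 0:
            save_checkpoint(path, state)  # the designated checkpoint
    return state["total"]
```

If a run fails partway through, calling `running_sum` again with the same `path` rolls back to the most recent checkpoint and repeats only the iterations performed after it. The work between the last checkpoint and the failure is lost, which is the overhead trade-off that the checkpoint interval controls.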
ISBN: 9781665435772 (print)
In response to shifts in the hardware foundations of computing, parallel and distributed computing (PDC) is now a key piece of the core CS curriculum. For CS educators, the COVID-19 pandemic and the resulting switch to remote learning add new challenges to the tasks of helping learners understand abstract PDC concepts and equipping them with hands-on practical skills. This paper presents several novel materials for teaching PDC remotely, including: (i) using a Runestone Interactive "virtual" handout to learn how to run OpenMP multithreaded programs on a Raspberry Pi, and (ii) using Google Colab and Jupyter notebooks to run mpi4py instances on remote systems and thus learn about MPI distributed multiprocessing. The authors piloted these strategies during a multi-day faculty development workshop on teaching PDC. Assessment data indicate that the materials greatly aided professional development and preparedness to teach PDC.
ISBN: 9781728123349 (print)
Applications such as Big Data, Machine Learning, and Deep Learning, as well as other engineering and scientific research, require a lot of computing power, making High-Performance Computing (HPC) an important field. However, access to supercomputers is out of reach for most users. Today's supercomputers are actually clusters of computers, usually made up of commodity hardware. Such clusters are called Beowulf clusters, and their history goes back to 1994, when NASA built a supercomputer by creating a cluster of commodity hardware. In recent times, considerable effort has gone into building HPC clusters even from single-board computers (SBCs). Although creating clusters of commodity hardware is possible, it is a cumbersome task. Moreover, maintaining such systems is difficult and requires special expertise and time. The concept of the cloud is to provide on-demand resources, which can be services, platforms, or even infrastructure, by sharing a large resource pool. Cloud computing has resolved problems such as hardware maintenance and the need for networking expertise. This work brings concepts from cloud computing to HPC in order to gain the benefits of the cloud. The main goal is to create a system capable of providing computing power as a service, hereafter referred to as Supercomputer as a Service. A prototype was built using Raspberry Pi (RPi) 3B and 3B+ single-board computers. RPi boards were chosen because of the increasing popularity of ARM processors in the field of HPC.