ISBN: 9781538655559 (print)
Due to the rapid growth of multicore and GPU-based computing devices, teaching parallel computing in the CS/CE curriculum has become almost mandatory. A course on Parallel Computing Systems (PCS) has been designed to provide an understanding of the fundamental principles and engineering trade-offs involved in designing modern parallel computing systems, and to teach the parallel programming techniques necessary to use these machines effectively. An activity-based learning approach was adopted for teaching the course, and several parallel programming paradigms and technologies, such as OpenMP, MPI, and CUDA, were covered. The course was offered as a required course to graduate students. This paper describes the implementation of the course at Thiagarajar College of Engineering. Evaluation of the implementation reveals that, for students who have not been exposed to parallel and distributed computing, i) activity-based learning results in better knowledge gain than the traditional approach, ii) learning OpenMP was much easier than learning MPI or CUDA, iii) some parallel and distributed computing (PDC) concepts, such as false sharing, were harder to grasp than basic concepts, and iv) it is essential to introduce parallel computing in the undergraduate curriculum.
ISBN: 9781509041527 (print)
Software is finding a place in systems ranging from deeply embedded devices to the large-scale distributed systems of cloud service providers such as Amazon and Google. Due to the concurrent and distributed nature of this software, it is hard to test such systems for correctness in a foolproof manner. Explicit state model checking is an approach in which we build a model of the system and specify the properties it should satisfy. We then construct a state transition system from the model and check whether it satisfies the specified properties. There are two kinds of properties of interest: safety and liveness. In this paper, we focus on safety verification, which involves checking whether the states generated in the transition system satisfy predicate formulae specified in the form of assertions. The main problem is that the number of states in the transition system grows exponentially with the number of bits required to store the state of a model at any given point in time. As a result, the main memory available even in a server-class machine is not sufficient to model check nontrivial practical models. One way to address this problem is to use resources from a distributed collection of machines. In this paper, we adopt this approach by proposing a distributed safety property verification algorithm that uses the vertex-centric programming model.
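As an illustration of the safety check described in the abstract above, the following is a minimal single-machine sketch of explicit state exploration. It is not the distributed vertex-centric algorithm the paper proposes; the function names (`check_safety`, `successors`, `invariant`) and the toy two-counter model are hypothetical and chosen only for illustration.

```python
from collections import deque


def check_safety(initial, successors, invariant):
    """Breadth-first explicit-state search.

    Returns None if every reachable state satisfies the invariant,
    otherwise a shortest path from the initial state to a violating state.
    """
    parent = {initial: None}  # visited set, also used to rebuild the path
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# Hypothetical toy model: two counters, each incremented modulo 3.
def successors(state):
    x, y = state
    return [((x + 1) % 3, y), (x, (y + 1) % 3)]


# Hypothetical safety assertion: both counters must never be 2 at once.
def invariant(state):
    return state != (2, 2)


trace = check_safety((0, 0), successors, invariant)
print(trace)
```

The `parent` dictionary holds every visited state, which is where the memory pressure mentioned in the abstract shows up: it grows with the number of reachable states.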
ISBN: 9780769536804 (print)
A parallel computer architecture and a distributed software platform for the automation and control of general anesthesia are proposed in this paper. The system is a prototype research platform intended to support the development, simulation, and testing of new control algorithms for general anesthesia. It must be safe when used in real tests and flexible enough to allow the integration of new software modules. The system is composed of two computers, with the specific tasks of anesthesia control and process supervision. The platform makes use of TANGO, a specialized framework for distributed control systems, which provides software mechanisms useful for fulfilling the project requirements. The architecture and the set of mechanisms proposed in this paper provide a high degree of flexibility for research on control algorithms, while ensuring the safety of the whole procedure.
ISBN: 9781479955848 (print)
Every somewhat complex computer system contains bugs. As it is nearly impossible to fix all bugs in the software stack, the only remaining alternative is to make the system secure while accepting that software is vulnerable. In this work, a hardware monitor is proposed that checks the correctness of program execution using chained signatures.
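The abstract above does not specify how the signatures are chained. The following software sketch shows one possible reading, under the assumption that each basic block's signature is a hash of the previous signature concatenated with the block's contents, and that a monitor compares the running chain against precomputed values. The names `chain` and `monitor`, the seed value, and the use of SHA-256 are all hypothetical and not taken from the paper.

```python
import hashlib

SEED = b"\x00"  # assumed initial chain value


def chain(blocks, seed=SEED):
    """Precompute the expected signature after each block of a known-good run."""
    sig = seed
    sigs = []
    for block in blocks:
        sig = hashlib.sha256(sig + block).digest()
        sigs.append(sig)
    return sigs


def monitor(executed, expected, seed=SEED):
    """Return the index of the first block whose chained signature deviates,
    or None if the executed sequence matches the expected chain."""
    sig = seed
    for i, block in enumerate(executed):
        sig = hashlib.sha256(sig + block).digest()
        if i >= len(expected) or sig != expected[i]:
            return i
    return None


# Hypothetical program, with each basic block represented as bytes.
program = [b"load", b"add", b"store"]
expected = chain(program)

tampered = [b"load", b"sub", b"store"]
print(monitor(program, expected))   # unmodified run
print(monitor(tampered, expected))  # run with a modified second block
```

Because each signature depends on the previous one, a single altered block changes every later signature in the chain, so the monitor only needs to compare the current value at each step.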
ISBN: 9781424416936 (print)
Traditional parallel programming models achieve synchronization with error-prone and complex-to-debug constructs such as locks and barriers. Transactional Memory (TM) is a promising new parallel programming abstraction that replaces conventional locks with critical sections expressed as transactions. Most TM research has focused on single-address-space parallel machines, leaving the area of distributed systems unexplored. In this paper we introduce a flexible Java software TM (STM) to enable the evaluation and prototyping of TM protocols on clusters. Our STM builds on top of the ProActive framework and uses the state-of-the-art DSTM2 as its underlying transactional engine. It does not rely on software or hardware distributed shared memory for execution. It follows transactional semantics at object granularity, and its feasibility is evaluated with non-trivial TM-specific benchmarks.
ISBN: 9781728198705 (print)
Anomaly detection in distributed systems has been a fertile research area, and a range of anomaly detectors have been proposed for distributed systems. Unfortunately, there is no systematic quantitative study of the efficacy of different anomaly detectors, which is needed to reveal the deficiencies of existing detectors and to shed light on future research directions. In this paper, we investigate how various anomaly detectors behave on anomalies of different types, and why, by extensively injecting software faults into three widely used distributed systems. We use a statement-level fault injection method to observe the anomalies, characterize them, and analyze the detection results from anomaly detectors of three categories. We find that: (1) the distributed systems' own error reporting mechanisms are able to report most of the anomalies (from 82.1% to 92.8%), but they incur a high false alarm rate of 26.6%; (2) state-of-the-art anomaly detectors are able to detect the existence of anomalies with 99.08% precision and 90.60% recall, but there is still a long way to go to pinpoint the accurate location of the detected anomalies; and (3) log-based anomaly detection techniques outperform other anomaly detection techniques, but not for all anomaly types.
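For readers unfamiliar with the metrics quoted in the abstract above, the following sketch shows how precision and recall are conventionally computed from detection counts. The counts used here are hypothetical and are not the study's data; the function name `precision_recall` is likewise only illustrative.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision: share of reported anomalies that are real.
    Recall: share of real anomalies that were reported."""
    reported = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / reported if reported else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, for illustration only.
precision, recall = precision_recall(true_pos=90, false_pos=10, false_neg=30)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A detector can score high on precision while missing many anomalies, or the reverse, which is why the abstract reports the two figures separately.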
While being monitored, instrumented long-running parallel applications generate a huge amount of instrumentation data. Processing and storing this data incurs overhead and perturbs the execution. Techniques that eliminates ...
ISBN: 0818675829 (print)
In this paper we discuss the runtime support required for the parallelization of unstructured data-parallel applications on nonuniform and adaptive environments. The approach presented is reasonably general and is applicable to a wide variety of regular as well as irregular applications. We present performance results for the solution of an unstructured mesh on a cluster of heterogeneous workstations.