Memory forensics uses volatile digital artifacts as evidence of criminal activity. Analyzing captured memory dumps for volatile data requires time and effort. This paper studies the utilization of parallel progra...
Multiple Sclerosis (MS) is a neurodegenerative disease that involves a complex sequence of events in distinct spatiotemporal scales for which the cause is not completely understood. The representation of such biologic...
In light of previous endeavors and trends in the realm of parallel programming, HPPython emerges as an essential superset that enhances the accessibility of parallel programming for developers, facilitating scalabilit...
ISBN (Print): 9781450397902
Turing's model is a reaction-diffusion model capable of forming skin patterns on animals. In this paper, Turing's model, with the improvements by Barrio et al. [12], was investigated in a parallel programming setting to show its speedup. The parallel implementation sped the process up by as much as 8.9 times while retaining the quality of the result, compared to the traditional sequential implementation.
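The reaction-diffusion dynamics behind such patterns can be sketched in a few lines of NumPy. This is a generic two-species system using Gray-Scott kinetics as a stand-in; the actual Barrio et al. equations and parameters are not reproduced here, so `laplacian`, `step`, and all coefficients below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def laplacian(Z):
    # Five-point stencil with periodic boundaries
    return (np.roll(Z, 1, 0) + np.roll(Z, -1, 0)
            + np.roll(Z, 1, 1) + np.roll(Z, -1, 1) - 4 * Z)

def step(U, V, Du=0.16, Dv=0.08, F=0.035, k=0.065, dt=1.0):
    # One explicit Euler update of a two-species reaction-diffusion system
    # (Gray-Scott kinetics as a stand-in for the Barrio et al. model)
    UVV = U * V * V
    U_new = U + dt * (Du * laplacian(U) - UVV + F * (1 - U))
    V_new = V + dt * (Dv * laplacian(V) + UVV - (F + k) * V)
    return U_new, V_new
```

Iterating `step` from a slightly perturbed initial state is what eventually produces spot and stripe patterns; since each grid point's update depends only on its neighbors, the stencil parallelizes naturally, which is the property the paper exploits.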
ISBN (Print): 9781450390705
Teaching parallel programming to undergraduate CS students is a challenging task as many of the concepts are highly abstract and difficult to grasp. OpenMP is often used to simplify parallelization of programs by allowing one to incrementally parallelize using concise and expressive directives. Unfortunately, OpenMP is not available in Java natively. Basic support for OpenMP-like directives can, however, be obtained in Java using the Pyjama compiler and runtime. I report on my experience introducing parallel programming in Java with Pyjama in a small Data Structures class. The material is presented to students in the form of parallel programming patternlets embedded in an interactive notebook with which students can experiment. Formative and summative assessments of the module's effectiveness are performed. This pilot run of the module yielded mixed results, yet valuable insight was gained regarding possible future approaches.
ISBN (Digital): 9783031238215
ISBN (Print): 9783031238208; 9783031238215
The Python programming language has established itself as a popular alternative for implementing scientific computing workflows. Its massive adoption across a wide spectrum of disciplines has created a strong community that develops tools for solving complex problems in science and engineering. In particular, there are several parallel programming libraries for Python codes that target multicore processors. We compare the performance and scalability of a subset of three popular libraries (Multiprocessing, PyMP, and Torcpy). We use the Particle-in-cell (PIC) method as a benchmark. This method is an attractive option for understanding physical phenomena, especially in plasma physics. A pre-existing PIC code implementation was modified to integrate Multiprocessing, PyMP, and Torcpy. The three tools were tested on a manycore and on a multicore processor by running different problem sizes. The results consistently indicate that PyMP has the best performance, Multiprocessing shows similar behavior but with longer execution times, and Torcpy does not scale properly when increasing the number of workers. Finally, a just-in-time (JIT) alternative was studied by using Numba, showing execution time reductions of up to 43%.
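The shape of such a comparison can be sketched with the standard-library `multiprocessing` module: the same per-particle kernel is dispatched either serially or across a worker pool, and the two are timed against each other. Here `particle_push` and `push_all` are toy stand-ins for one PIC particle update, not the authors' code, and the PyMP/Torcpy variants are not shown.

```python
from multiprocessing import Pool

def particle_push(args):
    # Toy stand-in for one PIC particle update (not the paper's code):
    # accelerate in a uniform field E, then move.
    x, v, E, dt = args
    v = v + E * dt
    x = x + v * dt
    return x, v

def push_all(particles, E=1.0, dt=0.1, workers=None):
    # particles: list of (position, velocity) pairs.
    # workers=None runs serially; an integer uses a process pool.
    tasks = [(x, v, E, dt) for x, v in particles]
    if workers:
        with Pool(workers) as pool:
            return pool.map(particle_push, tasks)
    return [particle_push(t) for t in tasks]
```

Because each particle update is independent, this kind of loop is embarrassingly parallel; the overhead the paper measures comes largely from process startup and the pickling of task data between workers.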
ISBN (Digital): 9781665488020
ISBN (Print): 9781665488020
Developing performant parallel applications for the distributed environment is challenging and requires expertise in both the HPC system and the application domain. We have developed a C++-based framework called APPFIS that hides the system complexities by providing an easy-to-use interface for developing performance-portable structured grid-based stencil applications. APPFIS's user interface is hardware agnostic and provides partitioning, code optimization, and automatic communication for stencil applications in distributed HPC environments. In addition, it offers straightforward APIs for utilizing multiple GPU accelerators, shared memory, and node-level parallelization with automatic optimization for computation and communication overlapping. We have tested the functionality and performance of APPFIS using several applications on three platforms (Stampede2 at Texas Advanced Computing Center, Bridges-2 at Pittsburgh Supercomputing Center, and Summit Supercomputer at Oak Ridge National Laboratory). Experimental results show comparable performance to hand-tuned code with excellent strong and weak scalability up to 4096 CPUs and 384 GPUs.
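APPFIS itself is a C++ framework and its API is not reproduced here, but the kind of structured-grid stencil computation it targets can be sketched in a few lines of NumPy; `jacobi_step` below is an illustrative 4-point Jacobi update, an assumption for this sketch rather than APPFIS code.

```python
import numpy as np

def jacobi_step(grid):
    # One 4-point stencil update on the interior of a 2D grid;
    # the boundary row/column is held fixed (Dirichlet condition).
    new = grid.copy()
    new[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1]
                              + grid[1:-1, :-2] + grid[1:-1, 2:])
    return new
```

In a distributed setting, the grid is partitioned across ranks and each step requires exchanging one-cell-wide halo regions with neighbors; automating that partitioning and communication (and overlapping it with computation) is precisely the boilerplate a framework like APPFIS takes off the developer's hands.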
The demand for processing power has grown over the years; this demand led to the parallel approach, which links multiple computers together to jointly increase both speed and efficiency. The par...
The NAS Parallel Benchmarks (NPB), originally implemented mostly in Fortran, is a consolidated suite containing several benchmarks extracted from Computational Fluid Dynamics (CFD) models. The benchmark suite has important characteristics such as intensive memory communications, complex data dependencies, different memory access patterns, and hardware component/sub-system overload. Parallel programming APIs, libraries, and frameworks that are written in C++, as well as new optimizations and parallel processing techniques, can benefit if NPB is made fully available in this programming language. In this paper we present NPB-CPP, a fully C++-translated version of NPB consisting of all the NPB kernels and pseudo-applications, developed using the OpenMP, Intel TBB, and FastFlow parallel frameworks for multicores. The design of NPB-CPP leverages the structured parallel programming methodology (essentially based on parallel design patterns). We show the structure of each benchmark application in terms of the composition of a few patterns (notably the Map and MapReduce constructs) provided by the selected C++ frameworks. The experimental evaluation shows the accuracy of NPB-CPP with respect to the original NPB source code. Furthermore, we carefully evaluate the parallel performance on three multi-core systems (Intel, IBM Power, and AMD) with different C++ compilers (gcc, icc, and clang), discussing the performance differences in order to give researchers useful insights into choosing the best parallel programming framework for a given type of problem. (C) 2021 Elsevier B.V. All rights reserved.
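The Map and MapReduce patterns the abstract refers to can be illustrated in a few lines of Python (the paper's frameworks are C++; `square`, `map_pattern`, and `map_reduce_pattern` are hypothetical names used only for this sketch):

```python
from functools import reduce
from multiprocessing import Pool

def square(x):
    return x * x

def map_pattern(data, workers=None):
    # Map: apply the same pure function to each element independently.
    # workers=None runs serially; an integer uses a process pool.
    if workers:
        with Pool(workers) as pool:
            return pool.map(square, data)
    return list(map(square, data))

def map_reduce_pattern(data, workers=None):
    # MapReduce: map, then combine partial results with an
    # associative operation (here, addition).
    return reduce(lambda a, b: a + b, map_pattern(data, workers), 0)
```

Expressing a benchmark as a composition of such patterns is what lets the same structure be instantiated on top of OpenMP, TBB, or FastFlow: only the pattern implementation changes, not the application logic.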
This paper presents EasyPAP, an easy-to-use programming environment designed to help students to learn parallel programming. EasyPAP features a wide range of 2D computation kernels that the students are invited to parallelize using Pthreads, OpenMP, OpenCL or MPI. Execution of kernels can be interactively visualized, and powerful monitoring tools allow students to observe both the scheduling of computations and the assignment of 2D tiles to threads/processes. By focusing on algorithms and data distribution, students can experiment with diverse code variants and tune multiple parameters, resulting in richer problem exploration and faster progress towards efficient solutions. We present selected lab assignments which illustrate how EasyPAP improves the way students explore parallel programming. (C) 2021 Elsevier Inc. All rights reserved.