ISBN: 9781728174457 (print)
Data Science applications represent a growing fraction of the scientific computing workload, many of them written in Python. The goal of this paper is to compare two popular parallel programming models, namely MPI and Apache Spark, for Python-based Data Science applications. The paper presents communication and file I/O microbenchmarks to evaluate the MPI support for Python applications, and uses two application use cases from Natural Language Processing to compare the performance of the MPI and the Spark versions. Our results indicate that the MPI version shows better scalability and performance than the PySpark version of the code. On the other hand, the MPI applications are significantly larger than their PySpark counterparts, and took significantly longer to develop because some of the built-in functionality provided by Spark had to be implemented by hand.
ISBN: 9783540680673 (print)
Skeleton-based libraries are considered one of the alternatives for reducing the distance between end users and parallel architectures. We propose a general development methodology that allows for the automatic derivation of parallel programs, assuming the existence of general structures such as skeletons. We propose the introduction of a new, high-level abstraction layer that allows the user to extract problem specifications from particular skeleton languages or libraries. The result is a tool that generates parallel codes through successive transformations of this high-level specification without any loss of efficiency. We apply the technique to the automatic generation of parallel programs for Dynamic Programming problems.
ISBN: 9780769546766 (print)
Cloud computing allows Web application owners to host their applications at low operational cost and enables them to scale applications to maintain performance based on traffic load and application resource requirements. However, for multi-tier Web applications, it is difficult to automatically identify the exact location of a bottleneck and scale the appropriate resource tier accordingly, because multi-tier applications are complex and bottleneck patterns may depend on the specific workload pattern at any given time. This Ph.D. dissertation aims to explore ways to satisfy response time guarantees for multi-tier applications hosted on clouds using adaptive resource management with minimal hardware profiling and application-centric knowledge.
ISBN: 9781538637906 (print)
The hash join operator is one of the most important relational operators in database applications and a prominent research topic in the domain of parallel processing. However, to date, no consistent algorithm design guidelines for high-performance implementations on parallel platforms have been derived from the available experimental results. In this work we define a taxonomy of the parallel hash join operator landscape and categorize state-of-the-art research accordingly. Moreover, we implement and benchmark three taxonomy types: a sequential implementation on the CPU, a hybrid CPU-GPU implementation, and a fully parallel version on the GPU. The results show that (1) the hybrid CPU-GPU type outperforms the other two, showcasing the benefits of a good fit between algorithm type and hardware platform choice, (2) the poor end-to-end performance of the GPU-only type highlights the impact of GPU-specific synchronization and contention issues that appear with an unfit design choice, and (3) parallelization improves runtime by a factor of 2.2x in the end-to-end algorithm and by a factor of 83x in the join phase, and shows good scaling behavior with an increasing number of threads. This shows that the GPU is a valuable co-processor option for computation offloading in database applications. We anticipate that this classification framework will be a starting point for design decisions for parallel big data hash join operators on other heterogeneous systems.
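For readers unfamiliar with the operator, the following is a minimal sequential sketch of a hash join with its two classic phases (build and probe). It is only an illustration, loosely corresponding to the sequential CPU baseline mentioned in the abstract; the function name `hash_join`, its parameters, and the sample rows are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows (illustrative sketch, not the paper's code)."""
    # Build phase: index the (ideally smaller) relation by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: look up each row of the other relation in the hash table.
    result = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            result.append({**match, **row})
    return result


# Hypothetical sample data.
customers = [{"cid": 1, "name": "Ada"}, {"cid": 2, "name": "Bob"}]
orders = [{"oid": 10, "cid": 1}, {"oid": 11, "cid": 1}, {"oid": 12, "cid": 3}]
joined = hash_join(customers, orders, "cid", "cid")
```

The probe phase is the part that parallelizes most naturally, since each probe row can be looked up independently once the table is built.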
ISBN: 0769524346 (print)
The effectiveness of parallel/distributed application control based on predicates defined on application global states depends on the performance of an underlying Consistent Global State monitoring mechanism. Focusing on the use of Strongly Consistent Global States (SCGS), we introduce three measures of SCGS monitoring quality. They reflect the delay, state information ageing, and state inspection frequency experienced by a monitor. Using these measures, four SCGS monitoring algorithms are compared, including two novel algorithm variants. The comparison is carried out with the use of simulations. The newly introduced algorithms prove to perform much better than a standard SCGS algorithm.
ISBN: 9781538637906 (print)
The intrusion prevention system (IPS) is among the most important and popular tools in information security. It has been widely used to identify potential threats and respond to them swiftly. However, existing IPSs based on regular expression matching are time consuming and do not handle large data well, since they scan and process input linearly. To increase the parallelism of the conventional approach and improve its efficiency on large-scale data, this paper proposes a speculative parallel intrusion prevention system based on Apache Spark. In the proposed system, the input packets are speculatively divided into several chunks and then distributed to Apache Spark for parallel processing. After processing, the results are collected and evaluated to eliminate incorrect speculations. Experiments show that, for large-scale data, the proposed system markedly shortens the execution time for intrusion prevention compared with the conventional approach. These results indicate that adopting the proposed system can significantly enhance the efficiency of intrusion prevention.
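The divide, process, and validate cycle described above can be sketched with a toy automaton. The following is a sequential illustration under several assumptions that are not from the paper: the pattern is the fixed substring "ab", speculation takes the form of guessing that every chunk starts in the automaton's initial state, and all names (`step`, `run`, `speculative_match`) are hypothetical. In a real deployment the speculative phase would run on parallel workers, which this sketch does not attempt.

```python
def step(state, ch):
    """Toy DFA for 'contains the substring ab': 0 = start, 1 = saw 'a', 2 = matched."""
    if state == 2:
        return 2
    if ch == "a":
        return 1
    if state == 1 and ch == "b":
        return 2
    return 0


def run(state, chunk):
    """Run the DFA over one chunk from a given start state."""
    for ch in chunk:
        state = step(state, ch)
    return state


def speculative_match(text, n_chunks, guess=0):
    """Return (matched, reruns); reruns counts chunks whose speculation failed."""
    size = max(1, -(-len(text) // n_chunks))  # ceiling division
    chunks = [text[i:i + size] for i in range(0, len(text), size)]

    # Speculative phase: chunks are independent, so this loop could run in parallel.
    speculated = [run(guess, chunk) for chunk in chunks]

    # Validation phase: keep results whose guessed start state was right, redo the rest.
    state, reruns = 0, 0
    for chunk, end_state in zip(chunks, speculated):
        if state == guess:
            state = end_state
        else:
            state = run(state, chunk)
            reruns += 1
    return state == 2, reruns


# The match straddles the chunk boundary, so the second chunk is mis-speculated.
matched, reruns = speculative_match("xxxabxxx", n_chunks=2)
```

The benefit of the scheme depends on how often the guess is right: every failed speculation costs a sequential re-run of that chunk.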
ISBN: 9783540680673 (print)
We present a static parallel implementation of the multifrontal method to solve unsymmetric sparse linear systems on distributed-memory architectures. We target Finite Element (FE) applications where numerical pivoting can be avoided, since an implicit minimum-degree ordering based on the FE mesh topology suffices to achieve numerical stability. Our strategy is static in the sense that work distribution and communication patterns are determined in a preprocessing phase preceding the actual numerical computation. To balance the load among the processors, we devise a simple model-driven partitioning strategy to precompute a high-quality balancing for a large family of structured meshes. The resulting approach proves to be considerably more efficient than the strategies implemented by MUMPS and SuperLU_DIST, two state-of-the-art parallel multifrontal solvers.
ISBN: 9780769552071 (print)
Bit-reproducibility has many advantages in the context of high-performance computing. Besides making debugging and testing simpler and more accurate, it allows applications to be deployed on heterogeneous systems while maintaining the consistency of the computations. In this work we analyze the basic operations performed by scientific applications and identify the possible sources of non-reproducibility. In particular, we consider the tasks of evaluating transcendental functions and performing reductions using non-associative operators. We present a set of techniques to achieve reproducibility, and we propose improvements over existing algorithms to perform reproducible computations in a portable way while obtaining good performance and accuracy. By applying these techniques to more complex tasks, we show that bit-reproducibility can be achieved on a broad range of scientific applications.
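The reduction problem mentioned above arises because floating-point addition is not associative, so the order in which a parallel reduction combines partial results can change the answer. The short sketch below illustrates this with IEEE 754 double precision. It uses Python's `math.fsum`, which returns a correctly rounded sum, as a stand-in for an order-independent reduction; this is an assumption chosen for illustration and is not the technique proposed in the paper. The helper `naive_sum` and the sample values are hypothetical.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same four numbers in two different orders, as two schedules of a
# parallel reduction might present them.
order_a = [1e16, 1.0, -1e16, 1.0]
order_b = [1.0, 1.0, 1e16, -1e16]

naive_a = naive_sum(order_a)  # the first 1.0 is absorbed by 1e16 and lost
naive_b = naive_sum(order_b)  # both 1.0 values survive

# A correctly rounded sum does not depend on the order of the operands.
exact_a = math.fsum(order_a)
exact_b = math.fsum(order_b)
```

An explicit loop is used for the naive sum because the built-in `sum` may apply compensated summation to floats in recent Python versions, which would hide the effect.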
ISBN: 0780378407 (print)
Using the Parallel Geographic Image Processing system (ParGIP), flood disasters can be monitored and evaluated in a timely manner. By using ParGIP to establish a background database and process remote sensing (RS) images, disaster losses can be obtained through an overlay operation within 24 hours. According to an experiment in the Poyang Lake region, this method improves the speed and efficiency of flood disaster monitoring and evaluation several times over.
ISBN: 9781509021406 (print)
Parallel design patterns have been developed to help programmers efficiently design and implement parallel applications. However, identifying a suitable parallel pattern for a specific code region in a sequential application is a difficult task. Transforming an application according to the support structures applicable to these parallel patterns is also very challenging. In this paper, we present a novel approach to automatically find parallel patterns in the algorithm structure design space of sequential applications. In our approach, we classify code blocks in a region according to the appropriate support structure of the detected pattern. This classification eases the transformation of a sequential application into its parallel version. We evaluated our approach on 17 applications from four different benchmark suites. Our method identified suitable algorithm structure patterns in the sequential applications. We confirmed our results by comparing them with the existing parallel versions of these applications. We also implemented the patterns we detected in cases where parallel implementations were not available and achieved speedups of up to 14x.