ISBN: 9781728116440 (print)
Parallel programmers need high-level parallel programming tools that reduce the effort of efficiently parallelizing their applications. Parallel programming based on parallel patterns has recently received renewed attention thanks to the patterns' clear functional and parallel semantics. In this work, we propose a synergy between the well-known Actor-based programming model and the pattern-based parallelization methodology. We present our preliminary results in that direction, discussing and assessing the implementation of the Map parallel pattern by using an Actor-based software accelerator abstraction that seamlessly integrates within the C++ Actor Framework (CAF). The results obtained on the Intel Xeon Phi KNL platform demonstrate good performance achieved with negligible programming effort.
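As a rough illustration of the Map parallel pattern named in this abstract, the sketch below partitions an input among a fixed set of workers and reassembles the results in order. It uses Python threads, not actors or CAF, and the name `parallel_map` and its `workers` parameter are hypothetical; it is not the accelerator abstraction the authors describe.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, items, workers=4):
    """Apply func to every item, splitting the work across `workers` threads.

    Hypothetical helper for illustration only; the pool of workers stands in
    for the "accelerator" that receives chunks of the input.
    """
    items = list(items)
    if not items:
        return []
    # Partition the input into at most `workers` contiguous chunks.
    chunk = (len(items) + workers - 1) // workers
    parts = [items[i:i + chunk] for i in range(0, len(items), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of `parts`, so output order
        # matches input order.
        mapped = pool.map(lambda part: [func(x) for x in part], parts)
        return [y for part in mapped for y in part]
```

Because Map applies the same independent function to every element, the only coordination needed is the partitioning and the ordered gather, which is what makes the pattern easy to hide behind a single call.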
ISBN: 9781728129334; 9781728129327 (print)
We propose a fully parallel, linear-time algorithm for soft body simulation that addresses three principal issues: visual quality, performance, and ease of use. It combines precomputed collision-result look-up data with the basic shape matching approach. Because the data-driven shape matching approach uses only user-generated precomputed collision results, deformation results cannot be unexpected, which improves visual quality and ease of use. Using these look-up data also opens ways to improve performance. In our tests, we achieved linear speedup with the processor's core count.
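The abstract builds on the basic shape matching approach. A minimal 2D sketch of generic shape matching follows, assuming equal point masses: it fits the rest shape rigidly to the current points and pulls each point toward its fitted goal position. The function name `shape_match_step` and the `stiffness` parameter are assumptions for illustration, and the precomputed collision look-up data described by the authors is not modeled here.

```python
import math


def shape_match_step(rest, current, stiffness=1.0):
    """Pull 2D points toward the best rigid fit of the rest shape.

    Hypothetical illustration with equal masses; `rest` and `current` are
    equal-length lists of (x, y) tuples.
    """
    n = len(rest)
    # Centroids of the rest and current configurations.
    c0 = (sum(x for x, _ in rest) / n, sum(y for _, y in rest) / n)
    c = (sum(x for x, _ in current) / n, sum(y for _, y in current) / n)
    q = [(x - c0[0], y - c0[1]) for x, y in rest]
    p = [(x - c[0], y - c[1]) for x, y in current]
    # In 2D the least-squares rotation angle has a closed form.
    dot = sum(qx * px + qy * py for (qx, qy), (px, py) in zip(q, p))
    cross = sum(qx * py - qy * px for (qx, qy), (px, py) in zip(q, p))
    theta = math.atan2(cross, dot)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for (qx, qy), (x, y) in zip(q, current):
        # Goal position: rotated rest offset placed at the current centroid.
        gx = cos_t * qx - sin_t * qy + c[0]
        gy = sin_t * qx + cos_t * qy + c[1]
        out.append((x + stiffness * (gx - x), y + stiffness * (gy - y)))
    return out
```

Each point's update depends only on the shared centroid and angle, so after those two reductions the per-point work is independent and linear in the number of points.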
ISBN: 9783030105495; 9783030105488 (print)
Stream processing applications have become a representative workload in current computing systems. A significant part of these applications demands parallelism to increase performance. However, programmers often face a trade-off between coding productivity and performance when introducing parallelism. SPar was created to balance this trade-off for application programmers by using the C++11 attribute annotation mechanism. In SPar and other programming frameworks for stream processing applications, manually defining the number of replicas to be used for the stream operators is a challenge. In addition, several stream processing applications require low latency. We noted that explicit latency requirements are poorly considered in state-of-the-art parallel programming frameworks. Since there is a direct relationship between the number of replicas and the latency of the application, in this work we propose an autonomic and adaptive strategy to choose the proper number of replicas in SPar to address latency constraints. We experimentally evaluated our implemented strategy on a real-world application, demonstrating that it can provide higher abstraction levels while automatically managing the latency.
ISBN: 9783030166816 (digital); 9783030166816; 9783030166809 (print)
This paper presents a new model-based Gaussian clustering method and defines new optimization criteria for model-based clustering, which are used as fitness functions in a genetic algorithm. These optimization criteria are based on different properties of covariance matrices. The proposed model-based Gaussian clustering method is compared with the well-known K-Means method solved by a genetic algorithm or by Particle Swarm Optimization. Our method achieves higher similarity between the real classification and the computed clustering results on all six presented real-world datasets. Because of the high computational requirements of the methods used, we have focused on their parallelization. Given the chosen parallel computer architecture, we have combined the MPI and OpenMP programming interfaces. We show that parallelization of the proposed method is very effective and scalable on many execution units.
ISBN: 9781450362290 (digital); 9781450362290 (print)
We propose Slim Graph: the first programming model and framework for practical lossy graph compression that facilitates high-performance approximate graph processing, storage, and analytics. Slim Graph enables the developer to express numerous compression schemes using small and programmable compression kernels that can access and modify local parts of input graphs. Such kernels are executed in parallel by the underlying engine, isolating developers from complexities of parallel programming. Our kernels implement novel graph compression schemes that preserve numerous graph properties, for example connected components, minimum spanning trees, or graph spectra. Finally, Slim Graph uses statistical divergences and other metrics to analyze the accuracy of lossy graph compression. We illustrate both theoretically and empirically that Slim Graph accelerates numerous graph algorithms, reduces storage used by graph datasets, and ensures high accuracy of results. Slim Graph may become the common ground for developing, executing, and analyzing emerging lossy graph compression schemes.
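To make the idea of a small, programmable compression kernel concrete, here is a minimal sketch of an edge kernel that drops each edge independently with probability `p`. The names `compress_edges` and `uniform_sampling_kernel` are hypothetical and do not reflect Slim Graph's actual API; the loop is sequential here, whereas the abstract describes an engine that runs kernels in parallel.

```python
import random


def uniform_sampling_kernel(p):
    """Build a kernel that removes an edge with probability p (assumed scheme)."""
    def kernel(edge, rng):
        # Return True to keep the edge, False to drop it.
        return rng.random() >= p
    return kernel


def compress_edges(edges, kernel, seed=0):
    """Apply a per-edge kernel to every edge and keep the survivors.

    Each kernel call sees only one edge, so calls are independent and could
    be distributed across workers by a real engine.
    """
    rng = random.Random(seed)
    return [e for e in edges if kernel(e, rng)]
```

A kernel that only inspects a local part of the graph, such as one edge, leaves scheduling entirely to the engine, which is the separation of concerns the abstract points to.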
Data races are notorious bugs. They introduce non-determinism in program behavior and complicate program semantics, making it challenging to debug parallel programs. To make parallel programming easier, efficient data race detection has been a research topic for decades. However, existing data race detectors either sacrifice precision or incur high overhead, limiting their use in real-world applications and scenarios. This dissertation proposes approaches to improve the performance of dynamic data race detection without undermining precision, by identifying and removing metadata redundancy dynamically. This dissertation also explores ways to make it practical to detect data races dynamically for GPU programs, whose programming and execution model differs markedly from that of CPU workloads. Further, this dissertation shows how the structured synchronization model in GPU programs can simplify the algorithm design of data race detection for GPU, and how the unique patterns in GPU workloads enable an efficient implementation of the algorithm, yielding a high-performance dynamic data race detector for GPU programs.
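For readers unfamiliar with dynamic race detection, the sketch below shows a generic happens-before detector based on vector clocks, run over an explicit trace of events. It is a textbook-style toy under stated assumptions (lock-only synchronization, a hypothetical `RaceDetector` class), not the dissertation's algorithm; the per-variable records it keeps are the kind of metadata a detector must store and update on every access.

```python
from collections import defaultdict


class RaceDetector:
    """Toy happens-before race detector over an explicit event trace."""

    def __init__(self):
        self.clocks = {}       # thread -> vector clock (thread -> int)
        self.lock_clocks = {}  # lock -> vector clock stored at last release
        self.last_write = {}   # variable -> (thread, time) of last write
        self.last_reads = defaultdict(dict)  # variable -> {thread: time}
        self.races = []        # (variable, earlier thread, later thread)

    def _vc(self, t):
        # Each thread starts at time 1 in its own component.
        return self.clocks.setdefault(t, defaultdict(int, {t: 1}))

    def _check(self, var, t, prior):
        u, time = prior
        # The prior access races unless t's clock has already seen it.
        if u != t and self._vc(t)[u] < time:
            self.races.append((var, u, t))

    def acquire(self, t, lock):
        vc = self._vc(t)
        # Join with the clock published by the last release of this lock.
        for u, time in self.lock_clocks.get(lock, {}).items():
            vc[u] = max(vc[u], time)

    def release(self, t, lock):
        vc = self._vc(t)
        self.lock_clocks[lock] = dict(vc)
        vc[t] += 1

    def read(self, t, var):
        if var in self.last_write:
            self._check(var, t, self.last_write[var])
        self.last_reads[var][t] = self._vc(t)[t]

    def write(self, t, var):
        if var in self.last_write:
            self._check(var, t, self.last_write[var])
        for u, time in self.last_reads[var].items():
            self._check(var, t, (u, time))
        self.last_write[var] = (t, self._vc(t)[t])
        self.last_reads[var] = {}
```

Feeding it two unsynchronized writes to the same variable from different threads records one race, while the same writes wrapped in acquire and release of a common lock record none.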
ISBN: 9781450371964 (print)
Parallel programming methodologies are fundamentally dissimilar to those of conventional programming, and software developers without the requisite skillset often find it difficult to adapt to these new methods. This is particularly true for parallel programming in a distributed address space, which is necessary for any meaningful degree of scalability. As such, an approach that combines a more intuitive interface with excellent performance within the distributed address space model is desired. In this work, we present our initial API design and implementation as well as the underlying algorithms for a collective communication library built for the Extended Base Global Address Space (xBGAS) extension to the RISC-V microarchitecture. Our runtime library is designed to enact the Partitioned Global Address Space (PGAS) model in an attempt to alleviate the difficulty associated with traditional distributed address space programming, while the underlying collective implementation is formulated to preserve, and even improve, performance over traditional solutions.
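The abstract does not say which algorithms underlie the collectives. As one common example of how a collective can be structured, the sketch below computes the send schedule of a binomial-tree broadcast rooted at rank 0; the function name `binomial_broadcast_schedule` is hypothetical, and nothing here is taken from the xBGAS library itself.

```python
def binomial_broadcast_schedule(n):
    """Return, per round, the (source, destination) sends of a binomial-tree
    broadcast from rank 0 to ranks 0..n-1 (illustrative assumption only)."""
    rounds = []
    have = 1  # ranks 0..have-1 already hold the data
    while have < n:
        # Every rank that holds the data forwards it to the rank `have` away.
        sends = [(src, src + have) for src in range(have) if src + have < n]
        rounds.append(sends)
        have *= 2
    return rounds
```

The set of ranks holding the data doubles each round, so the broadcast finishes in a number of rounds logarithmic in the number of ranks instead of the linear number needed when the root sends to every rank in turn.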
ISBN: 9781728107011 (print)
This paper presents a series of results on how image processing algorithms and a shortest path algorithm influence the percentage of CPU utilization. From fractal generation to image analysis and compression of large images, and indeed for any algorithm, parallel programming is today one of the most important factors for low power consumption and a well-balanced load on the processing unit. A good balance between processing capacity, the algorithm, and processing time is hard to obtain because of the number of calculations involved. In this work, a series of results and comparisons between sequential and parallel algorithms is presented. Test scenarios for different cases are also presented that confirm or refute the chosen parallel strategy. The paper combines the results obtained with the proposed algorithms with reports and comparisons of the proposed parallelization methods, and provides conclusions and ideas for future research in this widely applicable field.
NASA Technical Reports Server (NTRS) 20050210018: Enabling Requirements-Based Programming for Highly-Dependable Complex Parallel and Distributed Systems by NASA Technical Reports Server (NTRS); published by
NASA Technical Reports Server (NTRS) 19850022344: The Blaze Language: A Parallel Language for Scientific Programming by NASA Technical Reports Server (NTRS); published by