Recent technological advances in the computer hardware and software industry have resulted in a wide range of single- and multi-core processors, operating systems, compilers and applications, with the ultimate goal to increase...
We propose a set of building blocks (RISC-pb2l) suitable for building high-level structured parallel programming frameworks. The set is designed following a RISC approach. RISC-pb2l is architecture independent, but the implementation of the different blocks may be specialized to make the best use of the target architecture's peculiarities. A number of optimizations may be designed that transform compositions of basic building blocks into more efficient compositions, so that parallel application efficiency may be derived by construction rather than by debugging.
This paper introduces an aspect-oriented library aimed at supporting efficient execution of Java applications on multi-core systems. The library is coded in AspectJ and provides a set of parallel programming abstractions that mimic the OpenMP standard. The library supports the migration of sequential Java code to multi-core machines with minor changes to the base code, intrinsically supports the sequential semantics of OpenMP and provides improved integration with object-oriented mechanisms. The aspect-oriented nature of the library enables the encapsulation of parallelism-related code into well-defined modules. The approach makes the parallelisation and the maintenance of large-scale Java applications more manageable. Furthermore, the library can be used with plain Java annotations and can be easily extended with application-specific mechanisms in order to tune application performance. The library achieves competitive performance in comparison with traditional parallel programming in Java, and enhances programmability, since it allows independent development of parallelism-related code.
ISBN: 9781665429825 (print)
Cloud federation emerged to extend the resources available between different interconnected cloud providers for transparent and unlimited availability to the end-user. Cloud orchestration platforms have become a way to centralize demands for high computational power in applications such as Bioinformatics workflows. The large quantity of resources available among several providers in a federation makes it challenging to choose a suitable one for particular workflows. This work proposes a Machine Learning Resource Prediction Service called sPCRAM. sPCRAM uses a machine learning model combined with a GRASP metaheuristic to transparently and adequately dimension the resources, determining the monetary cost and the runtime before the workflow execution. sPCRAM interactively allows the user to set the execution type and to calibrate time and cost. Such executions can have, for example, a long duration and a low cost, or a shorter duration and a higher cost. The results demonstrate that sPCRAM can appropriately estimate runtime and cost for cloud federation resources on average 97.70% faster than the brute force technique for resource selection.
In this paper, we propose a text baseline detection method. The proposed method is based on a strategy of object separation in a binary image and consists of three steps. The first step produces a binary image using Sobel edge detection and mathematical morphology operations to obtain an approximate text area from an ordinary document image. In the second step, line segments that are candidates for text baselines are extracted by a parallel level-set method. The last step fits a line to each segment with parallel random sample consensus (RANSAC) and selects appropriate lines automatically. For parallel computation, we use OpenMP, the standard API for shared-memory parallel programming in C/C++.
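The line-fitting step of the abstract above can be illustrated with a minimal sketch. It is sequential and written in Python, whereas the paper uses a parallel variant with OpenMP in C/C++; the function name `ransac_line`, the parameters `n_iters`, `tol` and `seed`, and the sample points are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def ransac_line(points, n_iters=200, tol=1.0, seed=0):
    """Fit a line a*x + b*y + c = 0 to 2D points with basic RANSAC.

    Hypothetical helper: returns the normalized line with the most
    inliers, together with those inliers.
    """
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        # Sample two points and build the line passing through them.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:
            continue  # the two samples coincide, so no line is defined
        c = -(a * x1 + b * y1)
        # Inliers are points whose perpendicular distance is within tol.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) / norm <= tol]
        if len(inliers) > len(best_inliers):
            best_line = (a / norm, b / norm, c / norm)
            best_inliers = inliers
    return best_line, best_inliers


# Assumed test data: 20 points on y = 0.5*x + 2 plus three outliers.
pts = [(x, 0.5 * x + 2) for x in range(20)] + [(3, 40), (10, -30), (15, 60)]
line, inliers = ransac_line(pts)
```

Because each iteration is independent of the others, the loop is a natural candidate for parallel execution, which is presumably what a parallel RANSAC exploits; the abstract does not describe how the work is actually divided.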
ISBN: 9781728151250 (print)
High-performance embedded computing is developing rapidly since applications in most domains require a large and increasing amount of computing power. On the hardware side, this requirement is met by the introduction of heterogeneous systems, with highly parallel accelerators that are designed to take care of the computation-heavy parts of an application. There is today a plethora of accelerator architectures, including GPUs, many-cores, FPGAs, and domain-specific architectures such as AI accelerators. They all have their own programming models, which are typically complex, low-level, and involve explicit parallelism. This yields error-prone software that puts functional safety at risk, which is unacceptable for safety-critical embedded applications. In this position paper we argue that high-level executable modelling languages tailored for parallel computing can help in the software design for high-performance embedded applications. In particular, we consider the data-parallel model to be a suitable candidate, since it allows very abstract parallel algorithm specifications free from race conditions. Moreover, we promote the Action Language for fUML (and thereby fUML) as a suitable host language.
With the growing search for computational models in which parallelism is expressed naturally, some paradigms arise as options for the current generation of computers. In this context, dynamic dataflow and Gamma (General Abstract Model for Multiset mAnipulation) emerge as interesting computational model choices. In the dynamic dataflow model, operations are performed as soon as their associated operands are available, without relying on a program counter to dictate the execution order of instructions. The Gamma paradigm is based on a parallel multiset rewriting scheme. It provides a nondeterministic execution model inspired by an abstract chemical machine metaphor, where operations are formulated as reactions that occur freely among matching elements belonging to the multiset. In this work, equivalence relations between the dynamic dataflow and Gamma paradigms are exposed and explored, and methods to convert from the dataflow to the Gamma paradigm and vice versa are provided. It is shown that vertices and edges of a dynamic dataflow graph can correspond, respectively, to reactions and multiset elements in the Gamma paradigm. This work gives the scientific community the possibility of taking advantage of both parallel programming models, offering additional versatility to researchers and developers.
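The multiset rewriting idea behind Gamma can be illustrated with a small sequential sketch. It is not the conversion method of the paper: the function `gamma`, its parameters, and the two example reactions are hypothetical names chosen here, and the nondeterministic choice of reacting elements is simulated with a seeded random generator.

```python
import random
from itertools import permutations


def gamma(multiset, condition, action, seed=0):
    """Apply one binary reaction until no pair of elements can react.

    condition(x, y) says whether x and y may react; action(x, y) returns
    the products that replace them. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    pool = list(multiset)
    while True:
        # All ordered pairs of distinct positions that satisfy the condition.
        candidates = [(i, j)
                      for i, j in permutations(range(len(pool)), 2)
                      if condition(pool[i], pool[j])]
        if not candidates:
            return pool  # stable state: no reaction is enabled
        # Nondeterministic choice of which enabled pair reacts next.
        i, j = rng.choice(candidates)
        products = list(action(pool[i], pool[j]))
        # Remove the consumed elements and add the products.
        pool = [v for k, v in enumerate(pool) if k not in (i, j)] + products


# Maximum: any pair with x <= y reacts and leaves only y.
largest = gamma([3, 1, 4, 1, 5, 9, 2, 6], lambda x, y: x <= y, lambda x, y: [y])

# Sum: any pair reacts and is replaced by its sum.
total = gamma([3, 1, 4, 1, 5], lambda x, y: True, lambda x, y: [x + y])
```

In both examples the final multiset does not depend on the order in which pairs react, which is why reactions on disjoint pairs could fire in parallel. Read against the abstract's correspondence, each reaction plays the role of a dataflow vertex and each multiset element the role of a value travelling along an edge.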
ISBN: 9781728181059 (print)
Modern multi-core servers are powerful enough to process multi-gigabit live packet streams on the network data plane. However, in most cases network programmers must build their applications from scratch, by implementing both the interfaces towards the lower hardware level and the proper mechanisms for parallel programming. Data Stream Processing (DaSP) frameworks have recently emerged as promising approaches to overcome the above issues and to let programmers simply focus on the logic of the application being developed. However, DaSP platforms are generally not designed for the networking domain, in terms of both performance and functions. In this paper, we selected the WindFlow DaSP framework and built suitable extensions to attach multiple (accelerated) packet sources of data to it. We then implemented a simple monitoring application on top of WindFlow and carried out stress tests with synthetic and real traffic. The results show that performance scales linearly with the number of processing cores, so that the application was able to process the whole amount of live data at rates up to nearly 20 Gbps.
ISBN: 9781665416351 (print)
This research presents some of the critical information required to understand the concept of parallel programming and the use of OpenMP in parallel programming. Parallelism is the preferred tool for speeding up an algorithm, as demonstrated by the evolution of computing architectures (multi-core and many-core) towards a greater number of processing cores. The report focuses on the OpenMP parallel programming model and examines its implementation and features. The OpenMP model is increasingly preferred for its ability to deliver real-time processing, thereby meeting system performance requirements. The report also studies the use of OpenMP to enhance the efficiency of 3D discontinuous deformation analysis (3D-DDA) for expansive simulation using parallel block Jacobi (BJ) and preconditioned conjugate gradient (PCG) algorithms. A lack of data synchronization in parallel programming makes a system more prone to programming errors, since the parallel environment is much more complicated than it appears. The studies performed highlight how synchronization is managed using the OpenMP model. In the field of biometrics, the most important issue faced in DNA sequencing and pattern discovery is locating the longest common subsequence (LCS) among sequences. To identify the LCS of DNA sequences, we look into CPU-based solutions built with OpenMP tools, which offer major improvements in processing speed, capital, and ubiquity, and we discuss the results of the analysis.
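For readers unfamiliar with the LCS problem mentioned above, the following is a minimal sequential sketch of the textbook dynamic-programming solution, written in Python. It is not the OpenMP implementation that the abstract surveys; the function name `lcs` and the sample strings are assumptions made for illustration.

```python
def lcs(a, b):
    """Return one longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to recover one subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


# Assumed sample sequences over the DNA alphabet.
result = lcs("ACCGGTCGAGTG", "GTCGTTCGGAATGC")
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on the same anti-diagonal are independent of one another. That independence is the usual starting point for parallelizing the table fill, although the abstract does not say which decomposition the surveyed solutions use.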
Efforts to support high performance computing (HPC) applications' requirements in the context of cloud computing have motivated us to design HPC Shelf, a cloud computing services platform to build and deploy large-scale parallel computing systems. We introduce Alite, the contextual contract system of HPC Shelf, to select component implementations according to requirements of the host application, target parallel computing platform characteristics (e.g., clusters and MPPs), quality of service (QoS) properties, and cost restrictions. It is evaluated through a small-scale case study employing two complementary component-based frameworks. The first one aims to represent components that implement linear algebra computations based on the BLAS interface. In turn, the second one aims to represent parallel computing platforms on the IaaS cloud offered by Amazon EC2 Service.