In this paper, we propose built-in functions for the parallel programming model in SMYLE OpenCL that extend the original OpenCL semantics with our system's own limitations and interpretation for embedded many-core architectures. On an FPGA-based platform for evaluating the embedded many-core architecture SMYLEref, the data-parallel and task-parallel programming models supported by the OpenCL execution model are implemented. A high-level API based on an OpenCL framework, named SMYLE OpenCL, has then been developed. Among the built-in functions, the math, integer, common, geometric, and relational functions, together with the barrier function from the synchronization functions, are implemented for OpenCL version 1.2. Routines for floating-point emulation are also developed in order to compute the ranges of float and double on device cores on which no OS is installed. This paper describes the design and implementation of the built-in functions for the parallel programming model in SMYLE OpenCL and shows how the implementation issues are solved. High-speed, low-power performance compared with a former technology is demonstrated using parallel benchmark applications.
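As a rough illustration of what floating-point emulation involves, the sketch below decodes an IEEE 754 single-precision bit pattern by extracting its sign, exponent, and fraction fields with integer operations. It is not the SMYLE OpenCL routine: the function name `decode_float32` and the two constants are hypothetical, and the final scaling uses Python floats for brevity, whereas a real soft-float routine on a core without an FPU would stay in integer arithmetic throughout.

```python
import struct

# Bit patterns of the largest finite and smallest normal binary32 values.
FLT_MAX_BITS = 0x7F7FFFFF
FLT_MIN_BITS = 0x00800000


def decode_float32(bits: int) -> float:
    """Decode an IEEE 754 binary32 bit pattern via integer field extraction."""
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    if exponent == 0xFF:
        # All-ones exponent encodes infinity (fraction 0) or NaN.
        return sign * float("inf") if fraction == 0 else float("nan")
    if exponent == 0:
        # Subnormal: no implicit leading 1, fixed scale of 2^-149.
        return sign * fraction * 2.0 ** (-149)
    # Normal: (2^23 + fraction) * 2^(exponent - 127 - 23).
    return sign * (0x800000 | fraction) * 2.0 ** (exponent - 150)


def native_float32(bits: int) -> float:
    """Reference decoding through the host's own float support."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Decoding `FLT_MAX_BITS` and `FLT_MIN_BITS` this way gives the two ends of the normal range of `float`; the same field layout with an 11-bit exponent and a 52-bit fraction applies to `double`.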
Traditional programming languages for parallel computer systems do not efficiently separate the algorithm description from the details of its hardware implementation. As a result, a significant code modification is re...
The XCS classifier system (XCS) constitutes the most deeply investigated evolutionary rule-based machine learning algorithm. Due to its online learning nature, it is not as computationally intense as deep learning app...
The British Geological Survey's global geomagnetic model, Model of the Earth's Magnetic Environment (MEME), is an important tool for calculating the strength and direction of the Earth's magnetic field, wh...
Recent technological advances in computer hardware and software industry resulted in a wide range of single- and multi-core processors, operating systems, compilers and applications, with the ultimate goal to increase...
We propose a set of building blocks (RISC-pb2l) suitable for building high-level structured parallel programming frameworks. The set is designed following a RISC approach. RISC-pb2l is architecture independent, but the implementation of the different blocks may be specialized to make the best use of the target architecture's peculiarities. A number of optimizations may be designed that transform compositions of basic building blocks into more efficient compositions, such that parallel application efficiency may be derived by construction rather than by debugging.
This paper introduces an aspect-oriented library that supports efficient execution of Java applications on multi-core systems. The library is coded in AspectJ and provides a set of parallel programming abstractions that mimic the OpenMP standard. The library supports the migration of sequential Java code to multi-core machines with minor changes to the base code, intrinsically supports the sequential semantics of OpenMP, and provides improved integration with object-oriented mechanisms. The aspect-oriented nature of the library enables the encapsulation of parallelism-related code into well-defined modules. The approach makes the parallelisation and maintenance of large-scale Java applications more manageable. Furthermore, the library can be used with plain Java annotations and can be easily extended with application-specific mechanisms in order to tune application performance. The library performs competitively with traditional parallel programming in Java and enhances programmability, since it allows parallelism-related code to be developed independently.
ISBN: 9781665429825 (print)
Cloud federation emerged to extend the resources available between different interconnected cloud providers for transparent and unlimited availability to the end-user. Cloud orchestration platforms have become a way to centralize demands for high computational power in applications such as Bioinformatics workflows. The large quantity of resources available among several providers in a federation makes it challenging to choose a suitable one for a particular workflow. This work proposes a Machine Learning Resource Prediction Service called sPCRAM. sPCRAM uses a machine learning model combined with a GRASP metaheuristic to transparently and adequately dimension the resources, determining the monetary cost and the runtime before the workflow execution. sPCRAM interactively allows the user to set the execution type and to calibrate time and cost. Such executions can have, for example, a long duration and a low cost, or a shorter duration and a higher cost. The results demonstrate that sPCRAM can appropriately estimate runtime and cost for cloud federation resources, on average 97.70% faster than the brute-force technique for resource selection.
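The general shape of a GRASP metaheuristic (greedy randomized construction followed by local search, repeated over several iterations) can be sketched as follows. This is not sPCRAM's implementation: the instance catalogue, the stage workloads, the weighted time/cost objective with its `alpha` knob, and every function name are assumptions made for illustration, and the simple `work / speed` runtime estimate stands in for the machine learning model the abstract mentions.

```python
import itertools
import random

# Hypothetical instance types: (name, price per hour, relative speed).
INSTANCES = [("small", 0.10, 1.0), ("medium", 0.20, 1.8), ("large", 0.40, 3.0)]
# Hypothetical workflow stages: hours of work on a speed-1.0 machine.
STAGES = [2.0, 5.0, 1.0, 8.0]


def stage_score(work, inst, alpha):
    """Weighted runtime/cost of one stage; alpha=1 favours speed, 0 favours cost."""
    _, price, speed = inst
    runtime = work / speed
    return alpha * runtime + (1.0 - alpha) * runtime * price


def total_score(plan, alpha):
    return sum(stage_score(w, INSTANCES[i], alpha) for w, i in zip(STAGES, plan))


def construct(alpha, rcl_width, rng):
    """Greedy randomized construction: pick at random from the near-best options."""
    plan = []
    for work in STAGES:
        scores = [stage_score(work, inst, alpha) for inst in INSTANCES]
        limit = min(scores) + rcl_width * (max(scores) - min(scores))
        rcl = [i for i, s in enumerate(scores) if s <= limit]  # restricted candidate list
        plan.append(rng.choice(rcl))
    return plan


def local_search(plan, alpha):
    """Swap one stage's instance at a time while that strictly improves the plan."""
    improved = True
    while improved:
        improved = False
        for pos in range(len(plan)):
            for cand in range(len(INSTANCES)):
                trial = plan[:pos] + [cand] + plan[pos + 1:]
                if total_score(trial, alpha) < total_score(plan, alpha):
                    plan, improved = trial, True
    return plan


def grasp(alpha, iterations=20, rcl_width=0.5, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(iterations):
        plan = local_search(construct(alpha, rcl_width, rng), alpha)
        if best is None or total_score(plan, alpha) < total_score(best, alpha):
            best = plan
    return best


def brute_force(alpha):
    """Exhaustive baseline: enumerate every assignment of instances to stages."""
    plans = itertools.product(range(len(INSTANCES)), repeat=len(STAGES))
    return list(min(plans, key=lambda p: total_score(p, alpha)))
```

In this toy objective the stages are independent, so the local search always reaches the brute-force optimum; the point of the sketch is only the control flow. Brute force enumerates every combination of instances and stages, while each GRASP iteration touches a number of candidates that grows only with the product of the two counts.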
In this paper, we propose a text baseline detection method. The proposed method is based on a strategy of object separation in a binary image and consists of three steps. The first step produces a binary image using Sobel edge detection and mathematical morphology operations to obtain an approximate text area from an ordinary document image. In the second step, line segments that are candidates for text baselines are extracted by a parallel level-set method. The last step fits a line to each segment with parallel random sample consensus and selects the appropriate lines automatically. For parallel computation, OpenMP, a standard API for shared-memory parallel programming in C/C++, is used.
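The line-fitting step can be illustrated with a minimal random sample consensus (RANSAC) sketch. This is a generic, sequential version written in Python rather than the authors' OpenMP code in C/C++; the function names, the tolerance, the iteration count, and the synthetic points are all assumptions. Each iteration is independent of the others, which is the property a parallel implementation would exploit.

```python
import math
import random


def fit_line_ransac(points, iterations=200, tolerance=1.0, seed=0):
    """Return ((a, b, c), inliers) for the line a*x + b*y + c = 0 with most inliers."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        # Hypothesise a line through two randomly sampled points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2          # normal vector of the line
        norm = math.hypot(a, b)
        if norm == 0.0:
            continue                      # identical points define no line
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # With a unit normal, |a*x + b*y + c| is the point-to-line distance.
        inliers = [p for p in points if abs(a * p[0] + b * p[1] + c) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers


def demo_points():
    """Synthetic data: 20 points on y = 0.5*x + 3 plus 5 distant outliers."""
    on_line = [(float(x), 0.5 * x + 3.0) for x in range(20)]
    outliers = [(2.0, 40.0), (7.0, -30.0), (12.0, 55.0), (15.0, -20.0), (18.0, 70.0)]
    return on_line + outliers
```

Calling `fit_line_ransac(demo_points())` returns the line coefficients and the supporting points; the slope can be recovered as `-a / b`. In a baseline detector, each candidate segment from the previous step would supply its own point set.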
ISBN: 9781728151250 (print)
High-performance embedded computing is developing rapidly, since applications in most domains require a large and increasing amount of computing power. On the hardware side, this requirement is met by the introduction of heterogeneous systems, with highly parallel accelerators that are designed to take care of the computation-heavy parts of an application. There is today a plethora of accelerator architectures, including GPUs, many-cores, FPGAs, and domain-specific architectures such as AI accelerators. They all have their own programming models, which are typically complex, low-level, and involve explicit parallelism. This yields error-prone software that puts functional safety at risk, which is unacceptable for safety-critical embedded applications. In this position paper we argue that high-level executable modelling languages tailored for parallel computing can help in the software design for high-performance embedded applications. In particular, we consider the data-parallel model to be a suitable candidate, since it allows very abstract parallel algorithm specifications free from race conditions. Moreover, we promote the Action Language for fUML (and thereby fUML) as a suitable host language.