ISBN: 9783319495835; 9783319495828 (print)
Data compression is undoubtedly important in computer engineering. However, most lossless data compression and decompression algorithms are very hard to parallelize because they use dictionaries that are updated sequentially. The main contribution of this paper is a new lossless data compression method that we call Light Loss-Less (LLL) compression. It is designed so that decompression can be highly parallelized and run very efficiently on the GPU. This suits the many applications in which compressed data is read and decompressed many times, so that decompression is performed more frequently than compression. We show optimal sequential and parallel algorithms for LLL decompression and implement them to run on a Core i7-4790 CPU and a GeForce GTX 1080 GPU, respectively. To show the potential of the LLL compression method, we have evaluated the running time using five images and compared it with the well-known compression methods LZW and LZSS. Our GPU implementation of LLL decompression runs 91.1-176 times faster than the CPU implementation. Also, the GPU running times in our experiments show that LLL decompression is 2.49-9.13 times faster than LZW decompression and 4.30-14.1 times faster than LZSS decompression, although their compression ratios are comparable.
ISBN: 9783319393155; 9783319393148 (print)
Data streams are unbounded flows of data that arrive at high rates and cannot be stored for offline processing. Because of this, classical data mining approaches cannot be applied directly in a data stream scenario. This paper introduces a single-pass hardware-based algorithm for frequent itemset mining on data streams that uses the top-k frequent 1-itemsets. Experimental results of the hardware implementation of the proposed algorithm are also presented and discussed.
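To make the idea of tracking the top-k frequent 1-itemsets in a single pass more concrete, here is a minimal software sketch. It is not the hardware algorithm from the abstract, whose details are not given here; it swaps in the well-known Space-Saving counting scheme as a stand-in. The function name `space_saving` and the toy stream are hypothetical.

```python
def space_saving(stream, k):
    """Approximate top-k item counts over a stream of transactions.

    Stand-in technique (Space-Saving), NOT the paper's hardware design.
    Uses at most k counters, so memory stays bounded however long the
    stream runs. Reported counts may overestimate the true counts.
    """
    counters = {}
    for transaction in stream:
        # Count each item at most once per transaction; sorting keeps the
        # result deterministic across runs.
        for item in sorted(set(transaction)):
            if item in counters:
                counters[item] += 1
            elif len(counters) < k:
                counters[item] = 1
            else:
                # Evict the item with the smallest counter and let the
                # newcomer inherit that count plus one.
                victim = min(counters, key=counters.get)
                counters[item] = counters.pop(victim) + 1
    # Highest count first; ties broken by item name.
    return sorted(counters.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    toy_stream = [["a", "b"], ["a", "c"], ["a", "b"], ["d"], ["a", "b"]]
    print(space_saving(toy_stream, k=3))
```

Each transaction is seen exactly once and then discarded, which is the property that makes such schemes usable when the stream cannot be stored.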
The rapidly growing field of parallel computing systems promotes the study of parallel algorithms, with the Monte Carlo method and asynchronous iterations being among the most valuable types. These algorithms have a number of advantages: there is no need for a global time in a parallel system (no need for synchronization), and all computational resources are efficiently loaded (minimum processor idle time). The method of partial synchronization of iterations for systems of equations was proposed by the authors earlier. In this article, this method is generalized to include the case of nonlinear equations of the form x = F(x), where x is an unknown column vector of length n, and F is an operator from R^n into R^n. We consider operators that do not satisfy the conditions sufficient for the convergence of asynchronous iterations, although simple iterations still converge. In this case, one can specify an incidence of the operator and properties of the parallel system such that asynchronous iterations fail to converge. Partial synchronization is one of the effective ways to solve this problem. An algorithm is proposed that guarantees the convergence of asynchronous iterations and the Monte Carlo method for the above class of operators. The rate of convergence of the algorithm is estimated. The results can prove useful for solving high-dimensional problems on multiprocessor computational systems.
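The difference between simple and asynchronous iterations for x = F(x) can be illustrated with a small sketch. It does not implement the partial synchronization algorithm from the abstract. The operator `F_example` is a hypothetical contraction in the max norm, chosen so that both schemes converge; the abstract concerns harder operators for which the asynchronous scheme can fail. Asynchrony is only simulated on one thread by updating one randomly chosen component at a time.

```python
import math
import random


def simple_iteration(F, x0, tol=1e-10, max_iter=10_000):
    """Synchronous scheme: every component is updated from the same old x."""
    x = list(x0)
    for _ in range(max_iter):
        x_new = F(x)
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("simple iteration did not converge")


def simulated_async_iteration(F, x0, steps=5_000, seed=0):
    """Simulated asynchronous scheme: one random component per step.

    Components are refreshed in no fixed order, as if independent
    processors wrote their results back whenever they finished.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        i = rng.randrange(len(x))
        x[i] = F(x)[i]  # only component i is written back
    return x


def F_example(x):
    # Hypothetical operator on R^2; each row of its Jacobian has absolute
    # sum at most 0.5, so it is a contraction in the max norm.
    return [0.5 * math.cos(x[1]), 0.5 * math.sin(x[0]) + 0.25]


if __name__ == "__main__":
    print(simple_iteration(F_example, [0.0, 0.0]))
    print(simulated_async_iteration(F_example, [0.0, 0.0]))
```

For this well-behaved operator both schemes reach the same fixed point. The case studied in the abstract is the one where only the synchronous scheme is guaranteed to converge.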
ISBN: 9781509038725 (print)
Ultrafast electron beam X-ray computed tomography (CT) scanners are available at the Helmholtz-Zentrum Dresden-Rossendorf (HZDR) for investigating rapidly moving structures in opaque technical devices. Currently, measurement data must first be downloaded from the scanner to a data processing machine after each CT scan. Only afterwards are cross-sectional images reconstructed. This limits the application fields of the scanners. For online observation and even automated process control of scanned objects, a new modular data processing tool is presented. It consists of user-definable pipeline stages that work independently of one another within a so-called data processing pipeline that can keep up with the CT scanner's frame rate of up to 8 kHz. The data processing stages are arbitrarily programmable and combinable and are connected by a fast custom memory pool to optimize data transfers. As a result, this processing structure is not limited to CT applications. To achieve the highest processing performance for the electron beam X-ray CT scanners, all relevant data processing steps are implemented individually in separate stages using graphics processing units (GPUs) and NVIDIA's CUDA programming platform. Data processing performance tests on two different high-end GPUs (Tesla K20c, GeForce GTX 1080) show a slice image reconstruction performance that is well suited for online application.
The problems of mathematical modeling of two-phase flows in porous media, and in particular, the simulation of oil recovery processes, are considered. An economical numerical algorithm based on the kinetic approach wi...
This paper presents the numerical simulation of the circumfluence of a recovery apparatus with brake engines and a detachable heat protection shield carried out by the use of the conservative numerical method of clust...
ISBN: 9781509040698 (print)
The aim of this work is to develop a technology that provides a comprehensive approach to studying geophysical objects with complex subsurface geometry, based on numerical modeling of the seismic field from point sources. An important stage in successfully solving a dynamic problem of the theory of elasticity is to develop a detailed model of the object under study and to carry out a series of calculations of elastic wave propagation in inhomogeneous media. We present a software package for solving forward geophysical problems using grid methods. Particular attention is paid to the software interface, which supports the preparation of geophysical models for theoretical experiments. The simulation software under development is designed for use on modern high-performance computing systems. This information and analytical set of programs can be used in the interpretation of experimental data and in the design and verification of 2D and 2.5D models when comparing experimental and theoretical results. Studying the structure of the Baikal rift zone is one of the geophysical tasks where 2D modeling is necessary.
ISBN: 9781509040865 (print)
A software package for distributed neural network training is introduced here. The software, named NeuralGenesis, implements a client-server model for parallel genetic algorithms with custom features such as an enhanced stopping rule, an advanced mutation scheme, and periodic application of a local search procedure. The software is written with Qt5 for portability and is freely available for the majority of operating systems.
Stable and efficient techniques for the transfer of discrete fields between nonmatching volume or surface meshes are an essential ingredient for the discretization and numerical solution of coupled multiphysics or multiscale problems. Here, we present and investigate a new and completely parallel approach. It allows for the transfer of discrete fields between unstructured volume and surface meshes, which can be arbitrarily distributed among different processors. No a priori information on the relation between the different meshes is required. Our inherently parallel approach is general in the sense that it can deal with both classical interpolation and variational transfer operators, e.g., the L^2-projection and the pseudo-L^2-projection. It includes a parallel search strategy, output-dependent load balancing, and the computation of element intersections, as well as the parallel assembly of the algebraic representation of the respective transfer operator. We describe our algorithmic framework and its implementation in the library MOONOLITH. Furthermore, we investigate the efficiency and parallel scalability of our new approach using different examples in three dimensions. This includes the computation of a volume transfer operator between 2 meshes with 2 billion elements in total and the computation of a surface transfer operator between 14 different meshes with 5.9 billion elements in total. The experiments were performed with up to 12,288 cores.
Recently, bio-inspired metaheuristic algorithms have been widely used as powerful optimization tools to estimate crucial parameters of photovoltaic (PV) models. However, the computational cost in terms of time increases as the data size or the complexity of the applied PV electrical model increases. To overcome this limitation, this paper presents a parallel particle swarm optimization (PPSO) algorithm implemented in Open Computing Language (OpenCL) to solve the parameter estimation problem for a wide range of PV models. Experimental and simulation results demonstrate that the PPSO algorithm not only obtains all the parameters with extremely high accuracy but also dramatically improves the computational speed. This work shows that the improvement is made possible by the inherent capabilities of the parallel processing framework. Copyright (C) 2015 John Wiley & Sons, Ltd.
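For readers unfamiliar with particle swarm optimization, the following is a minimal serial sketch of the basic global-best variant. It is not the OpenCL PPSO implementation from the abstract. The function name `pso`, the coefficient values, and the quadratic test objective are illustrative assumptions; in a PV setting the cost would instead measure the mismatch between measured and modeled current-voltage data.

```python
import random


def pso(cost, bounds, n_particles=30, n_iter=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO (serial sketch, assumed coefficients).

    cost   : function mapping a parameter list to a scalar to minimize
    bounds : list of (low, high) pairs, one per parameter
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iter):
        # The per-particle work in this loop is the part a parallel
        # implementation would typically distribute across work items.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def toy_cost(p):
    # Hypothetical stand-in objective with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


if __name__ == "__main__":
    print(pso(toy_cost, [(-5.0, 5.0), (-5.0, 5.0)]))
```

Because every particle evaluates the cost function independently within an iteration, the cost evaluations dominate the run time for expensive models and are a natural target for parallel execution.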