The authors' first experiences with distributed processing of the finite-element method in local area networks (LANs) are described. A procedure for solving finite-element problems has been implemented on five DEC computers within such a network. The principal idea is to subdivide the whole problem into substructures and to eliminate their interior nodes in parallel, autonomous processes using frontal techniques. The aim is to keep the resulting system of equations as small as possible, so that it can be solved on one processor by the ICCG (incomplete Cholesky conjugate gradient) method in an acceptable time. As another application of distributed processing, higher evolution strategies for optimization problems in a finite-element context have been parallelized. It is concluded that substructuring as a means of performing parallel processing on a LAN is only of interest for large 2-D problems.
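The substructuring idea in this abstract can be sketched as a Schur-complement (static condensation) step. This is only a minimal illustration under stated assumptions: dense NumPy solves stand in for both the frontal elimination and the ICCG solver mentioned in the abstract, and the function name and the interior/boundary index lists are hypothetical.

```python
import numpy as np

def solve_by_substructuring(K, f, interior, boundary):
    """Eliminate interior unknowns, solve the reduced boundary system, back-substitute.

    Assumption: dense solves replace the frontal elimination and ICCG of the abstract.
    """
    Kii = K[np.ix_(interior, interior)]
    Kib = K[np.ix_(interior, boundary)]
    Kbi = K[np.ix_(boundary, interior)]
    Kbb = K[np.ix_(boundary, boundary)]

    # Interior elimination: this part could run independently per substructure.
    Kii_inv_Kib = np.linalg.solve(Kii, Kib)
    Kii_inv_fi = np.linalg.solve(Kii, f[interior])

    # Reduced (condensed) system that involves only the boundary unknowns.
    S = Kbb - Kbi @ Kii_inv_Kib
    g = f[boundary] - Kbi @ Kii_inv_fi

    u = np.zeros_like(f, dtype=float)
    u[boundary] = np.linalg.solve(S, g)
    # Back-substitution recovers the interior unknowns.
    u[interior] = Kii_inv_fi - Kii_inv_Kib @ u[boundary]
    return u

# Small 1-D stiffness-like matrix used purely as a toy example.
n = 6
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = np.arange(1.0, n + 1.0)
u = solve_by_substructuring(K, f, interior=[1, 2, 4], boundary=[0, 3, 5])
```

The reduced matrix `S` has only as many rows as there are boundary unknowns, which is the reason the abstract gives for keeping the final system small enough for a single processor.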
ISBN: 9781932415582 (print)
This paper reviews fractal image compression coding and measures for improving it. The development of fractal image compression coding is introduced briefly, and some basic mathematical theory behind it is given. The main problems with fractal compression coding are brought into focus. Some traditional and new methods for fractal image compression coding are discussed, their advantages and disadvantages are comprehensively analyzed, and combinations of these methods are considered. It is believed that fractal image compression coding will be one of the most effective methods in the field of digital image coding.
A divisible load can be arbitrarily divided into independent small load fractions, which are assigned to processors in a parallel or distributed computing system for simultaneous processing. The theory and techniques of divisible load distribution have a wide range of aerospace applications, including satellite signal and image processing, radar and infrared tracking, target identification and searching, and data reporting, aggregation, and processing in wireless sensor networks. We make new progress on divisible load distribution on tree and pyramid networks. We revisit the classic method for divisible load distribution on partitionable static interconnection networks (including complete tree and pyramid networks) and derive a closed-form expression of the parallel time and speedup. We propose two new methods that employ pipelined communication techniques to distribute divisible loads on tree and pyramid networks. We derive closed-form expressions of the parallel time and speedup for both methods and show that the asymptotic speedup of both methods is bβ + 1 for a complete b-ary tree network and 4β + 1 for a pyramid network, where β is the ratio of the time for computing a unit load to the time for communicating a unit load. The technique of pipelined communication leads to improved performance of divisible load distribution on tree and pyramid networks. Compared with the classic method, the asymptotic speedup of our new methods is 100% higher on a complete binary tree network and 33% higher on a pyramid network for large β.
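As a small worked example of the asymptotic speedups quoted in this abstract, the sketch below simply evaluates the two stated expressions, with beta (β) the ratio of unit compute time to unit communication time. The function names are hypothetical, and the classic method's formula is not reproduced because the abstract does not state it.

```python
def tree_asymptotic_speedup(b: int, beta: float) -> float:
    """Asymptotic speedup b*beta + 1 quoted for a complete b-ary tree network."""
    return b * beta + 1

def pyramid_asymptotic_speedup(beta: float) -> float:
    """Asymptotic speedup 4*beta + 1 quoted for a pyramid network."""
    return 4 * beta + 1

beta = 10.0  # assumed example value: computing a unit load takes 10x as long as sending it
binary_tree = tree_asymptotic_speedup(2, beta)   # 2 * 10 + 1 = 21
pyramid = pyramid_asymptotic_speedup(beta)       # 4 * 10 + 1 = 41
```

Both expressions grow linearly in beta, so the gain from adding processors is capped by how expensive computation is relative to communication.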
ISBN: 0819455202 (print)
Registration of two or more images of the same scene is an important procedure in InSAR image processing, which seeks to extract the differential phase information between two images exactly. The efficiency of processing large data volumes is also a key point in the operational InSAR data processing chain. In this paper, some conventional registration methods are analyzed in detail and a parallel algorithm for registration is investigated. Combining a parallel computing model with the intrinsic properties of InSAR data, the authors put forward a parallel image registration scheme for a distributed cluster of PCs. A preliminary experiment was carried out, and the result demonstrates the feasibility and effectiveness of the proposed scheme.
ISBN: 9781509060580 (print)
Least squares prediction is a technique used to predict pixel values during image coding by minimizing the squared prediction error over neighbouring pixels. It has shown considerable quality gains, especially for complex images with high variations in pixel intensity. The drawback of this technique is its high computational complexity: it consumes the most significant part of the processing time and resources available, which makes it difficult to implement in fast, lossy image coders. One challenge is therefore to reduce the computation time of this predictor, namely through the use of new parallel programming techniques, making it more attractive for state-of-the-art coder-decoders. New algorithmic propositions are also made that trade rate-distortion performance for reduced time. These propositions are sensible because this predictor is used not only in lossless image coding but also in lossy coding. Another aim of this article is to analyze the energy efficiency of different types of platforms for this signal processing algorithm. Comparisons are provided on parallel computing processors ranging from very powerful Graphics Processing Units (GPUs) to mobile General-Purpose GPUs.
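A minimal sketch of least squares prediction for a single pixel follows. It is an illustration under assumptions, not the coder described in the abstract: the three-neighbour context (west, north, north-west), the training window size, and the function name are all hypothetical choices. The per-pixel least squares fit is what makes the technique expensive, and since each fit only reads already-coded pixels it is a natural unit of parallel work.

```python
import numpy as np

def ls_predict(img, i, j, win=4):
    """Predict img[i, j] from its W, N and NW neighbours.

    The weights are fit by least squares on causal pixels (already coded)
    inside a window of half-width `win`. Requires i >= 1 and j >= 1.
    """
    contexts, targets = [], []
    for r in range(max(1, i - win), i + 1):
        for c in range(max(1, j - win), min(img.shape[1], j + win + 1)):
            if r == i and c >= j:
                break  # pixels at or after (i, j) in raster order are not causal
            contexts.append([img[r, c - 1], img[r - 1, c], img[r - 1, c - 1]])
            targets.append(img[r, c])
    if not contexts:
        raise ValueError("no causal training pixels available")
    A = np.asarray(contexts, dtype=float)
    y = np.asarray(targets, dtype=float)
    # Minimum-norm least squares solution of A @ w ~= y.
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    ctx = np.array([img[i, j - 1], img[i - 1, j], img[i - 1, j - 1]], dtype=float)
    return float(ctx @ w)

# Toy image: a plane 2*row + 3*col + 1, which a linear predictor can follow.
rows, cols = np.mgrid[0:8, 0:8]
plane = 2.0 * rows + 3.0 * cols + 1.0
prediction = ls_predict(plane, 5, 5)
```

On real images the fit is not exact, and the residual `img[i, j] - prediction` is what a coder would go on to encode.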
ISBN: 9783642231773 (print)
The paper describes an experimental framework for distributed image processing on a multicomputer, which enables fast development of high-performance technologies for processing remote sensing data. The basic principles of building the system, some architectural solutions, and sample implementations of concrete processing technologies are given.
We develop a novel approach for computing the circle Hough transform entirely on graphics hardware (GPU). A primary role is assigned to the vertex processors and the rasterizer, overshadowing the traditional prominence of the pixel processors and enhancing parallel processing. Resources such as the vertex cache and blending units are studied too, with our set of optimizations leading to extraordinary peak gain factors exceeding 358x over a typical CPU execution. Software optimizations, such as the use of precomputed tables or gradient information, and hardware improvements, such as hyperthreading and multicore processing, are explored on CPUs as well. Overall, the GPU exhibits better scalability and much greater parallel performance, making it a solid alternative for computing the classical circle Hough transform compared with optimized methods run on emerging multicore architectures. (c) 2008 Elsevier Inc. All rights reserved.
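For reference, the classical circle Hough transform for a single known radius can be sketched on the CPU as below. This is not the GPU method of the abstract; it is a plain NumPy baseline under assumptions (fixed radius, hypothetical function name and parameters). The sine and cosine offsets are computed once up front, in the spirit of the precomputed tables mentioned for CPU optimization.

```python
import numpy as np

def circle_hough(edge_points, shape, radius, n_angles=360):
    """Vote for circle centres of a given radius; returns the accumulator array."""
    acc = np.zeros(shape, dtype=np.int32)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    # Precomputed integer offsets from an edge point back to candidate centres.
    dy = np.rint(radius * np.sin(thetas)).astype(int)
    dx = np.rint(radius * np.cos(thetas)).astype(int)
    for y, x in edge_points:
        cy, cx = y - dy, x - dx
        inside = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        # np.add.at accumulates correctly even when indices repeat.
        np.add.at(acc, (cy[inside], cx[inside]), 1)
    return acc

# Synthetic edge points on a circle centred at (row 20, col 25) with radius 8.
t = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
points = {(20 + int(np.rint(8 * np.sin(a))), 25 + int(np.rint(8 * np.cos(a)))) for a in t}
acc = circle_hough(sorted(points), shape=(40, 50), radius=8)
centre = tuple(int(v) for v in np.unravel_index(acc.argmax(), acc.shape))
```

Every edge point casts its votes independently of the others, which is the property that makes the transform a good fit for massively parallel hardware.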
ISBN: 0780367154 (print)
Many image-processing algorithms are particularly suited to distributed computing because images are difficult and time-consuming to analyse. Furthermore, existing algorithms contain explicit parallelism, which can be efficiently exploited by processing arrays. A good example of an image processing operation is the geometric rotation of a rectangular bitmap. This paper shows how this can be implemented on a distributed system using Parallel Virtual Machine (PVM), by splitting images into a number of parts and sending each to a separate computing node. Each node performs a rotation on its partial image before returning it to the master node to be recombined into a single image. A variety of image sizes and numbers of distributed computing nodes were used to determine the efficiency of this technique, and whether it offers enough speed improvement to justify its complexity. Whilst the rotation of large images benefited enormously from this algorithm, small images rotated more slowly than they would have done on a single processor. This is of particular importance in the case of large digital images, which may consist of millions of pixels.
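The split, rotate, and recombine pattern described in this abstract can be sketched as follows. Several assumptions are made to keep the example small and exact: a thread pool stands in for the PVM computing nodes, the rotation is restricted to 90 degrees counterclockwise so that the parts recombine without resampling, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def rotate_part(part):
    """Work done by one 'node': rotate its strip 90 degrees counterclockwise."""
    return np.rot90(part)

def distributed_rotate(image, n_parts):
    """Master: split into horizontal strips, farm them out, recombine the results."""
    parts = np.array_split(image, n_parts, axis=0)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        rotated = list(pool.map(rotate_part, parts))  # map preserves strip order
    # A strip covering rows r0..r1 becomes columns r0..r1 of the rotated image.
    return np.concatenate(rotated, axis=1)

image = np.arange(35).reshape(5, 7)
result = distributed_rotate(image, n_parts=3)
```

For an image this small the cost of splitting and dispatching outweighs the work itself, which mirrors the abstract's observation that small images rotate more slowly in the distributed setting.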
ISBN: 9781728116440 (print)
The use of Deep Learning methods has been identified as a key opportunity for enabling the processing of extreme-scale scientific datasets. Feeding data into compute nodes equipped with several high-end GPUs at a sufficiently high rate is a known challenge. Facilitating the processing of these datasets thus requires the ability to store petabytes of data as well as to access the data with very high bandwidth. In this work, we look at two Deep Learning use cases for cytoarchitectonic brain mapping. These applications are very challenging for the underlying IO system. We present an in-depth analysis of their IO requirements and performance. Both applications are limited by IO performance, as the training processes often have to wait several seconds for new training data. Both applications read random patches from a collection of large HDF5 datasets or TIFF files, which results in many small non-consecutive accesses to the parallel file systems. By using a chunked data format or storing temporary copies of the required patches, the IO performance can be improved significantly. This leads to a decrease in total runtime of up to 80%.
ISBN: 1892512459 (print)
Digital images are being generated at a phenomenal rate. Currently, by far the most common method for searching digital image databases is based on index terms that are entered manually. Content-based image retrieval (CBIR) systems are required to use the information that is intrinsically stored in these image databases effectively and efficiently. However, CBIR on multiple image databases is a slow process that needs performance improvement. In this paper, we present a distributed approach to CBIR, an image retrieval scheme that retrieves images based on segmentation and signature in a distributed environment.