ISBN: 0819425885 (print)
Distributed computing provides a cost-effective solution for computation-intensive problems. With the emergence of network operating systems for personal computers (PCs), such as Windows NT, it is now feasible to develop distributed computing on a network of PCs. In addition, the computing power delivered by a PC keeps increasing while its cost keeps decreasing, which implies that the performance/cost ratio of a PC is high and that the computing power delivered by the network is enormous. In this paper, we describe a software system that enables users to develop distributed computing programs very quickly using the SPMD (Single Program Multiple Data) paradigm under the Windows NT operating system. The programming model for the system is simple, and a user can control the system through a graphical interface. The results show that our system provides a reasonable speedup in solving image processing problems.
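As a rough illustration of the SPMD paradigm mentioned in this abstract, the sketch below runs one program on several workers, each of which picks its own strip of image rows from its rank. It is not the system described in the paper: the function names (`row_range`, `spmd_program`, `run_spmd`), the thresholding task, and the use of Python threads in place of networked PCs are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def row_range(rank, size, n_rows):
    """Return the half-open row range [start, stop) owned by worker `rank`."""
    base, extra = divmod(n_rows, size)
    # The first `extra` workers take one additional row each.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def spmd_program(rank, size, image, threshold):
    """The single program: every worker runs this on its own strip of rows."""
    start, stop = row_range(rank, size, len(image))
    return [[1 if pixel >= threshold else 0 for pixel in row]
            for row in image[start:stop]]


def run_spmd(image, size, threshold=128):
    """Launch `size` copies of the program and gather the strips in rank order."""
    with ThreadPoolExecutor(max_workers=size) as pool:
        strips = list(pool.map(
            lambda rank: spmd_program(rank, size, image, threshold),
            range(size)))
    return [row for strip in strips for row in strip]


# Hypothetical 10 x 8 test image with pixel values in 0..255.
image = [[(7 * r + 13 * c) % 256 for c in range(8)] for r in range(10)]
binary = run_spmd(image, size=3)
```

The threads here only show the program structure (same code, different data per rank); they are not expected to reproduce the speedup reported for a network of PCs.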
ISBN: 0819425885 (print)
The use of multiprocessor systems is a well-suited solution to the problem of implementing real-time image processing applications. We use a multiresolution image decomposition algorithm to show how the A³ methodology (Algorithm Architecture Adequation), and the CAD software SynDEx that supports it, may improve the implementation of such algorithms on a multi-DSP architecture. The application algorithm and the hardware are both specified with graphs, so the implementation can be formalized in terms of graph transformations. This methodology significantly reduces the development cycle of image processing applications by simplifying the test and debug process.
ISBN: 0819425885 (print)
We present a new parallel volume rendering algorithm based on the split-light model for rendering and the Bulk Synchronous Parallel (BSP) model for parallelization. The BSP model provides a simple and architecture-independent approach to structuring the parallel program. This parallel program has been tested on a shared-memory SGI PowerChallenge machine, a distributed-memory IBM SP2 machine, and a network of UNIX workstations.
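The BSP structure named in this abstract (local computation, communication, then a barrier that ends the superstep) can be sketched on a toy problem. The example below sums a list rather than rendering a volume, and it uses Python threads and a shared list as a stand-in for message passing; `bsp_sum` and its internals are hypothetical names, not part of the paper's program.

```python
import threading


def bsp_sum(data, n_workers):
    """Sum `data` in two BSP supersteps; returns the total as seen by each worker."""
    barrier = threading.Barrier(n_workers)
    inbox = [None] * n_workers    # one message slot per worker
    result = [None] * n_workers

    def worker(rank):
        # Superstep 1, computation: sum this worker's interleaved slice.
        local = sum(data[rank::n_workers])
        # Superstep 1, communication: post the partial sum.
        inbox[rank] = local
        # Barrier synchronisation: all messages are delivered after this point.
        barrier.wait()
        # Superstep 2: every worker combines the partial sums it received.
        result[rank] = sum(inbox)

    threads = [threading.Thread(target=worker, args=(rank,))
               for rank in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return result


totals = bsp_sum(list(range(100)), n_workers=4)
```

Because workers only read `inbox` after the barrier, the program does not depend on the order in which the workers happen to run, which is the property that makes the BSP style architecture-independent.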
ISBN: 0819425885 (print)
This article presents a new generation of parallel processing architecture for real-time image processing. The approach is implemented in a real-time image processor chip, called the Xium(TM)-2, which combines a fully associative array, providing the parallel engine, with a serial RISC core on the same die. The architecture is fully programmable and can be programmed to implement a wide range of color image processing, computer vision, and media processing functions in real time. The associative part of the chip is based on the patent-pending methodology of Associative Computing Ltd. (ACL), which packs 2048 associative processors, each with 128 "intelligent" bits. Each bit can be a processing bit or a memory bit. At only 33 MHz and with a 0.6 micron manufacturing process, the chip has a computational power of 3 billion ALU operations per second and 66 billion string search operations per second. The fully programmable nature of the Xium(TM)-2 chip enables developers to use ACL tools to write their own proprietary algorithms and combine them with existing image processing and analysis functions from ACL's extended set of libraries.
Experiments in parallelizing an edge detection algorithm on three representative message-passing architectures (a low-cost, heterogeneous PVM network, an Intel iPSC/860 hypercube, and a CM-5 massively parallel multicomputer) provide insight into implementation and performance issues for image processing applications.
Parallelization of image analysis tasks is key to processing huge volumes of image data in real time. To this end, suitable subtasks for parallel processing have to be extracted and mapped to components of a distributed ...
ISBN: 0819425885 (print)
This work analyzes the distribution of computation in applications generated by a multilevel knowledge-based system for image processing called SVEX (1). The distribution has been carried out on a heterogeneous workstation network, in an attempt to take advantage of the availability and frequent underutilization of this computational resource. The parallelization is based on the message-passing tool Parallel Virtual Machine (PVM) (2). First, SVEX and its computational scheme are described, detailing the structure of the first level (the Pixel Processor). Then different distribution paradigms are studied, and data-based parallelism is selected for implementation. Given this choice, the research addresses two fundamental problems: the analysis of basic load-balancing schemes, and obtaining a model for predicting parallelization behavior as new machines are added to the computational network. The results produced in a series of experiments permit the comparison of load-balancing schemes and the validation of the proposed model. The experiments include the processing of both static images and sequences.
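One basic load-balancing scheme for data parallelism on a heterogeneous network is a static split in which each machine receives image rows in proportion to its relative speed. The sketch below shows that idea only; it is not one of the schemes evaluated in the paper, and `partition_rows` and the speed figures are hypothetical.

```python
def partition_rows(n_rows, speeds):
    """Split n_rows among machines in proportion to their relative speeds."""
    total = sum(speeds)
    # Ideal (fractional) share of each machine.
    ideal = [n_rows * speed / total for speed in speeds]
    # Start from the floor of each share.
    counts = [int(share) for share in ideal]
    # Hand out the rows lost to rounding, largest fractional remainder first.
    leftover = n_rows - sum(counts)
    by_remainder = sorted(range(len(speeds)),
                          key=lambda i: ideal[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical network: the third workstation is three times as fast as the first.
rows_per_machine = partition_rows(480, [1.0, 2.0, 3.0])
```

A static split like this assumes the relative speeds stay constant during the run; on a shared workstation network that assumption may not hold, which is one reason to compare it against dynamic schemes.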
ISBN: 0819425885 (print)
Communication overhead is a critical factor in the performance of many multiprocessor computing platforms. In this paper we present the communication performance of a large processing array built with TI 320C40 DSPs. Inter-processor communication is provided by message passing, a common method in multiprocessor systems. The system is developed for image processing, so transmission of large data blocks and various forms of communication are required frequently. The processor used in this system has six built-in communication links. They are 8-bit, bi-directional links with a speed of 20 Mbytes/sec. A processing array built with these processors employs the MIMD paradigm and a static interconnection. The communication performance of such a DSP network is investigated and performance results are presented. The communication functions include broadcasting, scattering, gathering, and point-to-point transmission of messages.
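The 20 Mbytes/sec link speed quoted above gives a simple lower bound on transfer time for large image blocks. The sketch below works that arithmetic through; the 512 x 512 image size, the reading of "Mbytes" as 10^6 bytes, and the omission of start-up latency and protocol overhead are all assumptions, so the numbers are not measured results.

```python
# 20 Mbytes/s per link, taken here as 20 * 10**6 bytes/s (assumption).
LINK_BYTES_PER_SEC = 20e6


def transfer_time(n_bytes, links=1, bandwidth=LINK_BYTES_PER_SEC):
    """Ideal seconds to move n_bytes split evenly over `links` links.

    Ignores start-up latency, so this is a lower bound only.
    """
    return n_bytes / (links * bandwidth)


# Hypothetical 512 x 512 image at 1 byte per pixel.
image_bytes = 512 * 512
one_link = transfer_time(image_bytes)            # about 13.1 ms
six_links = transfer_time(image_bytes, links=6)  # about 2.2 ms
```

Even under these ideal assumptions, sending a whole image over a single link costs milliseconds, which suggests why scattering data over several links matters for real-time image processing.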
ISBN: 0819425885 (print)
The two-dimensional (2D) Discrete Fourier Transform (DFT) frequently needs to be performed in digital image processing. Although the computing time of the 2D DFT can be dramatically reduced by using the 2D Fast Fourier Transform (FFT), the processing time for a very large array is still intolerable. The development of parallel processing systems promotes the application of the 2D FFT. In this paper, we present the implementation of the 2D FFT as a general procedure, by the row-column method and the vector-radix method, on DAWN 1000, a general-purpose massively parallel processing system developed in China. Even though the 2D FFT is parallel in nature, the requirement of corner turning and the existence of data communication make its implementation more complicated. We analyze the impact of machine capacity and computing complexity on algorithm efficiency and evaluate the implementation in terms of arithmetic operations as well as data transfer. The comparison of the two methods shows that each has its own advantages and disadvantages. Combining their traits, we design a new implementation algorithm that takes into account flexibility, efficiency, and the complexity of communication. As an example, we apply the new approach to spaceborne SAR image processing.
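The row-column method mentioned in this abstract can be sketched serially: transform every row, corner-turn (transpose), transform every former column, and turn back. In the sketch below a direct O(n^2) DFT stands in for the 1D FFT to keep the code short, and all function names are hypothetical; it says nothing about how the paper maps these steps onto DAWN 1000.

```python
import cmath


def dft_1d(x):
    """Direct O(n^2) DFT of a sequence (a stand-in for a 1D FFT)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def transpose(matrix):
    """Corner turn: swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


def dft_2d_row_column(image):
    """2D DFT as row transforms, a corner turn, then column transforms."""
    rows_done = [dft_1d(row) for row in image]
    cols_done = [dft_1d(column) for column in transpose(rows_done)]
    # Turn back so the result is indexed [vertical freq][horizontal freq].
    return transpose(cols_done)


def dft_2d_direct(image):
    """Reference: 2D DFT computed straight from the definition."""
    h, w = len(image), len(image[0])
    return [[sum(image[r][c] * cmath.exp(-2j * cmath.pi * (u * r / h + v * c / w))
                 for r in range(h) for c in range(w))
             for v in range(w)]
            for u in range(h)]


# Small hypothetical 3 x 4 image; both routes should agree up to rounding.
sample = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
spectrum = dft_2d_row_column(sample)
```

In a parallel setting each row transform is independent, so rows can be spread across processors; the corner turn is the step that forces data to move between them, which is the communication cost the abstract refers to.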
ISBN: 0819425885 (print)
Visual media processing is becoming increasingly important because of the wide variety of image- and video-based applications. Block rotation is an important operation in image and video processing tasks such as graphics, fractal processing, pattern matching, and image registration. Remote sensing, medical imaging, computer vision, computer graphics, and video coding are typical applications of digital image rotation. However, a hardware implementation of the block rotation algorithm has not been realized, and software implementations are slow and hence not suitable for real-time execution. In this paper, we propose a novel method for block rotation that is fast and suitable for hardware implementation. The algorithm employs area-based interpolation. Experimental results show a performance improvement over classical interpolation algorithms at a similar level of complexity.