The paper presents a framework for adding data and task parallelism to a sequential image processing library. The library contains three modules: one for low-level operators, one for intermediate-level operators, and one for high-level operators. We parallelize the low-level operators by data decomposition, and we are working on adding task parallelism at the image processing application level. We validate our data-parallel approach by testing it with the geometric mean filter and the multibaseline stereo vision algorithm. Experiments on a cluster of workstations show very good speedup.
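As an illustration of data decomposition for a low-level operator, the sketch below applies a geometric mean filter to horizontal strips of an image and stitches the results back together. It is not the framework from the paper: the function names, the thread pool standing in for a cluster of workstations, the edge-replicating border, and the small `EPS` offset are all assumptions made for this example, and an odd window size `k` is assumed.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

EPS = 1e-12  # assumed offset so that log(0) never occurs


def _valid_geomean(padded_logs, k):
    """Geometric mean over every full k x k window of a log-domain array."""
    h = padded_logs.shape[0] - (k - 1)
    w = padded_logs.shape[1] - (k - 1)
    acc = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            acc += padded_logs[dy:dy + h, dx:dx + w]
    # The geometric mean is exp(mean(log(x))).
    return np.exp(acc / (k * k))


def geometric_mean_filter(img, k=3):
    """Sequential reference version (borders handled by edge replication)."""
    pad = k // 2
    logs = np.log(np.pad(img.astype(np.float64), pad, mode="edge") + EPS)
    return _valid_geomean(logs, k)


def geometric_mean_filter_strips(img, k=3, n_workers=4):
    """Data-parallel version: one horizontal strip per worker."""
    pad = k // 2
    logs = np.log(np.pad(img.astype(np.float64), pad, mode="edge") + EPS)
    bounds = np.linspace(0, img.shape[0], n_workers + 1).astype(int)
    # Each strip carries k - 1 extra rows (its halo) so that windows
    # crossing a strip boundary see the same pixels as in the full image.
    strips = [logs[lo:hi + 2 * pad]
              for lo, hi in zip(bounds[:-1], bounds[1:]) if hi > lo]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda s: _valid_geomean(s, k), strips))
    return np.vstack(parts)
```

The halo rows are the only data a strip needs from its neighbours, which is why neighbourhood operators of this kind lend themselves to data decomposition.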
ISBN: (print) 0769505716
This paper studies the application of preconditioned conjugate gradient methods to high-resolution color image reconstruction problems. The high-resolution color images are reconstructed from multiple undersampled, shifted, degraded color frames with subpixel displacements. The resulting degradation matrices are spatially variant. The preconditioners are derived by taking the cosine transform approximation of the degradation matrices. The resulting preconditioning matrices allow the use of fast transform methods. We show how the methods can be implemented on parallel computers, and we demonstrate their parallel efficiency using experiments on a sixteen-processor IBM SP-2.
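The idea of a cosine transform preconditioner inside conjugate gradients can be sketched in one dimension as follows. This is a minimal illustration, not the method of the paper: it assumes a small dense symmetric positive definite matrix `A`, builds the orthonormal DCT-II matrix explicitly instead of calling a fast transform, and takes the preconditioner to be the matrix diagonalized by the DCT whose eigenvalues are the diagonal of `C A C^T`. All function names are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ C.T is the identity."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C


def cosine_preconditioner(A):
    """Return r -> M^{-1} r with M = C.T @ diag(diag(C A C.T)) @ C."""
    C = dct_matrix(A.shape[0])
    d = np.diag(C @ A @ C.T)  # positive when A is positive definite
    # Dense products are used here for clarity; a fast transform would
    # replace them in a real implementation.
    return lambda r: C.T @ ((C @ r) / d)


def pcg(A, b, apply_Minv, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive definite A."""
    x = np.zeros(len(b), dtype=np.float64)
    r = b - A @ x
    z = apply_Minv(r)
    p = z.copy()
    rz = r @ z
    for it in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, it
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = apply_Minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter
```

Applying the preconditioner costs two transforms and a pointwise division per iteration, which is what makes transform-based preconditioners attractive on parallel machines.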
ISBN: (print) 0819437638
The rapidly increasing popularity of the discrete wavelet transform (DWT) as an effective tool in many signal processing and data compression applications, and its integration into JPEG 2000, have given rise to various DWT algorithms and VLSI implementations that reduce complexity and enhance performance. In this paper, we present an efficient hardware implementation of the discrete wavelet transform and its deployment on a reconfigurable FPGA-based platform. Our implementation is a novel architecture based on the lifting factorization of the wavelet filter banks. This factorization leads to a block-based parallel DWT architecture suitable for hardware implementation. To overcome the communication overhead associated with the DWT block transform, we utilize the new Overlap-State(1,2) technique to compute the DWT near block boundaries. A VHDL description of the lifting polyphase factorization architecture was developed and ported to an FPGA hardware platform that was chosen to allow partial and full reconfigurability, so as to accommodate various applications with different filter banks. Our hardware implementation improves performance by more than a factor of two compared to an efficient pipelined FPGA-based implementation.
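To make the lifting factorization concrete, the following sketch implements one level of a 1D lifting transform as a predict step followed by an update step. The choice of the reversible integer 5/3 filter, the mirrored boundary handling, and the function names are assumptions for illustration; the abstract does not state which filter banks were used, and the Overlap-State boundary technique is not modelled here.

```python
def lift_53_forward(x):
    """One level of an integer 5/3 lifting transform on an even-length list."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expects an even-length signal with at least two samples")
    s, d = list(x[0::2]), list(x[1::2])  # split into even and odd samples
    n = len(d)
    # Predict: each odd sample minus the average of its even neighbours.
    for i in range(n):
        right = s[i + 1] if i + 1 < n else s[i]  # mirror at the right edge
        d[i] -= (s[i] + right) // 2
    # Update: each even sample adjusted by its neighbouring details.
    for i in range(n):
        left = d[i - 1] if i > 0 else d[0]  # mirror at the left edge
        s[i] += (left + d[i] + 2) // 4
    return s, d  # approximation and detail coefficients


def lift_53_inverse(s, d):
    """Undo lift_53_forward by running the lifting steps in reverse order."""
    s, d = list(s), list(d)
    n = len(d)
    for i in range(n):
        left = d[i - 1] if i > 0 else d[0]
        s[i] -= (left + d[i] + 2) // 4
    for i in range(n):
        right = s[i + 1] if i + 1 < n else s[i]
        d[i] += (s[i] + right) // 2
    x = [0] * (2 * n)
    x[0::2], x[1::2] = s, d
    return x
```

Each lifting step reads only immediate neighbours, so a block needs only a small amount of state from adjacent blocks, which is the property a block-based architecture relies on.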
ISBN: (print) 0819437638
A difficult problem in automatic medical image understanding is that for every image type, such as x-ray, and every body organ, such as the heart, there exist specific solutions that do not allow for generalization. Simply collecting all the specific solutions will not achieve the vision of a computerized physician. To address this problem, we propose an intelligent agent approach that is based on agent-oriented programming and the concept of active fusion. The advantage of agent-oriented programming is that it combines the benefits of object-oriented programming and expert systems. For radiology image understanding, we present a multi-agent system that is composed of two major types of intelligent agents: radiologist agents and patient agents. A patient agent asks for multiple opinions from radiologist agents in interpreting a given set of images and then integrates the opinions. A radiologist agent decomposes the image recognition task into smaller problems that are solved collectively by multiple intelligent sub-agents. Finally, we present a preliminary implementation and running examples of the multi-agent system.
Advances in technology have enabled us to collect data from observations, experiments, and simulations at an ever-increasing pace. As these data sets approach the terabyte and petabyte range, scientists are increasingly using semi-automated techniques from data mining and pattern recognition to find useful information in the data. For data mining to be successful, the raw data must first be processed into a form suitable for the detection of patterns. When the data is in the form of images, this can involve a substantial amount of processing on very large data sets. To help make this task more efficient, we are designing and implementing an object-oriented image processing toolkit that specifically targets massively parallel, distributed-memory architectures. We first show that it is possible to use object-oriented technology to effectively address the diverse needs of image applications. Next, we describe how we abstract out the similarities in image processing algorithms to enable re-use in our software. We also discuss the difficulties encountered in parallelizing image algorithms on massively parallel machines, as well as the bottlenecks to high performance. We demonstrate our work using images from an astronomical data set, and illustrate how techniques such as filters and denoising through the thresholding of wavelet coefficients can be applied when a large image is distributed across several processors.
ISBN: (print) 0819437638
This paper proposes Virtual Video Tape (VVT), a randomly accessible motion image recorder held in main memory. VVT is realized in software only, with no dedicated hardware. It is intended as a tool for research on real-time motion image understanding. Recent remarkable progress in PC hardware makes it possible to use main memory on the order of gigabytes. With memory of this size, a motion image recorder can be realized in software, and we propose VVT as an example of this kind of recorder. With current components, the recording time can be expected to be on the order of minutes. Since the proposed VVT is fully digital, there is no analog medium and no possibility of degradation in image quality. Because the playback image does not deteriorate and no rewinding is needed, VVT should contribute to program development for motion image understanding. Based on the proposed idea, the authors have implemented a prototype VVT and used it to develop visual tracking, real-time face detection, and so forth. Through the implementation and experience of use, we have confirmed the feasibility and effectiveness of the proposed idea. In this paper, the authors discuss the background, required functions, and structure of the recorder. Some implementation issues are also described.
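One simple way to picture a randomly accessible in-memory recorder is a fixed-capacity ring buffer indexed by frame number, sketched below. The abstract does not describe VVT's internal structure, so the ring buffer design, the `FrameRing` class, and its method names are assumptions made purely for illustration.

```python
class FrameRing:
    """Hypothetical in-memory recorder that keeps the most recent frames."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.count = 0  # total number of frames recorded so far

    def record(self, frame):
        # Overwrite the oldest slot once the buffer is full.
        self.slots[self.count % self.capacity] = frame
        self.count += 1

    def oldest(self):
        """Frame number of the oldest frame still held in memory."""
        return max(0, self.count - self.capacity)

    def get(self, index):
        """Random access by frame number, with no rewinding involved."""
        if not (self.oldest() <= index < self.count):
            raise IndexError("frame is not held in memory")
        return self.slots[index % self.capacity]
```

A tracking or detection program under development could call `get` repeatedly on the same frame numbers and always receive identical data, which is the benefit the abstract attributes to a fully digital recorder.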
There are many kinds of so-called irregular expressions in natural dialogues. Even if the words of a conversation are the same, different meanings can be conveyed by a person's feelings or facial expression. To understand dialogues well, a flexible dialogue processing system must infer the speaker's view properly. However, it is difficult to obtain the meaning of the speaker's sentences in various scenes using traditional methods. In this paper, a new approach for dialogue processing that incorporates information from the speaker's face is presented. We first divide conversation statements into several simple tasks. Second, we process each simple task using an independent processor. Third, we use information from the speaker's face to estimate the view of the speakers and resolve ambiguities in dialogues. The approach presented in this paper can work efficiently because independent processors run in parallel, writing partial results to a shared memory, incorporating partial results at appropriate points, and complementing each other. A parallel algorithm and a method for employing the face information in dialogue machine translation are discussed, and some results are included.
A parallel implementation of the 2D discrete wavelet transform on a distributed memory multiprocessor system called PARNEU is presented. The mapping has been chosen with consideration to load balancing and communication methods in order to achieve the best possible scalability and performance in transforming one single image. Detailed performance figures are included. Experimental results show that significant parallel speedup is reached with this mapping.
We are interested in running cellular automata in parallel. We present an algorithm that explores the dynamic remapping of cells in order to balance the load between the processing nodes. The parallel application runs on a cluster of PCs connected by Fast Ethernet. A general cellular automaton can be described as a set of cells where each cell is a state machine. To compute the next cell state, each cell needs some information from neighbouring cells. There are no limitations on the kind of information exchanged nor on the computation itself. Only the automaton topology defining the neighbours of each cell remains unchanged during the automaton's life. As a typical example of a cellular automaton we consider the image skeletonization problem. Skeletonization requires spatial filtering to be applied repeatedly to the image. Each step erodes a thin part of the original image. After the last step, only the image skeleton remains. Skeletonization algorithms require vast amounts of computing power, especially when applied to large images. Therefore, skeletonization applications can potentially benefit from the use of parallel processing. Two different parallel algorithms are proposed: one with a static load distribution that splits the cells over several processing nodes, and the other with a dynamic load balancing scheme capable of remapping cells during program execution. Performance measurements show that cell migration does not reduce the speedup if the program is already load balanced, and that it greatly improves performance if the parallel application is not well balanced.
This paper deals with the hardware implementation, using FPGAs, of a systolic array architecture for processing compressed binary images without decompressing them. Specifically, run-length encoding (RLE) is used for compression. Processing images in compressed form provides a significant speedup in the computation. Using a systolic architecture and implementing it in hardware further increases the speed.
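The benefit of working directly on run-length encoded data can be seen in a small software sketch: the logical AND of two binary image rows is computed by merging their run lists, so the cost depends on the number of runs, not the number of pixels. The run representation as `(start, length)` pairs and the function names are assumptions for this example; the systolic array design of the paper is not modelled.

```python
def rle_encode(row):
    """Encode a binary row as a list of (start, length) runs of 1-pixels."""
    runs, start = [], None
    for i, v in enumerate(row):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(row) - start))
    return runs


def rle_and(a, b):
    """AND of two run lists, computed without expanding them to pixels."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        a0, a1 = a[i][0], a[i][0] + a[i][1]
        b0, b1 = b[j][0], b[j][0] + b[j][1]
        lo, hi = max(a0, b0), min(a1, b1)
        if lo < hi:  # the two runs overlap on [lo, hi)
            out.append((lo, hi - lo))
        # Advance whichever run ends first.
        if a1 <= b1:
            i += 1
        else:
            j += 1
    return out
```

For rows produced by `rle_encode`, the result matches encoding the pixelwise AND of the uncompressed rows.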