ISBN (print): 0819433039
One major difficulty in designing an architecture for the parallel implementation of the Discrete Wavelet Transform (DWT) is that the DWT is not a block transform. As a result, frequent communication has to be set up between processors to exchange data so that correct boundary wavelet coefficients can be computed. This significant communication overhead limits the efficiency of parallel systems, especially for processor networks with large communication latencies. In this paper we propose a new technique, called Boundary Postprocessing, that allows the correct transform of boundary samples. The basic idea is to model the DWT as a Finite State Machine (FSM) based on the lifting factorization of the wavelet filterbanks. Application of this technique leads to a new parallel DWT architecture, Split-and-Merge, which requires data to be communicated only once between neighboring processors for an arbitrary number of levels of wavelet decomposition. Example designs and performance analysis for the 1D and 2D DWT show that the proposed technique can greatly reduce the interprocessor communication overhead. As an example, in a two-processor case our proposed approach shows an average speedup of about 30% as compared to the best currently available parallel algorithms.
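As an illustration of the single-exchange idea (not the paper's architecture), the sketch below performs one level of a lifting DWT on a signal split across two blocks: each block computes what it can from local samples, keeps a partial boundary result as state, and a single exchange of one sample and one partial value across the split completes the boundary coefficients. The CDF 5/3 filter, the floating-point formulation, and the NumPy code are assumptions made for this example.

    import numpy as np

    def lift53(x):
        """Reference single-level CDF 5/3 lifting DWT (floats, symmetric extension)."""
        x = np.asarray(x, float)
        xe = np.concatenate([x, x[-2:-1]])              # mirror the last-but-one sample
        d = x[1::2] - 0.5 * (xe[0:-1:2] + xe[2::2])     # predict step
        de = np.concatenate([d[:1], d])                 # mirror the first detail
        s = x[0::2] + 0.25 * (de[:-1] + de[1:])         # update step
        return s, d

    def lift53_two_blocks(x, M):
        """Same transform on blocks x[:M] and x[M:] with one boundary exchange."""
        xl, xr = np.asarray(x[:M], float), np.asarray(x[M:], float)

        # Local phase: each block uses only its own samples; the left block keeps
        # a partial boundary detail (the "state" of the boundary FSM).
        d_l = xl[1::2] - 0.5 * (xl[0::2] + np.append(xl[2::2], 0.0))
        state_l = d_l[-1]                               # x[M-1] - 0.5*x[M-2], incomplete
        xre = np.concatenate([xr, xr[-2:-1]])
        d_r = xr[1::2] - 0.5 * (xre[0:-1:2] + xre[2::2])

        # Single exchange: left sends state_l, right sends its first sample x[M];
        # afterwards both sides can finish the shared boundary detail locally.
        first_right = xr[0]
        d_boundary = state_l - 0.5 * first_right
        d_l[-1] = d_boundary

        # Update step, again purely local after the exchange.
        de_l = np.concatenate([d_l[:1], d_l])
        s_l = xl[0::2] + 0.25 * (de_l[:-1] + de_l[1:])
        de_r = np.concatenate([[d_boundary], d_r])
        s_r = xr[0::2] + 0.25 * (de_r[:-1] + de_r[1:])
        return np.concatenate([s_l, s_r]), np.concatenate([d_l, d_r])

    x = np.random.rand(32)
    assert all(np.allclose(a, b) for a, b in zip(lift53(x), lift53_two_blocks(x, 16)))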
We describe the design and implementation of our StratOSphere project, a framework which unifies distributed objects and mobile code applications. We begin by examining different mobile code paradigms that distribute processing of code and data resource components across a network. After analyzing these paradigms and presenting a lattice of functionality, we develop a layered architecture for StratOSphere, incorporating higher levels of mobility and interoperability at each successive layer. In our design, we provide an object model that permits objects to migrate to different sites, select among different method implementations, and provide new methods and behavior. We describe how we build new semantics in each software layer, and present sample objects developed for the Alexandria Digital Library Project at UC Santa Barbara, which has been building an information retrieval system for geographically referenced information and datasets. Using StratOSphere, we have designed a repository that stores the library's holdings; its map, image and geographical data are viewed as a collection of objects with extensible operations.
We present a tutorial description of the CAP Computer-Aided Parallelization tool. CAP has been designed with the goal of giving the parallel application programmer complete control over how his application is parallelized, while at the same time freeing him from the burden of explicitly managing a large number of threads and their associated synchronization and communication primitives. The CAP tool, a precompiler generating C++ source code, enables application programmers to specify at a high level of abstraction the set of threads present in the application, the processing operations offered by these threads, and the parallel constructs specifying the flow of data and parameters between operations. A configuration map specifies the mapping between CAP threads and operating system processes, possibly located on different computers. The generated program may run on various parallel configurations without recompilation. We discuss the issues of flow control and load balancing and show the solutions offered by CAP. We also show how CAP can be used to generate relatively complex parallel programs incorporating neighbourhood-dependent operations. Finally, we briefly describe a real 3D image processing application, the Visible Human Slice Server (http://***), its implementation according to the previously defined concepts, and its performance.
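CAP's declarative syntax is not reproduced here; purely to illustrate the kind of split/compute/merge flow with flow control that such a tool generates, the following plain-Python sketch bounds the number of work items in flight. The function names and the tile-based decomposition are assumptions made for this example, not CAP constructs.

    from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

    def split(request, n_parts):
        """Split a request into independent work items (e.g. image tiles)."""
        return [(request, part) for part in range(n_parts)]

    def compute(item):
        """The per-thread processing operation (placeholder)."""
        request, part = item
        return part * part

    def merge(results):
        """Combine partial results into the final answer."""
        return sum(results)

    def pipeline(request, n_parts=64, n_threads=4, max_in_flight=8):
        results, pending = [], set()
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            for item in split(request, n_parts):
                if len(pending) >= max_in_flight:    # flow control: bound queued work
                    done, pending = wait(pending, return_when=FIRST_COMPLETED)
                    results.extend(f.result() for f in done)
                pending.add(pool.submit(compute, item))
            results.extend(f.result() for f in pending)
        return merge(results)

    print(pipeline("slice-request"))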
Moments are among the best-known feature descriptors that can be extracted from an image; their mathematical properties and versatility as feature extractors are well studied. This paper presents a design of moment generators, using established techniques in digital filters and Very Large Scale Integration (VLSI) processing combined under a component-based design framework. Analytically, the moment generator architecture is constructed by cascading single-pole stages of a relatively simple filter suitable for implementation on an ASIC platform, and capable of producing a linear combination of moments. Individual sets of moments can be extracted using dematrixing techniques, which could also be realised in the form of a pre-programmable logic table. A parallel implementation of the design is described using C*, a data-parallel extension of ANSI C. A preliminary evaluation of the design and implementation is also presented.
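The filter-cascade idea can be illustrated in a few lines: repeatedly applying a single-pole accumulator y[n] = y[n-1] + x[n] to a scan line yields outputs that are fixed linear combinations of the geometric moments, and a small matrix inverse (the "dematrixing" step, realisable as a lookup table in hardware) recovers the individual moments. The 1-D case and the numerical construction of the mixing matrix below are assumptions made for this sketch, not the paper's architecture.

    import numpy as np

    def cascade(signal, depth):
        """Final outputs of `depth` cascaded single-pole accumulators
        (y[n] = y[n-1] + x[n]); output k is the last sample after k+1 passes."""
        outs, acc = [], np.asarray(signal, float)
        for _ in range(depth):
            acc = np.cumsum(acc)
            outs.append(acc[-1])
        return np.array(outs)

    def mixing_matrix(n_samples, depth):
        """Matrix B with cascade(f) == B @ [sum_x x**p * f[x] for p < depth],
        obtained numerically by probing the (linear) cascade with unit impulses."""
        probes = np.linspace(0, n_samples - 1, depth).astype(int)
        outs = np.column_stack([cascade(np.eye(n_samples)[j], depth) for j in probes])
        vander = np.array([[float(j) ** p for j in probes] for p in range(depth)])
        return outs @ np.linalg.inv(vander)

    # Recover the first three geometric moments of a 1-D scan line.
    n, depth = 64, 3
    line = np.random.rand(n)
    B = mixing_matrix(n, depth)
    moments = np.linalg.solve(B, cascade(line, depth))        # the dematrixing step
    reference = [sum(x ** p * line[x] for x in range(n)) for p in range(depth)]
    assert np.allclose(moments, reference)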
An image cytometric method for quantifying integrated fluorescence was developed to measure the relative DNA contents of bacterial nucleoids. Image analysis was performed with newly developed macros in combination with the program Object-Image, all downloadable from http://***/***. Four aspects of the method were investigated. (i) Good linearity was found over a ten-fold range of fluorescence intensity in a test with a calibration kit of fluorescent latex spheres. (ii) The accuracy of the method was tested with a narrowly distributed Escherichia coli population, which was obtained by growing cells into stationary phase. The width of the image cytometric distribution was approximately 6%, in good agreement with results obtained by flow cytometry. (iii) The error contribution of manual focusing could be kept below 2%, although a strong dependency between integrated fluorescence and focus position was observed. (iv) The results were verified with a flow cytometer, which gave similar distributions for the DNA contents per cell expressed in chromosome equivalents (4.8 fg of DNA). We used the presented method to evaluate whether the DNA conformation had any effect on the total fluorescence of bacterial nucleoids. Experiments using nucleoids with the same amount of DNA in either a dispersed or a compact conformation showed no significant difference in integrated fluorescence, indicating that it is possible to determine the DNA content per nucleoid independently of the actual organization of the DNA.
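The paper's Object-Image macros are not reproduced here; as a minimal sketch of integrated-fluorescence measurement, one can background-correct and sum the intensity over each labelled object. The global threshold segmentation and the median background estimate are assumptions of this sketch and are not claimed to match the paper's procedure.

    import numpy as np
    from scipy import ndimage

    def integrated_fluorescence(image, threshold):
        """Background-corrected integrated intensity of each object above threshold."""
        mask = image > threshold
        background = np.median(image[~mask])        # crude background estimate
        labels, n = ndimage.label(mask)             # connected components = nucleoids
        return ndimage.sum(image - background, labels, index=np.arange(1, n + 1))

    img = np.full((64, 64), 10.0)
    img[20:30, 20:30] += 100.0                      # one synthetic "nucleoid"
    print(integrated_fluorescence(img, 50.0))       # 100 units over 100 pixels -> 10000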
By mapping computations directly onto hardware, reconfigurable machines promise a tremendous speed-up over traditional computers. However, executing floating-point operations directly in hardware is a waste of resources. Variable-precision fixed-point arithmetic operations can save gates and reduce clock cycle times. This paper investigates the relation between precision and error for image compression/decompression; more precisely, it investigates the relationship between error and bit precision for the Discrete Cosine Transform (DCT) and JPEG. The present work is part of the Cameron project at the Computer Science Department of Colorado State University. This project is roughly divided into three areas: a C-like parallel language called SA-C that is targeted at image processing on reconfigurable computers, an implementation of the VSIP library for image processing in SA-C, and an optimizing compiler for SA-C that targets FPGAs.
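A quick way to reproduce this kind of precision/error trade-off (illustrative only, and independent of SA-C or the Cameron tool chain) is to round the DCT basis, the transform coefficients, and the reconstruction to a fixed-point grid with a chosen number of fractional bits and measure the resulting error; the 8x8 block size and the error metric below are assumptions made for this sketch.

    import numpy as np

    def dct_matrix(n=8):
        """Orthonormal DCT-II basis matrix."""
        k = np.arange(n)
        C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
        C[0] *= 1 / np.sqrt(2)
        return C * np.sqrt(2 / n)

    def quantize(x, frac_bits):
        """Round values to a fixed-point grid with `frac_bits` fractional bits
        (overflow is ignored in this sketch)."""
        scale = 2.0 ** frac_bits
        return np.round(x * scale) / scale

    def block_dct_error(block, frac_bits):
        """Forward + inverse 8x8 DCT with quantized basis, coefficients and
        reconstruction; returns the maximum absolute reconstruction error."""
        C = quantize(dct_matrix(), frac_bits)
        coeffs = quantize(C @ block @ C.T, frac_bits)
        recon = quantize(C.T @ coeffs @ C, frac_bits)
        return np.abs(recon - block).max()

    rng = np.random.default_rng(0)
    block = rng.integers(0, 256, size=(8, 8)).astype(float) - 128.0
    for bits in (4, 6, 8, 10, 12):
        print(bits, block_dct_error(block, bits))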
An adaptive neighborhood contrast enhancement (ANCE) technique was developed to improve the perceptibility of features in digitized mammographic images for use in breast cancer screening. The computationally intensive algorithm was implemented on a cluster of 30 DEC Alpha processors using the message passing interface (MPI). The parallel implementation of the ANCE technique utilizes histogram-based image partitioning with each partition consisting of pixels of the same gray-level value regardless of their location in the image. The master processor allots one set of pixels to each slave processor. The slave returns the results to the master, and the master then sends a new set of pixels to the slave for processing. This procedure continues until there are no sets of pixels left. The subdivision of the original image based on gray-level values guarantees that slave processors do not process the same pixel, and is specifically well-suited to the characteristics of the ANCE algorithm. The parallelism value of the problem is approximately 16, i.e., the performance does not improve significantly when more than 16 processors are used. The result is a substantial improvement in processing time, leading to the enhancement of 4K × 4K pixel images in the range of 20 to 60 seconds.
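A minimal mpi4py sketch of this master/slave distribution is given below; the enhancement routine is only a placeholder, and the dynamic hand-out of one gray-level pixel set at a time (rather than the paper's exact scheduling) is an assumption made for the example.

    from mpi4py import MPI
    import numpy as np

    def enhance_set(gray_level, pixel_indices, image):
        """Placeholder for the per-gray-level ANCE computation."""
        return gray_level, pixel_indices, image.ravel()[pixel_indices].astype(float)

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    image = np.random.randint(0, 256, (512, 512)) if rank == 0 else None
    image = comm.bcast(image, root=0)               # every process needs the image

    if rank == 0:                                   # master: one pixel set per gray level
        tasks = [(g, np.flatnonzero(image == g)) for g in np.unique(image)]
        status, active, results = MPI.Status(), 0, []
        for dest in range(1, size):                 # prime each slave with one set
            task = tasks.pop() if tasks else None
            comm.send(task, dest=dest)
            active += task is not None
        while active:                               # hand out remaining sets on demand
            results.append(comm.recv(source=MPI.ANY_SOURCE, status=status))
            task = tasks.pop() if tasks else None
            comm.send(task, dest=status.Get_source())
            if task is None:
                active -= 1
        # ... assemble the enhanced image from `results`
    else:                                           # slave: process sets until told to stop
        while (task := comm.recv(source=0)) is not None:
            comm.send(enhance_set(task[0], task[1], image), dest=0)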
In this paper, we present a new intelligent agent-based method to design filter banks that maximize compression quality. In this method, a multi-agent system containing cooperating intelligent agents with different roles is developed to search for filter banks that improve image compression quality. The multi-agent system consists of one generalization agent and several problem formulation, optimization, and compression agents. The generalization agent performs problem decomposition and result collection: it distributes optimization tasks to the optimization agents, and later collects results and selects one solution that works well on all training images as the final output. Problem formulation agents build optimization models that are used by the optimization agents. The optimization formulation includes both the overall performance of image compression and metrics of individual filters. The compression performance is provided by the image coding agent. Optimization agents apply various optimization methods to find the best filter bank for individual training images. Our method is modular and flexible, and is suitable for distributed processing. In experiments, we applied the proposed method to a set of benchmark images and designed filter banks that improve the compression performance of existing filter banks.
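Purely to illustrate the coordination flow (not the paper's agents or optimization models), the sketch below has a "generalization agent" farm out one optimization task per training image and keep the candidate filter bank that scores best across all images; optimize_for_image and compression_quality are hypothetical placeholders.

    from concurrent.futures import ProcessPoolExecutor

    def optimize_for_image(image):
        """Placeholder optimization agent: returns a candidate filter bank."""
        return {"tuned_for": image}                 # stand-in for real filter coefficients

    def compression_quality(filter_bank, image):
        """Placeholder compression agent: returns a quality score such as PSNR."""
        return 40.0 if filter_bank["tuned_for"] == image else 35.0

    def generalization_agent(training_images):
        """Distribute one optimization task per image, then keep the candidate
        whose worst-case quality over all training images is highest."""
        with ProcessPoolExecutor() as pool:
            candidates = list(pool.map(optimize_for_image, training_images))
        return max(candidates,
                   key=lambda fb: min(compression_quality(fb, im) for im in training_images))

    if __name__ == "__main__":
        print(generalization_agent(["lena", "barbara", "goldhill"]))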
ISBN (print): 3540656413
The proceedings contain 69 papers. The special focus of this conference is on Parallel Numerics, Parallel Computing in Image Processing, Video Processing, and Multimedia. The topics include: non-standard parallel solution strategies for distributed sparse linear systems; optimal tridiagonal solvers on mesh interconnection networks; parallel pivots LU algorithm on the Cray T3E; experiments with parallel one-sided and two-sided algorithms for SVD; combined systolic array for matrix portrait computation; a class of explicit two-step Runge-Kutta methods with enlarged stability regions for parallel computers; a parallel strongly implicit algorithm for solution of diffusion equations; a parallel algorithm for Lagrange interpolation on k-ary n-cubes; long range correlations among multiple processors; a Monte Carlo method with inherent parallelism for numerically solving partial differential equations with boundary conditions; blocking techniques in numerical software; HPF and numerical libraries; an object library for parallel sparse array computation; performance analysis and derived parallelization strategy for an SCF program at the Hartree-Fock level; computational issues in optimizing ophthalmic lenses; parallel finite element modeling of solidification processes; architectural approaches for multimedia processing; on parallel reconfigurable architectures for image processing; parallel multiresolution image segmentation with watershed transformation; and solving irregular inter-processor data dependency in image understanding tasks.
Simulations of classical molecular dynamics (MD) systems can be sped up considerably by parallelizing the existing codes for distributed memory machines. In classical MD the CPU time is typically a function of the square of the number of atoms, so the size of the molecular system that can be solved is often limited by the CPU time available. There are different approaches to reducing computation time. One consists in parallelizing sequential O(N^2) algorithms; the other is replacing the calculation of non-bonding forces by a less complex algorithm which can then be parallelized. We have generated a code (MEGADYN) for the simulation of MD of large simulation ensembles (up to 10^6 atoms) on the basis of classical force field methods. A reduction of the complexity of the force and energy calculation down to O(N) was achieved by applying Greengard's fast multipole method (FMM) to the Coulomb interaction. Within the framework of FMM the periodic boundary conditions are realized in a minimum-image-convention manner. Thus MEGADYN can be used to simulate NVT as well as NPT ensembles. (C) 1999 Elsevier Science B.V. All rights reserved.
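For scale, the quadratic baseline that the fast multipole method replaces is the direct pairwise Coulomb sum; a minimal NumPy version (illustrative only, not MEGADYN code, in arbitrary units and without cutoffs or periodic boundaries) is:

    import numpy as np

    def coulomb_direct(positions, charges):
        """Direct O(N^2) pairwise Coulomb energy and forces; doubling the number
        of atoms roughly quadruples the cost, which is what FMM reduces to O(N)."""
        n = len(positions)
        energy = 0.0
        forces = np.zeros_like(positions)
        for i in range(n):
            r = positions[i] - positions[i + 1:]        # vectors to all later atoms
            dist = np.linalg.norm(r, axis=1)
            qq = charges[i] * charges[i + 1:]
            energy += np.sum(qq / dist)
            f = (qq / dist**3)[:, None] * r             # force on atom i from each j
            forces[i] += f.sum(axis=0)
            forces[i + 1:] -= f                         # Newton's third law
        return energy, forces

    rng = np.random.default_rng(1)
    pos = rng.uniform(0.0, 10.0, (1000, 3))
    q = rng.choice([-1.0, 1.0], 1000)
    print(coulomb_direct(pos, q)[0])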