This paper presents efficient and portable implementations of a powerful image enhancement process, the Symmetric Neighborhood Filter (SNF), and an image segmentation technique that makes use of the SNF and a variant of the conventional connected components algorithm which we call delta-Connected Components. We use efficient techniques for distributing and coalescing data as well as efficient combinations of task and data parallelism. The image segmentation algorithm makes use of an efficient connected components algorithm based on a novel approach for parallel merging. The algorithms have been coded in SPLIT-C and run on a variety of platforms, including the Thinking Machines CM-5, IBM SP-1 and SP-2, Cray Research T3D, Meiko Scientific CS-2, Intel Paragon, and workstation clusters. Our experimental results are consistent with the theoretical analysis (and provide the best known execution times for segmentation, even when compared with machine-specific implementations). Our test data include difficult images from the Landsat Thematic Mapper (TM) satellite data.
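As a rough illustration of the delta-Connected Components idea, the sequential sketch below labels pixels with a union-find structure. It assumes (the abstract does not say) that two 4-adjacent pixels belong to the same component when their gray levels differ by at most delta; the function name and the 4-neighbor rule are hypothetical choices, and the paper's parallel merging step is not reproduced here.

```python
def delta_connected_components(image, delta):
    """Label 4-adjacent pixels whose gray levels differ by at most delta.

    Assumption: this adjacency rule is an illustrative reading of
    "delta-Connected Components", not the paper's exact definition.
    Each pixel receives the smallest flat index in its component.
    """
    rows, cols = len(image), len(image[0])
    parent = list(range(rows * cols))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller index as root so labels are deterministic.
            parent[max(ra, rb)] = min(ra, rb)

    for r in range(rows):
        for c in range(cols):
            idx = r * cols + c
            # Right neighbor.
            if c + 1 < cols and abs(image[r][c] - image[r][c + 1]) <= delta:
                union(idx, idx + 1)
            # Bottom neighbor.
            if r + 1 < rows and abs(image[r][c] - image[r + 1][c]) <= delta:
                union(idx, idx + cols)

    return [[find(r * cols + c) for c in range(cols)] for r in range(rows)]
```

With delta set to 0 the sketch reduces to ordinary connected components over equal gray levels, which is why the abstract can describe the method as a variant of the conventional algorithm.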
This correspondence presents several parallel algorithms for image template matching on an SIMD array processor with a hypercube interconnection network. For an N by N image and an M by M window, the time complexity is reduced from O(N^2 M^2) for the serial algorithm to O(M^2/K^2 + M * log_2 N/K + log_2 N * log_2 K) for the N^2 K^2-PE system (1 ≤ K ≤ M), or to O(N^2 M^2/L^2) for the L^2-PE system (L < N). With efficient use of the inter-PE communication network, each PE requires only a small local memory, many unnecessary data transmissions are eliminated, and the time complexity is greatly reduced.
Source record: "Parallel Algorithms for Image Template Matching on Hypercube SIMD Computers," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-9, no. 6, pp. 835-841, Nov. 1987. Author listed in the record: Zhixi Fang. DOI: 10.1109/TPAMI.1987.4767990. Author keywords: complexity, Gray code, hypercube interconnection network, parallel algorithm, parallel recursive procedure, SIMD computer, template matching.
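For reference, the serial baseline that the parallel bounds are compared against can be sketched as four nested loops. This is a minimal sketch under stated assumptions: the sum of squared differences is used as the match score and only fully overlapping window positions are scanned, neither of which is specified in the abstract; `match_template_ssd` is a hypothetical name.

```python
def match_template_ssd(image, template):
    """Serial template matching over an N x N image and an M x M template.

    Assumptions (not from the abstract): the score is the sum of squared
    differences, and only positions where the window fits are tested.
    Cost is O((N - M + 1)^2 * M^2), i.e. O(N^2 M^2) when M is small
    relative to N.
    """
    n, m = len(image), len(template)
    best_pos, best_score = None, None
    for r in range(n - m + 1):          # window row offset
        for c in range(n - m + 1):      # window column offset
            score = 0
            for i in range(m):          # rows inside the window
                for j in range(m):      # columns inside the window
                    diff = image[r + i][c + j] - template[i][j]
                    score += diff * diff
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```

The two outer loops are independent across window positions, which is the parallelism a multi-PE array can exploit.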
We present an efficient O(n) numerical algorithm for first-order approximation of geodesic distances on geometry images, where n is the number of points on the surface. The structure of our algorithm allows efficient implementation on parallel architectures. Two implementations, one on a SIMD processor and one on a GPU, are discussed. Numerical results demonstrate up to four orders of magnitude improvement in execution time compared to state-of-the-art algorithms.
Various parallel implementations of algorithms for the QR decomposition of a matrix are compared using shared memory multiprocessors. Algorithms based on both Givens and Householder transformations are considered. A number of parallelisation techniques are used with particular emphasis on algorithms which allocate work to tasks dynamically. The results indicate that one version is significantly better than the others.
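To make the Givens-based variant concrete, the sketch below computes a QR decomposition sequentially with plane rotations. It is an illustrative textbook formulation, not one of the implementations compared in the paper; the function name is hypothetical and no dynamic task allocation is attempted.

```python
import math


def givens_qr(a):
    """Return (q, r) with a = q * r, q orthogonal and r upper triangular.

    Sequential sketch: each rotation mixes two adjacent rows to zero one
    subdiagonal entry, working up each column from the bottom.
    """
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            if y == 0.0:
                continue  # entry is already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h
            # Apply the rotation to rows i-1 and i of r.
            for k in range(n):
                u, v = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * u + s * v
                r[i][k] = -s * u + c * v
            # Apply the transposed rotation to columns i-1 and i of q,
            # so that the product q * r stays equal to a.
            for k in range(m):
                u, v = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * u + s * v
                q[k][i] = -s * u + c * v
    return q, r
```

Rotations that touch disjoint row pairs do not interfere with each other, which is one reason Givens-based schemes are a natural candidate for dividing work among tasks.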
Author: MEHRNOOSH, M (Grumman Data Syst., San Diego, CA, USA)
In this paper, we study a class of VLSI organizations with optical interconnects for fast solutions to several image processing tasks. The organization and operation of these architectures are based on a generic model called OMC, which is proposed to understand the computational limits in using free space optics in VLSI parallel processing systems. The relationships between OMC and shared memory models are discussed in this paper. Also, three physical implementations of OMC are presented. Using OMC, we present several parallel algorithms for fine grain image computing. We categorize our results in the following order. First, we present a set of processor efficient optimal O(log N) algorithms and a set of constant time algorithms for finding geometric properties of digitized images. Finally, we focus on special purpose designs tailored to meet both the computation and communication needs of problems such as those involving irregular sparse matrices.
Three commonly used traversal methods for binary trees (forests) are pre-order, in-order and post-order. It is well known that sequential algorithms for these traversals take O(N) time, where N is the total number of nodes. This paper establishes a one-to-one correspondence between the set of nodes that possess a right sibling and the set of leaf nodes for any forest. For the case of pre-order traversal, this result is shown to provide an alternate characterization that leads to a simple and elegant parallel algorithm of time complexity O(log N), with or without read-conflicts, on an N-processor SIMD shared memory model, where N is the total number of nodes in a forest.
A maximum a posteriori (MAP) algorithm is presented for the estimation of spin-density and spin-spin decay distributions from frequency- and phase-encoded magnetic resonance imaging data. Linear spatial localization gradients are assumed: the y-encode gradient is applied during the phase preparation time of duration tau before measurement collection, and the x-encode gradient is applied during the full data collection time t ≥ 0. The MRI signal model developed in [22] is used, in which a signal resulting from M phase encodes (rows) and N frequency encode dimensions (columns) is modeled as a superposition of MN sine-modulated exponentially decaying sinusoids with unknown spin-density and spin-spin decay parameters. The nonlinear least-squares MAP estimate of the spin-density and spin-spin decay distributions solves for the 2MN spin-density and decay parameters minimizing the squared error between the measured data and the sine-modulated exponentially decaying signal model using an iterative expectation-maximization algorithm. A covariance-diagonalizing transformation is derived which decouples the joint estimation of MN sinusoids into M separate N-sinusoid optimizations, yielding an order-of-magnitude speedup in convergence. The MAP solutions are demonstrated to deliver a decrease in standard deviation of image parameter estimates on brain phantom data of greater than a factor of two over Fourier-based estimators of the spin-density and spin-spin decay distributions. A parallel processor implementation is demonstrated which maps the N-sinusoid coupled minimization to separate individual simple minimizations, one for each processor.
In this paper four parallel algorithms for the evaluation of finite series of orthogonal polynomials are introduced. The algorithms are based on the Forsythe and Clenshaw sequential algorithms. Several tests carried out on a Cray T3D are presented.
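For readers unfamiliar with the sequential starting point, the Clenshaw recurrence can be sketched as follows. The choice of Chebyshev polynomials of the first kind is an assumption made for illustration (the abstract covers orthogonal polynomials in general), and `clenshaw_chebyshev` is a hypothetical name; none of the paper's four parallel variants is reproduced.

```python
def clenshaw_chebyshev(coeffs, x):
    """Evaluate sum_k coeffs[k] * T_k(x) with the Clenshaw recurrence.

    Assumption: T_k are Chebyshev polynomials of the first kind, which
    satisfy T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x). coeffs must be non-empty.
    """
    b1 = 0.0  # b_{k+1} in the recurrence
    b2 = 0.0  # b_{k+2} in the recurrence
    # Run b_k = c_k + 2 x b_{k+1} - b_{k+2} from the highest degree down to 1.
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * x * b1 - b2, b1
    # Final step uses T_0 = 1 and T_1 = x.
    return coeffs[0] + x * b1 - b2
```

Each step of the loop depends on the two previous values, so the recurrence is inherently sequential as written; that dependency is what a parallel reformulation has to break.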
Computational methods based on the use of adaptively constructed nonuniform meshes reduce the amount of computation and storage necessary to perform many scientific calculations. The adaptive construction of such nonuniform meshes is an important part of these methods. In this paper, we present a parallel algorithm for adaptive mesh refinement that is suitable for implementation on distributed-memory parallel computers. Experimental results obtained on the Intel DELTA are presented to demonstrate that for scientific computations involving the finite element method, the algorithm exhibits scalable performance and has a small run time in comparison with other aspects of the scientific computations examined. It is also shown that the algorithm has a fast expected running time under the parallel random access machine (PRAM) computation model.
With the development of rooftop photovoltaic (PV) power generation technology and the increasingly urgent need to improve supply reliability levels in remote areas, the islanded microgrid with photovoltaic and energy storage systems (IMPE) is developing rapidly. The high costs of photovoltaic panel material and energy storage battery material have become the primary factors that hinder the development of the IMPE. The advantages and disadvantages of different types of photovoltaic panel materials and energy storage battery materials are analyzed in this paper, and guidance is provided on material selection for IMPE planners. The time sequential simulation method is applied to optimize the material demands of the IMPE. The model is solved by parallel algorithms that are provided by a commercial solver named CPLEX. Finally, to verify the model, an actual IMPE is selected as a case system. Simulation results on the case system indicate that the optimization model and corresponding algorithm are feasible. Guidance for material selection and quantity demand for IMPEs in remote areas is provided by this method. (C) 2016 Elsevier B.V. All rights reserved.