The implementation of a parallel algorithm for estimating non-rigid motion vectors using a semi-fluid motion model applied to time-varying satellite imagery is described. Tracking the deformable motion of non-rigid biological objects and of remotely sensed objects such as clouds, atmospheric aerosols and gases, polar sea ice, or ocean currents is an important application domain for the Semi-fluid Motion Analysis (SMA) algorithm. The focus of this paper is on the parallelization of the SMA algorithm for the MasPar MP-2 architecture. The paper describes the implementation issues that were evaluated to make it feasible to compute dense semi-fluid motion estimates from rapid-scan, time-varying geostationary satellite imagery of clouds and weather patterns. Cloud motion vectors from the SMA algorithm can be used to estimate the wind field, which is useful in a variety of meteorological applications. Comparisons between the parallel and sequential implementations of the SMA algorithm, and with manual results, are briefly discussed.
The distance transform (DT) and the medial axis transform (MAT) are two image computation tools used to extract information about the shape of the foreground pixels and their positions relative to each other. These two transforms are used extensively in computer vision and image processing, for example in expanding, shrinking, thinning, and computing shape factors. There are many different distance transforms, each based on a different distance metric. The distance transform with respect to the Euclidean metric is the most desirable to use but is rather time-consuming to compute, so many approximate Euclidean distance transforms (EDTs) are also widely used in computer vision and image processing. The chessboard distance transform (CDT) is a DT that converts an image according to the chessboard distance metric. Traditionally, the MAT and the CDT have been viewed as two completely different image computation problems. In this paper, we first point out that the processes for finding the CDT and the MAT are almost identical. That is, the two transforms are interchangeable through the proposed algorithms; a MAT can be found by using a CDT algorithm and vice versa.
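To make the chessboard distance transform concrete, here is a minimal sketch of the classic sequential two-pass raster scan. It is not the algorithm proposed in the paper. It assumes a binary image given as a list of lists in which 0 marks background and any non-zero value marks foreground; the function name `chessboard_distance_transform` is hypothetical.

```python
def chessboard_distance_transform(image):
    """Chessboard (L-infinity) distance of every pixel to the nearest background pixel.

    Assumption: image is a non-empty list of equal-length rows; 0 = background.
    If the image has no background pixel, every entry stays at the sentinel value.
    """
    rows, cols = len(image), len(image[0])
    sentinel = rows + cols  # larger than any possible chessboard distance in the image
    dist = [[0 if image[r][c] == 0 else sentinel for c in range(cols)]
            for r in range(rows)]

    def relax(r, c, offsets):
        # Take the minimum over already-visited neighbours; every step costs 1.
        for dr, dc in offsets:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                dist[r][c] = min(dist[r][c], dist[rr][cc] + 1)

    # Forward pass: top-left to bottom-right, using the upper and left neighbours.
    for r in range(rows):
        for c in range(cols):
            relax(r, c, ((-1, -1), (-1, 0), (-1, 1), (0, -1)))

    # Backward pass: bottom-right to top-left, using the lower and right neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            relax(r, c, ((1, 1), (1, 0), (1, -1), (0, 1)))

    return dist
```

For a 3 × 3 block of foreground pixels surrounded by a one-pixel background border, the centre pixel receives the value 2 and the other eight foreground pixels receive 1.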
A new VLSI architecture for a real-time pipeline FFT processor is proposed. A hardware-oriented radix-2^2 algorithm is derived by integrating a twiddle factor decomposition technique into the divide-and-conquer approach. The radix-2^2 algorithm has the same multiplicative complexity as the radix-4 algorithm but retains the butterfly structure of the radix-2 algorithm. The single-path delay-feedback architecture is used to exploit the spatial regularity in the signal flow graph of the algorithm. For a length-N DFT computation, the hardware requirement of the proposed architecture is minimal on both dominant components: log_4(N) - 1 complex multipliers and a complex data memory of size N - 1. The validity and efficiency of the architecture have been verified by simulation in the hardware description language VHDL.
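The two hardware counts quoted in the abstract (log_4(N) - 1 complex multipliers and N - 1 complex memory words for a length-N transform) can be tabulated with a few lines of arithmetic. The sketch below only evaluates those two formulas. It assumes N is a power of 4 with N >= 4, and the function name `pipeline_fft_resources` is hypothetical.

```python
def pipeline_fft_resources(n):
    """Return (complex_multipliers, complex_memory_words) for a length-n DFT.

    Evaluates the counts stated in the abstract: log_4(n) - 1 and n - 1.
    Assumption: n is a power of 4 and n >= 4.
    """
    if n < 4:
        raise ValueError("n must be at least 4")
    log4, m = 0, n
    while m > 1:
        if m % 4 != 0:
            raise ValueError("n must be a power of 4")
        m //= 4
        log4 += 1
    return log4 - 1, n - 1
```

For example, `pipeline_fft_resources(256)` returns `(3, 255)` and `pipeline_fft_resources(1024)` returns `(4, 1023)`.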
A VLSI algorithm for solving a special block-five-diagonal system of linear algebraic equations is presented. The algorithm is considered for the VLSI parallel computational model, in which both the time of the algorithm and the area of its design are components of the complexity estimates. The linear system arises from the finite-difference approximation of the first biharmonic boundary value problem. The algorithm computes the solution by a direct method based on Woodbury's formula. For the problem on an n × n grid, the VLSI algorithm needs area A = O(n^2 log^2 n) and time T = O(n log n). The global AT^2 complexity of this method is AT^2 = O(n^4 log^4 n). This result represents the best upper bound for solving this problem in VLSI. Moreover, this algorithmic design could serve as a preliminary step towards the analysis and development of more detailed structures of specialized VLSI computer devices for solving the biharmonic problem.
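The stated AT^2 bound follows directly from the area and time bounds; as a quick check, with A = O(n^2 log^2 n) and T = O(n log n):

```latex
\[
  A\,T^{2}
  = O\!\left(n^{2}\log^{2} n\right)\cdot\left(O(n\log n)\right)^{2}
  = O\!\left(n^{2}\log^{2} n \cdot n^{2}\log^{2} n\right)
  = O\!\left(n^{4}\log^{4} n\right).
\]
```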
Heterogeneous network computing allows the development of a single complex application using a distributed network of machines; these machines may differ in terms of CPU and memory capacity and/or architecture and specialized functions. We present a modeling technique, based on generalized stochastic Petri nets (GSPNs), for the performance analysis of applications targeted to this class of systems (heterogeneous applications). We illustrate the use of the proposed technique by modeling and analyzing the CASA 3D-REACT heterogeneous application.
ISBN: 3540616977 (print)
The proceedings contain 35 papers. The special focus in this conference is on Design and Implementation of Symbolic Computation Systems. The topics include: problem-oriented applications of automated theorem proving; a strongly-typed embeddable computer algebra library; a general framework for implementing calculi and strategies; equality elimination for the tableau method; towards lean proof checking; high performance equational theorem proving; a reflective language based on conditional term rewriting; term rewriting systems; generative geometric modeling in a functional environment; exploiting SML for experimenting with algebraic algorithms; conditional categories and domains; parameterizing object specifications; analyzing the dynamics of a Z specification; integer and rational arithmetic on MasPar; parallel 3-primes FFT algorithm; a master-slave approach to parallel term rewriting on a hierarchical multiprocessor; concepts and applications; document-centered presentation of computing software; animating a non-executable formal specification with a distributed symbolic language; uniform representation of basic algebraic structures in computer algebra; integrating computer algebra with proof planning; structures for symbolic mathematical reasoning and computation; an approach to class reasoning in symbolic computation; an intelligent interface to numerical routines; computer algebra and the world wide web; software architectures for computer algebra; a deductive database for mathematical formulas; a system for computer aided constructive algebraic geometry; making systems communicate and cooperate; a database for number fields; compiling residuation for a multiparadigm symbolic programming language; and pluggability issues in the multi protocol.
The recent accelerated development of scalable computing systems has made possible the coordinated use of a suite of High Performance Computing (HPC) components for computationally demanding problems in embedded applications. These emerging Scalable Heterogeneous High Performance Embedded (SHHiPE) systems are designed using commercial off-the-shelf (COTS) modules. Our current interest is to employ these platforms to solve a variety of problems in real-time signal processing. Large performance gains can be realized by exploiting knowledge of the computational structure of an algorithm through data remapping. We present the motivation for a portable programming paradigm that captures key features of a SHHiPE platform. The Message Passing Interface (MPI) standard is proposed as a basis for the development of this paradigm. An application in sonar is used to illustrate typical communication requirements in SHHiPE systems.
Discrete-event simulation is an important tool for the performance evaluation of parallel systems. The space of tradeoffs is large, however, when attempting to balance model fidelity against simulation execution time. The paper describes a simulator, TAPS (Threaded Application Parallel System Simulator), that, in the context of threaded parallel computations, provides a spectrum of possibilities in this tradeoff space. TAPS is specifically designed to be parallelized; we discuss some crucial considerations regarding its parallelization.
Homing sequences play an important role in the testing of finite state systems and have been used in a number of applications such as hardware fault detection, protocol verification, and learning algorithms. Recent applications of homing sequences involve large DFAs with thousands of states. Such applications motivate the design of a parallel algorithm for this problem. The author presents a deterministic parallel algorithm of time complexity O(√n log^2 n) using a polynomial number of processors on the CREW PRAM model. No faster deterministic parallel algorithm is known for this problem. The author also discusses the parallel complexity of some related problems.
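For readers unfamiliar with homing sequences, the following is a minimal sequential sketch of the standard textbook construction. It is not the parallel CREW PRAM algorithm the abstract refers to. It assumes a minimal, complete Mealy machine encoded as two nested dictionaries, `delta[state][symbol]` for the next state and `lam[state][symbol]` for the output; this encoding and all function names are illustrative assumptions.

```python
from collections import deque


def run(delta, lam, state, word):
    """Apply word from state; return (final_state, tuple_of_outputs)."""
    outputs = []
    for a in word:
        outputs.append(lam[state][a])
        state = delta[state][a]
    return state, tuple(outputs)


def separating_word(delta, lam, s, t, alphabet):
    """Breadth-first search for a shortest word on which s and t differ in output."""
    seen = {(s, t)}
    queue = deque([((s, t), [])])
    while queue:
        (p, q), word = queue.popleft()
        for a in alphabet:
            if lam[p][a] != lam[q][a]:
                return word + [a]
            nxt = (delta[p][a], delta[q][a])
            if nxt[0] != nxt[1] and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + [a]))
    return None  # s and t are equivalent, so the machine is not minimal


def homing_sequence(delta, lam, alphabet):
    """Build a homing sequence by repeatedly splitting the current uncertainty."""
    blocks = [frozenset(delta)]  # sets of states still consistent with the outputs seen
    sequence = []
    while any(len(b) > 1 for b in blocks):
        block = next(b for b in blocks if len(b) > 1)
        s, t = sorted(block)[:2]
        word = separating_word(delta, lam, s, t, alphabet)
        if word is None:
            raise ValueError("machine is not minimal")
        sequence += word
        refined = set()
        for b in blocks:
            groups = {}
            for state in b:
                end, out = run(delta, lam, state, word)
                groups.setdefault(out, set()).add(end)
            refined.update(frozenset(g) for g in groups.values())
        blocks = list(refined)
    return sequence
```

Each round appends a word that separates two states in the same block, so that block splits and the loop ends after at most n - 1 rounds for an n-state machine. After applying the returned sequence, the observed output string determines the final state regardless of the unknown starting state.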