A parallel decompositional algorithm and VLSI architecture are proposed for computing the output of a stack filter over a single window of input samples using Fibonacci p-codes. For a subclass of positive Boolean functions (PBFs), a more efficient parallel algorithm and VLSI architecture for running stack filtering are also presented. The area-time complexities of the proposed designs are estimated.
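As a rough illustration of what a stack filter computes over one window, the sketch below uses ordinary threshold decomposition, not the Fibonacci p-code decomposition the abstract proposes. The names `stack_filter` and `majority` are hypothetical, and the majority function is just one convenient PBF (it yields the median filter).

```python
from typing import Callable, Sequence


def stack_filter(window: Sequence[int],
                 pbf: Callable[[Sequence[int]], int],
                 levels: int) -> int:
    """Stack-filter output for one window of samples in 0..levels-1.

    Assumption: plain threshold decomposition, shown only for intuition.
    """
    out = 0
    for t in range(1, levels):
        # Binary slice at threshold t: 1 where the sample reaches t.
        bits = [1 if x >= t else 0 for x in window]
        # Each slice is filtered independently, so the slices could be
        # processed in parallel; the binary outputs are then summed.
        out += pbf(bits)
    return out


def majority(bits: Sequence[int]) -> int:
    """Majority vote, a positive Boolean function (gives the median)."""
    return 1 if 2 * sum(bits) > len(bits) else 0


print(stack_filter([3, 0, 2, 1, 3], majority, levels=4))
```

With the majority PBF, the result equals the median of the window, which gives a quick sanity check for any alternative decomposition.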
The goal of this paper is to develop parallel algorithms that, given a learning sample, identify a regular language by means of a nondeterministic finite automaton (NFA). A sample is a pair of finite sets containing positive and negative examples. Given a sample, a minimal NFA that represents the target regular language is sought. We formulate the task of finding an NFA that accepts all positive examples and rejects all negative ones as a constraint satisfaction problem, and then propose parallel algorithms to solve it. The results of comprehensive computational experiments on a variety of inference tasks are reported. The question of minimizing an NFA consistent with a learning sample is computationally hard.
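The constraint that any candidate NFA must satisfy can be sketched as a consistency check. This is only the verification step, not the paper's constraint satisfaction encoding or its parallel search; the function names and the dictionary-based transition format are assumptions made for illustration.

```python
def nfa_accepts(delta, initial, finals, word):
    """Simulate an NFA on a word by tracking the set of reachable states.

    delta maps (state, symbol) to a set of next states (assumed format).
    """
    current = {initial}
    for symbol in word:
        current = {q for p in current for q in delta.get((p, symbol), ())}
    return bool(current & finals)


def is_consistent(delta, initial, finals, positives, negatives):
    """True if the NFA accepts every positive and rejects every negative."""
    accepts_all = all(nfa_accepts(delta, initial, finals, w) for w in positives)
    rejects_all = not any(nfa_accepts(delta, initial, finals, w) for w in negatives)
    return accepts_all and rejects_all


# Hypothetical two-state NFA for words over {a, b} that end in 'a'.
delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
print(is_consistent(delta, 0, {1}, ["a", "ba", "aa"], ["", "b", "ab"]))
```

A search procedure would enumerate or constrain candidate transition relations of increasing size and stop at the first one that passes this check.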
Parallel algorithms on SIMD (single-instruction stream, multiple-data stream) machines for hierarchical clustering and cluster validity computation are proposed. The machine model uses a parallel memory system and an alignment network to facilitate parallel access to both the pattern matrix and the proximity matrix. For a problem with N patterns, the number of memory accesses is reduced from O(N^3) on a sequential machine to O(N^2) on an SIMD machine with N PEs.
The maximal linear forest problem is to find, given a graph G = (V, E), a maximal subset of V that induces a linear forest. Three parallel algorithms for this problem are presented. The first one is randomized and runs in O(log n) expected time using n^2 processors on a CRCW PRAM. The second one is deterministic and runs in O(log^2 n) time using n^4 processors on an EREW PRAM. The last one is deterministic and runs in O(log^5 n) time using n^3 processors on an EREW PRAM. The results put the problem in the class NC.
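To make the problem statement concrete, here is a sequential greedy baseline, not any of the three parallel algorithms from the abstract. A linear forest is a graph whose components are all paths, which is equivalent to being acyclic with maximum degree at most 2. The function names and the adjacency-set representation are assumptions.

```python
def is_linear_forest(vertices, adj):
    """Check that the subgraph induced by `vertices` is a disjoint union of paths."""
    vs = set(vertices)
    degree_sum = 0
    for v in vs:
        deg = len(adj[v] & vs)
        if deg > 2:                 # a path vertex has at most two neighbours
            return False
        degree_sum += deg
    edges = degree_sum // 2
    # Count connected components with an iterative depth-first search.
    seen, components = set(), 0
    for v in vs:
        if v in seen:
            continue
        components += 1
        seen.add(v)
        stack = [v]
        while stack:
            u = stack.pop()
            for w in adj[u] & vs:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    # A graph is a forest exactly when edges = vertices - components.
    return edges == len(vs) - components


def greedy_maximal_linear_forest(adj):
    """Add vertices one by one while the induced subgraph stays a linear forest."""
    chosen = set()
    for v in sorted(adj):
        if is_linear_forest(chosen | {v}, adj):
            chosen.add(v)
    return chosen


# Star with centre 0 and three leaves: only two leaves can join the centre.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(greedy_maximal_linear_forest(star))
```

Because the property is hereditary (every induced subgraph of a linear forest is a linear forest), a vertex rejected once can never be added later, so a single pass yields a maximal set. The pass is inherently sequential, which is what makes the NC algorithms nontrivial.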
We address a geometric problem called segment dragging. We have n "obstacles" in the plane and want to preprocess them so that, given a query vertical line segment s that intersects no obstacles, the first obstacle hit by s when we drag s horizontally to the right can be found efficiently. We present an O(log n) time, O(n) processor parallel algorithm for preprocessing when the obstacles are points or nonintersecting line segments. After preprocessing, a query can be answered in O(log n) time using a single processor. Our model of parallel computation is the EREW PRAM.
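The query itself is easy to state in code for point obstacles. The sketch below is a brute-force O(n) scan with no preprocessing, useful only as a reference answer against which a preprocessed O(log n) structure could be checked; the function name `first_hit` and its argument layout are assumptions.

```python
def first_hit(points, x0, y_lo, y_hi):
    """First point obstacle hit when dragging a vertical segment to the right.

    The segment sits at x = x0 and spans y_lo..y_hi. Returns the hit point
    as (x, y), or None if the segment escapes to infinity.
    """
    candidates = [(x, y) for (x, y) in points
                  if x > x0 and y_lo <= y <= y_hi]
    # Tuples compare by x first, so min() picks the leftmost candidate.
    return min(candidates, default=None)


obstacles = [(5, 1), (3, 2), (2, 10), (4, 0)]
print(first_hit(obstacles, x0=0, y_lo=0, y_hi=3))
```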
For solving systems of linear algebraic equations with block-tridiagonal matrices arising in geoelectrics problems, the parallel matrix sweep algorithm, the preconditioned conjugate gradient method, and the square root method are proposed and implemented numerically on multi-core Intel CPUs with NVIDIA graphics processors. The efficiency and optimization of the parallel algorithms are investigated for the problem with quasi-model data. Crown Copyright (C) 2012 Published by Elsevier B.V. All rights reserved.
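The matrix sweep method generalizes the scalar sweep (Thomas algorithm) for tridiagonal systems, with scalar divisions replaced by block inversions. The sketch below shows only the sequential scalar case as a point of reference; it is not the parallel block version from the paper, it does no pivoting, and the function name and argument layout are assumptions.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system by forward elimination and back substitution.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    Assumes a well-conditioned (e.g. diagonally dominant) matrix.
    """
    n = len(b)
    cp = [0.0] * n   # modified super-diagonal coefficients
    dp = [0.0] * n   # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# 3x3 system with diagonal 2 and off-diagonals -1; exact solution is [1, 2, 3].
print(thomas_solve([0, -1, -1], [2, 2, 2], [-1, -1, 0], [0, 0, 4]))
```

Both loops carry a dependency from one row to the next, which is why parallel variants restructure the elimination instead of simply distributing the loop iterations.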
Modelling of salt transfer processes in fractal-structured media is considered on the basis of fractional differential equations with Caputo-Gerasimov derivatives with respect to the space variables. The initial-boundary value problem is solved using a locally one-dimensional finite difference scheme. A procedure for approximating the fractional derivative is proposed to lower the computational complexity of the solution process. Parallel algorithms for distributed memory systems and GPUs are considered. An analysis of one-dimensional and red-black data partitioning schemes is presented, and a new parametric scheme that has better characteristics under certain conditions is proposed.
Authors:
George, AD; Kim, K
Univ Florida, Dept Elect & Comp Engn, High Performance Comp & Simulat HCS Res Lab, Gainesville, FL 32611 USA
Quiet submarine threats and high clutter in the littoral undersea environment increase the processing demands on beamforming arrays, particularly for applications which require in-array autonomous operation. Whereas traditional single-aperture beamforming approaches may falter, the Split-Aperture Conventional Beamforming (SA-CBF) algorithm can be used to meet stringent requirements for more precise bearing estimation. Moreover, by coupling each transducer node with a microprocessor, parallel processing of the split-aperture beamformer on a distributed system can glean advantages in execution speed, fault tolerance, scalability, and cost. In this paper, parallel algorithms for SA-CBF are introduced using coarse-grained and medium-grained forms of decomposition. Performance results from parallel and sequential algorithms are presented using a distributed system testbed consisting of a cluster of workstations connected by a high-speed network. The execution times, parallel efficiencies, and memory requirements of each parallel algorithm are presented and analyzed. The results of these analyses demonstrate that parallel in-array processing holds the potential to meet the needs of future advanced sonar beamforming algorithms in a scalable fashion.
Many approaches have been proposed for deriving tests from finite state machine (FSM) specifications with respect to some established coverage criteria. A core problem in FSM-based testing is the derivation of input sequences that can distinguish states of an FSM specification, also known as distinguishing sequences. A major effort in the construction of these sequences is based on the derivation of a successor search tree labeled by sets of pairs of states of the given machine. We aim at reducing the time associated with such constructions through the use of state-of-the-art parallel technologies. Namely, we propose a parallel algorithm that we implement and evaluate on multicore CPUs and on many-core GPUs. We evaluate two alternative GPU implementations that use the CUDA and Thrust software platforms, and a solution based on a network of workstations. The latter uses workload partitioning based on Divisible Load Theory. A rigorous set of experiments highlights the differences of the proposed implementations in terms of execution time and speedup.
We aim at reducing the time associated with the construction of the successors of all state pairs of a given non-deterministic finite state machine. We propose a parallel algorithm that we implement and evaluate on multicore CPUs and on many-core GPUs. We evaluate two alternative GPU implementations that use the CUDA and Thrust software platforms. Additionally, we propose and evaluate a Network of Workstations solution based on Divisible Load Theory. A rigorous set of experiments highlights the differences of the proposed implementations in terms of execution time and speedup.
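The successor computation for state pairs can be sketched sequentially as follows. This is one plausible reading of the construction, not the paper's CUDA, Thrust, or workstation-network implementation: under a given input, a pair of states leads to every pair of next states that can be reached with the same output, since those branches are not separated by that input. The transition format and function names are assumptions.

```python
from itertools import combinations


def pair_successors(delta, s1, s2, inp):
    """Successor pairs of {s1, s2} under `inp` in a non-deterministic FSM.

    delta maps (state, input) to a set of (output, next_state) tuples
    (assumed format). A frozenset of size 1 means both states merged.
    """
    succ = set()
    for out1, n1 in delta.get((s1, inp), ()):
        for out2, n2 in delta.get((s2, inp), ()):
            if out1 == out2:        # same output: the input does not separate them
                succ.add(frozenset((n1, n2)))
    return succ


def all_pair_successors(delta, states, inputs):
    """Successors of every unordered state pair under every input."""
    return {
        (frozenset((s1, s2)), inp): pair_successors(delta, s1, s2, inp)
        for s1, s2 in combinations(sorted(states), 2)
        for inp in inputs
    }


# Hypothetical three-state machine with a single input 'a'.
delta = {
    (0, "a"): {(0, 1)},
    (1, "a"): {(0, 2), (1, 2)},
    (2, "a"): {(1, 0)},
}
table = all_pair_successors(delta, {0, 1, 2}, ["a"])
print(table[(frozenset({0, 1}), "a")])
```

Each (pair, input) entry is computed independently of the others, which is the property that makes the problem a natural fit for data-parallel hardware and for divisible-load partitioning.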
A parallel approach to contour extraction and coding on an Exclusive Read Exclusive Write (EREW) Parallel Random Access Machine (PRAM) is presented and analyzed. The algorithm is intended for binary images. The labeled contours can be represented by lists of coordinates, chain codes, or any other user-designed codes. Using O(n^2 / log n) processors, the algorithm runs in O(log n) time, where n × n is the size of the processed binary image.
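The first step of contour extraction, deciding which foreground pixels lie on a contour, can be sketched as below. This is a sequential illustration under a 4-neighbour boundary definition chosen here as an assumption; it does not perform the contour labeling or chain coding described in the abstract, and the function name is hypothetical.

```python
def contour_pixels(image):
    """Return the set of (row, col) foreground pixels on a contour.

    image is a list of rows of 0/1 values. A foreground pixel is on a
    contour if any 4-neighbour is background or outside the image.
    """
    rows, cols = len(image), len(image[0])

    def is_background(r, c):
        outside = not (0 <= r < rows and 0 <= c < cols)
        return outside or image[r][c] == 0

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return {
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if image[r][c] == 1
        and any(is_background(r + dr, c + dc) for dr, dc in neighbours)
    }


# 5x5 image with a 3x3 block of ones in the middle.
img = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
       for r in range(5)]
print(sorted(contour_pixels(img)))
```

The test for each pixel reads only its own neighbourhood, so this step parallelizes trivially; the harder part on a PRAM is linking the boundary pixels into ordered, labeled contours.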