Matched-field processing (MFP) localizes sources more accurately than plane-wave beamforming by employing full-wave acoustic propagation models for the cluttered ocean environment. The minimum variance distortionless response MFP (MVDR-MFP) algorithm incorporates the MVDR technique into the MFP algorithm to enhance beamforming performance. Such an adaptive MFP algorithm involves intensive computational and memory requirements due to its complex acoustic model and environmental adaptation. The real-time implementation of adaptive MFP algorithms for large surveillance areas presents a serious computational challenge where high-performance embedded computing and parallel processing may be required to meet real-time constraints. In this paper, three parallel algorithms based on domain decomposition techniques are presented for the MVDR-MFP algorithm on distributed array systems. The parallel performance factors in terms of execution times, communication times, parallel efficiencies, and memory capacities are examined on three potential distributed systems including two types of digital signal processor arrays and a cluster of personal computers. The performance results demonstrate that these parallel algorithms provide a feasible solution for real-time, scalable, and cost-effective adaptive beamforming on embedded, distributed array systems.
Understanding the demixing effect on the dispersion of particles by large-scale turbulence is very important in practical applications. Using pseudospectral and Lagrangian approaches, we have simulated a three-dimensional particle-laden mixing layer under one-way coupling. However, the computing resources required to simulate such a two-phase flow at high Reynolds number with two-way momentum coupling exceed the capacity of a current single processor. In this paper, the computation of the particles and the two-way momentum coupling terms is partitioned in the spanwise direction because particles are distributed most evenly in this direction. The computation of the three-dimensional flow field is first partitioned among three groups of processors because the computations along the three spatial dimensions are largely independent. In each group, the domain is then partitioned using two different schemes based on the properties of the fast Fourier transform. The first, the master-slave scheme, is employed in Algorithm MS for its simplicity and its overlapping of communication and computation. The second, the transpose approach, is used in Algorithm TP in order to partition all of the flow field computation. An analysis shows that, compared to Algorithm MS, Algorithm TP also reduces the amount of communication by nearly half. Experiments show that Algorithm MS obtains a speedup of 4.3 using 9 HP workstations and a speedup of 6.4 using 15 nodes of an IBM SP-2 for a problem size on the order of 64^3, and that Algorithm TP achieves speedups 44% higher than Algorithm MS. © 2001 Elsevier Science.
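The speedups quoted in this abstract can be turned into parallel efficiencies with the usual definition, efficiency = speedup / number of processors. The short sketch below does that arithmetic for the two Algorithm MS figures; the function name `parallel_efficiency` is a hypothetical helper, not something taken from the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup divided by processor count (standard definition)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Speedups for Algorithm MS as quoted in the abstract.
ms_hp = parallel_efficiency(4.3, 9)    # 9 HP workstations
ms_sp2 = parallel_efficiency(6.4, 15)  # 15 nodes of an IBM SP-2

print(f"Algorithm MS, HP workstations: {ms_hp:.2f}")
print(f"Algorithm MS, IBM SP-2:        {ms_sp2:.2f}")
```

The abstract does not say which configuration the 44% improvement of Algorithm TP refers to, so no TP efficiency is derived here.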
Rapid advancements in adaptive sonar beamforming algorithms have greatly increased the computation and communication demands on beamforming arrays, particularly for applications that require in-array autonomous operation. By coupling each transducer node in a distributed array with a microprocessor, and networking them together, embedded parallel processing for adaptive beamformers can significantly reduce execution time, power consumption and cost, and increase scalability and dependability. In this paper, the basic narrowband Minimum Variance Distortionless Response (MVDR) beamformer is enhanced by incorporating broadband processing, a technique to enhance the robustness of the algorithm, and speedup of the matrix inversion task using sequential regression. Using this Robust Broadband MVDR (RB-MVDR) algorithm as a sequential baseline, two novel parallel algorithms are developed and analyzed. Performance results are included, among them execution time, scaled speedup, parallel efficiency, result latency and memory utilization. The testbed used is a distributed system comprised of a cluster of personal computers connected by a conventional network.
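For reference, the basic narrowband MVDR beamformer mentioned in this abstract is usually stated as the following constrained minimization and its closed-form solution. The symbols are assumptions chosen for illustration (R for the sensor cross-spectral covariance matrix, d for the steering vector toward the look direction, w for the weight vector); the abstract itself does not give notation, and the robustness and sequential-regression enhancements it describes are not shown.

```latex
\min_{\mathbf{w}} \; \mathbf{w}^{H}\mathbf{R}\,\mathbf{w}
\quad \text{subject to} \quad \mathbf{w}^{H}\mathbf{d} = 1,
\qquad
\mathbf{w}_{\mathrm{MVDR}}
  = \frac{\mathbf{R}^{-1}\mathbf{d}}{\mathbf{d}^{H}\mathbf{R}^{-1}\mathbf{d}}
```

The matrix inverse in this expression is the step the abstract reports accelerating.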
With the increase in the design complexity of VLSI systems, there is an ever-increasing need for efficient design automation tools. Parallel processing could open the way for substantially faster and more cost-effective VLSI design tools. In this paper, we review some of the basic parallel algorithms that have recently been developed to handle problems arising in VLSI routing. We also include some results that have not appeared in the literature before. These results indicate that existing parallel algorithmic techniques can efficiently handle many VLSI routing problems. Our emphasis will be on outlining some of the basic parallel strategies, with appropriate pointers to the literature for more details.
Given a graph G = (V, E), the well-known spanning forest problem of G can be viewed as the problem of finding a maximal subset F of edges in G such that the subgraph induced by F is acyclic. Although this problem has well-known efficient NC algorithms, its vertex counterpart, the problem of finding a maximal subset U of vertices in G such that the subgraph induced by U is acyclic, has not been shown to be in NC (or even in RNC) and is not believed to be parallelizable in general. In this paper we present NC algorithms for solving the latter problem for two special cases. First, we show that, for a planar graph with n vertices, the problem can be solved in O(log^3 n) time with O(n) processors on an EREW PRAM. Second, we show that the problem is solvable in NC if the input graph G has only vertex-induced paths of length polylogarithmic in the number of vertices of G. As a consequence of this result, we show that certain natural extensions of the well-studied maximal independent set problem remain solvable in NC. Moreover, we show that, for a constant-degree graph with n vertices, the problem can be solved in O(sqrt(n) log^3 n) time with O(n^2) processors on an EREW PRAM.
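The edge version of the problem described in this abstract (a maximal acyclic edge subset F) can be sketched sequentially with a union-find structure. This is only an illustration of the problem statement, not the NC algorithms the paper refers to, and it does not address the harder vertex version; the function name `spanning_forest` and the edge-list input format are assumptions.

```python
def spanning_forest(n, edges):
    """Return a maximal acyclic subset of edges for vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            # Endpoints lie in different trees, so the edge closes no cycle.
            parent[ru] = rv
            forest.append((u, v))
    return forest


# A triangle plus a separate edge: one triangle edge must be dropped.
edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
forest = spanning_forest(5, edges)
print(forest)
```

Adding any rejected edge back would create a cycle, which is what makes the returned set maximal.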
In this paper, O(log log n) time parallel algorithms with linear work are obtained on a COMMON (or TOLERANT) CRCW PRAM for finding the connected and biconnected components of an interval graph, assuming that the intervals are given in sorted (or pad-sorted) order. The algorithms take O(log n) time (with linear work) on a tree machine. k-connectivity of an interval graph can be tested, and disconnecting sets found, in O(k log log n) and O(k log n) time on a CRCW PRAM and a tree machine, respectively. If the end-points are integers in the range 1...n^O(1), the algorithms use linear work and take O(k log* n) time on a PRIORITY write PRAM and O(log log log n) time on a TOLERANT or COMMON PRAM. In this case, the assumption that the end-points are sorted can be dropped, at the cost of randomisation.
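To make the connected-components problem concrete, here is a minimal sequential sketch under the same input assumption as the abstract, namely intervals already sorted by left end-point. It is a plain left-to-right sweep, not the O(log log n) parallel algorithm of the paper; closed intervals and the function name `interval_components` are assumptions.

```python
def interval_components(intervals):
    """Label each closed interval with its connected component.

    `intervals` must be sorted by left end-point. Two intervals are
    adjacent in the interval graph when they overlap or touch.
    """
    labels = []
    component = -1
    reach = None  # Largest right end-point seen in the current component.
    for left, right in intervals:
        if reach is None or left > reach:
            # Gap before this interval: a new component starts here.
            component += 1
            reach = right
        else:
            reach = max(reach, right)
        labels.append(component)
    return labels


intervals = [(1, 4), (2, 3), (5, 8), (7, 9), (10, 11)]
labels = interval_components(intervals)
print(labels)
```

The sweep works because, in sorted order, an interval starts a new component exactly when its left end-point lies beyond everything seen so far.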
Markov chain Monte Carlo (MCMC) implementations of Bayesian inference for latent spatial Gaussian models are very computationally intensive, and restrictions on storage and computation time are limiting their application to large problems. Here we propose various parallel MCMC algorithms for such models. The algorithms' performance is discussed with respect to a simulation study, which demonstrates the increase in speed with which the algorithms explore the posterior distribution as a function of the number of processors. We also discuss how feasible problem size is increased by use of these algorithms.
Given two finite sets of points in a plane, the polygon separation problem is to construct a separating convex k-gon with smallest k. In this paper, we present a parallel algorithm for the polygon separation problem. The algorithm runs in O(log n) time on a CREW PRAM with n processors, where n is the number of points in the two given sets. The algorithm is cost-optimal, since Ω(n log n) is a lower bound on the time needed by any sequential algorithm. We apply this algorithm to the problem of finding a convex polygon, with the minimal number of edges, for which a given convex region is its digital image. The algorithm in this paper constructs one such polygon with possibly two more edges than the minimal one.
Parallel algorithms for finding a fundamental set of cycles of a graph, for locating the bridges of a connected graph, and for strongly orienting a bridgeless connected graph are proposed in this paper. Each of these algorithms runs in […] time and requires O(n(m − n + 1)) processors on a shared-memory model of a single instruction-stream multiple data-stream computer, where m and n refer respectively to the number of arcs and the number of nodes of the underlying graph. The running time of the algorithms is reduced to […] provided O(n^3) processors are used, where d refers to the diameter of the graph.
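As an illustration of the bridge-location problem named in this abstract, the sketch below finds bridges sequentially with a depth-first search and low-link values. This is the standard sequential technique, swapped in for clarity; it is not the parallel algorithm proposed in the paper. The function name `find_bridges` and the edge-list input are assumptions, and the recursion makes it suitable only for small graphs.

```python
def find_bridges(n, edges):
    """Return the bridges of an undirected graph on vertices 0..n-1."""
    adjacency = [[] for _ in range(n)]
    for index, (u, v) in enumerate(edges):
        adjacency[u].append((v, index))
        adjacency[v].append((u, index))

    discovered = [-1] * n  # DFS discovery time, -1 if unvisited.
    low = [0] * n          # Earliest discovery time reachable from the subtree.
    bridges = []
    clock = 0

    def visit(u, parent_edge):
        nonlocal clock
        discovered[u] = low[u] = clock
        clock += 1
        for v, index in adjacency[u]:
            if index == parent_edge:
                continue  # Do not reuse the edge we arrived by.
            if discovered[v] == -1:
                visit(v, index)
                low[u] = min(low[u], low[v])
                if low[v] > discovered[u]:
                    # No back edge from v's subtree reaches u or above.
                    bridges.append(edges[index])
            else:
                low[u] = min(low[u], discovered[v])

    for start in range(n):
        if discovered[start] == -1:
            visit(start, -1)
    return bridges


# A triangle 0-1-2 with a tail 2-3-4: only the tail edges are bridges.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]
bridges = find_bridges(5, edges)
print(sorted(bridges))
```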
In this paper, parallel algorithms suitable for the iterative solution of large sets of linear equations are developed. The algorithms, based on the well-known Gauss-Seidel and SOR methods, are presented in both synchronous and asynchronous forms. Results obtained using the M.I.M.D. computer at Loughborough University are given for the model problem of solving the Laplace equation within the unit square.
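The model problem in this abstract can be sketched sequentially as follows: a five-point finite-difference discretization of the Laplace equation on the unit square, swept with SOR (which reduces to Gauss-Seidel when the relaxation factor is 1). The boundary values, grid size, tolerance, and the function name `sor_laplace` are all assumptions made for illustration, and the sketch does not reproduce the synchronous or asynchronous parallel forms the paper describes.

```python
def sor_laplace(n, omega=1.0, tol=1e-8, max_sweeps=10_000):
    """Solve Laplace's equation on an n-by-n interior grid with SOR.

    Assumed boundary condition: u = 1 on one edge (row 0), u = 0 on the
    other three. Returns the grid and the number of sweeps performed.
    """
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    for j in range(n + 2):
        u[0][j] = 1.0

    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                # Gauss-Seidel value: average of the four neighbours,
                # using values already updated in this sweep.
                average = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                  + u[i][j - 1] + u[i][j + 1])
                new_value = (1.0 - omega) * u[i][j] + omega * average
                max_change = max(max_change, abs(new_value - u[i][j]))
                u[i][j] = new_value
        if max_change < tol:
            return u, sweep
    return u, max_sweeps


grid_gs, sweeps_gs = sor_laplace(9, omega=1.0)    # Gauss-Seidel
grid_sor, sweeps_sor = sor_laplace(9, omega=1.5)  # Over-relaxed
print(sweeps_gs, sweeps_sor, grid_gs[5][5])
```

With one edge held at 1 and the others at 0, symmetry puts the value at the centre of the square at 0.25, which gives a simple check on both runs.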