We describe a parallel algorithm using the BSP/CGM model (Bulk Synchronous Parallel/Coarse Grained Multicomputer) to obtain the Euler tours in graphs. It is based on the PRAM (parallel random access machine) algorithm by Caceres et al. For an input graph of n vertices and m edges, the algorithm requires O((m+n)/p) local computation time, O((m+n)/p) memory, and O(log p) communication rounds, where p is the number of processors. To our knowledge there are no other parallel algorithms under the coarse-grained models for the Euler tours in graphs. The proposed algorithm is implemented using MPI (message passing interface) and the C language. The parallel program runs on a Beowulf cluster with 66 nodes. The implementation results confirm the theoretical complexity results of the algorithm.
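For readers unfamiliar with the underlying problem, the sketch below computes an Euler circuit sequentially with Hierholzer's algorithm. It is not the BSP/CGM algorithm of the abstract, which is not reproduced here; it only illustrates what the output looks like. The function name `euler_circuit` and the edge-list input format are assumptions made for this illustration.

```python
from collections import defaultdict


def euler_circuit(edges):
    """Return an Euler circuit as a list of vertices, or None if none exists.

    `edges` is a list of (u, v) pairs describing an undirected multigraph.
    Sequential Hierholzer sketch; hypothetical helper, not the paper's method.
    """
    if not edges:
        return []
    adj = defaultdict(list)  # vertex -> list of (neighbour, edge id)
    for eid, (u, v) in enumerate(edges):
        adj[u].append((v, eid))
        adj[v].append((u, eid))
    # An Euler circuit requires every vertex to have even degree.
    if any(len(nbrs) % 2 for nbrs in adj.values()):
        return None
    used = [False] * len(edges)
    stack = [edges[0][0]]
    circuit = []
    while stack:
        v = stack[-1]
        # Discard edges that were already traversed from their other endpoint.
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()
        if adj[v]:
            w, eid = adj[v].pop()
            used[eid] = True
            stack.append(w)
        else:
            # No unused edge left at v: v is final in the circuit.
            circuit.append(stack.pop())
    # If some edge was never reached, the graph is not connected.
    if len(circuit) != len(edges) + 1:
        return None
    return circuit
```

Each edge is visited a constant number of times, so this sequential version runs in O(m+n) time, which is the total work that the O((m+n)/p) local computation bound divides among p processors.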
Applications of stable matching in switch scheduling have been proposed. However, the classical GS (Gale and Shapley) stable matching algorithm is infeasible for high-speed implementation due to its high complexity. Instead, acyclic stable matching algorithms have been shown to be useful in implementing scheduling for high-speed switches/routers. We model the acyclic stable matching problem as the dominating set problem for a rooted dependency graph, and propose a parallel algorithm for finding the dominating set in O(n log n) time. We design and implement a scheduler based on the proposed algorithm in hardware. Simulation results show that the number of 2-input NAND gates and the timing of our design are proportional to n^2 and n respectively, making it feasible to implement the scheduler at high speed with current CMOS technologies.
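As background, the classical Gale and Shapley algorithm that the abstract treats as the baseline can be sketched as follows. This is the textbook proposer-driven version, not the acyclic or dominating-set algorithm proposed in the paper; the function name `gale_shapley` and the list-of-preference-lists input format are assumptions for illustration.

```python
def gale_shapley(proposer_prefs, acceptor_prefs):
    """Return a stable matching as a dict {proposer: acceptor}.

    proposer_prefs[p] lists acceptors in p's order of preference, and
    acceptor_prefs[a] lists proposers in a's order of preference.
    Both sides are assumed to have n members with complete lists.
    """
    n = len(proposer_prefs)
    # rank[a][p] = position of proposer p in acceptor a's list (lower is better).
    rank = [{p: i for i, p in enumerate(prefs)} for prefs in acceptor_prefs]
    next_choice = [0] * n             # index of the next acceptor each proposer tries
    partner_of_acceptor = [None] * n  # current tentative partner of each acceptor
    free = list(range(n))             # proposers that are currently unmatched
    while free:
        p = free.pop()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = partner_of_acceptor[a]
        if current is None:
            partner_of_acceptor[a] = p
        elif rank[a][p] < rank[a][current]:
            # The acceptor prefers the new proposer; the old partner becomes free.
            partner_of_acceptor[a] = p
            free.append(current)
        else:
            # Proposal rejected; p will try its next choice later.
            free.append(p)
    return {p: a for a, p in enumerate(partner_of_acceptor)}
```

In the worst case the loop issues on the order of n^2 proposals one after another, which gives a sense of why a sequential implementation is hard to fit into a single scheduling slot of a high-speed switch.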
This paper describes a new algorithm for packet classification using the concept of independent sets. The algorithm has very small memory requirements. The search speed is neither sensitive to the rule table nor to the percentage of wildcards in the fields. It also scales well from two dimensional classifiers to high dimensional ones. In particular, the algorithm is inherently parallel. Hardware tailored to this algorithm can achieve very fast search speed.
Given two strings X and Y of lengths m and n, respectively, the all-substrings longest common subsequence (ALCS) problem obtains the lengths of the subsequences common to X and any substring of Y. The sequential algorithm takes O(mn) time and O(n) space. We present a parallel algorithm for ALCS on a coarse-grained multicomputer (BSP/CGM) model with p < √m processors that takes O(mn/p) time and O(n√m) space per processor, with O(log p) communication rounds. The proposed parallel algorithm also solves the well-known LCS problem. To our knowledge this is the best BSP/CGM algorithm for the ALCS problem in the literature.
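To make the problem statement concrete, the sketch below computes the ordinary LCS length with the standard row-by-row dynamic program, using O(mn) time and O(n) space, and then defines ALCS by brute force over every substring of Y. The brute-force wrapper is only a specification of the expected output; it is far slower than the sequential or parallel ALCS algorithms the abstract refers to. Both function names are assumptions for illustration.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of x and y.

    Standard dynamic program keeping only one previous row:
    O(len(x) * len(y)) time and O(len(y)) space.
    """
    prev = [0] * (len(y) + 1)
    for xc in x:
        cur = [0]
        for j, yc in enumerate(y, start=1):
            if xc == yc:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def alcs_bruteforce(x, y):
    """Map (i, j) to the LCS length of x and the substring y[i:j].

    Brute-force illustration of what ALCS outputs, not an efficient
    algorithm: it reruns the LCS program for every substring of y.
    """
    n = len(y)
    return {
        (i, j): lcs_length(x, y[i:j])
        for i in range(n)
        for j in range(i + 1, n + 1)
    }
```

The entry for (0, n) is the plain LCS length of X and Y, which is why an ALCS algorithm also solves the LCS problem.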
Many approaches to slicing rely upon the 'fact' that the union of two static slices is a valid slice. It is known that static slices constructed using program dependence graph algorithms are valid slices (Reps and Yang, 1988). However, this is not true for other forms of slicing. For example, it has been established that the union of two dynamic slices is not necessarily a valid dynamic slice (Hall, 1995). In this paper this result is extended to show that the union of two static slices is not necessarily a valid slice, based on Weiser's definition of a (static) slice. We also analyse the properties that make the union of different forms of slices a valid slice.
In this paper, an efficient FFT-based algorithm is proposed for fast computation of the discrete wavelet transform (DWT). By virtue of the Fourier-space operations, a significant saving in computational complexity is achieved. The Fourier-domain subsampling results in a single IFFT operation at the intermediate level, thus reducing the computational burden. The Fourier-domain subsampling and the Hermitian symmetry property of the Fourier transform of a real function provide a significant reduction in overall computations. The computational complexity of an FFT-based fast wavelet transform (FWT) algorithm is compared with that of the proposed algorithm for various decomposition levels and wavelet kernel sizes. We found that the proposed algorithm reduces the number of multiplications per point by 22% and the number of additions by 35% for the 'db8' wavelet at a decomposition level of five. This gain is of practical interest in computationally intensive applications.
ISBN: (print) 0769520383
Some problems of mining association rules with linguistic terms are discussed. First, an incremental updating algorithm for association rules with linguistic terms is presented. The collection of frequent linguistic attribute sets and its negative border, along with their support counts, are maintained, so that the entire database is scanned at most once while the association rules are updated. The experiment shows that the updating algorithm not only updates association rules effectively but also avoids the repeated cost. Secondly, a parallel algorithm for mining association rules with linguistic terms is presented. The Boolean parallel mining algorithm is improved to discover frequent linguistic attribute sets, and the association rules with at least the minimum confidence are generated on all processors. This parallel mining algorithm has fine scale-up, size-up and speed-up.
Digital modulation based on FSK is widely used in HF data communication. This is due to its simplicity of implementation with noncoherent detection and its robustness to noise and phase synchronization error. A hardware-based design using an FPGA can reduce system size. The proposed design integrates both the transmitter and receiver modules into a single FPGA. A further reduction in components is achieved by adopting a multiplierless and parallel algorithm at the receiver module. This is shown by comparison with the conventional noncoherent detection algorithm.
Linear and nonlinear convection-diffusion problems are considered. The numerical solution of these problems via the Schwarz alternating method is studied. A new class of parallel asynchronous iterative methods with flexible communication is applied. The implementation of parallel asynchronous and synchronous algorithms on distributed memory multiprocessors is described. Experimental results obtained on an IBM SP2 by using PVM are presented and analyzed. The interest of asynchronous iterative methods with flexible communication is clearly shown.
The present work is part of the European Commission project ADUMS (IST-2001-34088), aimed at developing a fully digital 4D ultrasound system for medical imaging applications. The system is based on adaptive beamforming, which provides better resolution than conventional beamforming. The parallel algorithm adopted allows the planar array beamforming to be decomposed into two linear steps and implemented on a parallel computing architecture. The present paper describes the testing results of our adaptive beamforming structure on both simulated and real data. Comparisons with conventional beamforming under the same operating conditions are also shown.