This paper presents a parallel algorithm for histogram image template matching using an SIMD array processor with a hypercube interconnection network. For an $N \times N$ image and an $M \times M$ template window, the time complexity is shown to be reduced to $O(M + \log^2 M + \log N)$ on a multiprocessor system with $N^2$ processing elements (PEs), compared with $O(N^2 M^2)$ for the sequential algorithm. Each PE requires only a small local memory. The algorithm is shown to be cost optimal, with a total cost of computation of $O(N^2 M + N^2 \log^2 M + N^2 \log N)$.
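The total cost quoted in this abstract is consistent with the usual definition of cost as processor count times parallel running time. As a short worked step (the symbols $p$ and $T_p$ are introduced here for illustration and do not appear in the abstract):

```latex
\begin{align*}
  p   &= N^2, \qquad T_p = O\!\left(M + \log^2 M + \log N\right), \\
  \text{cost} &= p \cdot T_p
       = N^2 \cdot O\!\left(M + \log^2 M + \log N\right) \\
      &= O\!\left(N^2 M + N^2 \log^2 M + N^2 \log N\right).
\end{align*}
```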
A heuristic algorithm for the DNA sequence assembly problem is presented. Its sequential implementation is described, as well as its parallelization method. A computational experiment shows how the speed of the parallel algorithm depends on the number of processes. Tests on real data from experiments with the SARS coronavirus are also discussed, and the outcome of our algorithm appears to be biologically correct.
In this paper, a new approach to designing parallel algorithms for modelling multi-scale non-stationary processes is proposed. Our technique is based on explicit multi-level difference schemes with the loc...
ISBN: 0780386477 (print)
Floorplanning is a critical phase in the physical design of VLSI circuits and has been acknowledged as a computation-intensive process. As a result, several research efforts have been undertaken to parallelize the algorithm. While previous work has focused on slicing floorplans, we present a parallel algorithm for non-slicing floorplans using the corner block list (CBL) topological representation. A parallel interconnection cost calculation algorithm with a load balancing strategy is introduced to speed up the particularly time-consuming wire length calculation in floorplanning. A multiple Markov chains strategy is also embedded in our algorithm. The experimental results obtained from tests on MCNC benchmarks indicate considerable speedup and preserved floorplanning quality.
ISBN: 9780780387867 (print)
Group behaviors, e.g. bird flocking, are widely used in virtual reality, computer games, robotics and artificial life. While many methods to simulate group behaviors have been proposed, these methods are usually applied to sequential computing. Since the computational load of these methods grows exponentially with the number of group members, it is difficult to simulate a large group in real time using them. In this paper, we propose a parallel algorithm to simulate the flocking behavior of a large group. The new partitioning and communication mechanisms in the parallel algorithm make the flocking simulation more efficient. Experimental results show that the proposed parallel algorithm provides good speedup in generating flocking behaviors compared with the sequential simulation.
ISBN: 0780384032 (print)
The longest common subsequence problem has been applied to network intrusion detection systems, bioinformatics, e-commerce and other fields. This paper proposes an extended longest common subsequence problem called the (K,1)-LCS problem, designs a parallel algorithm to solve the (K,1)-LCS problem on an SMP machine using the divide-and-conquer strategy and a tournament tree, and then presents a parallel algorithm for solving the (K,1)-LCS problem on SMP clusters by applying the k-selection technique based on a mesh-connected network. The theoretical analysis and experiments on the Dawning-2000 parallel computer show that the parallel algorithm obtains a linear speedup and has very good scalability.
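For readers unfamiliar with the base problem, the following is a minimal sequential sketch of the classic longest common subsequence computation by dynamic programming. It is not the (K,1)-LCS extension or the parallel algorithm described in the abstract, whose details are not given here; the function name `lcs_length` is chosen for illustration.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (classic DP).

    Illustrative baseline only: this is the standard LCS problem,
    not the (K,1)-LCS variant proposed in the paper.
    """
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                # Matching characters extend the diagonal entry.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise keep the better of dropping one character.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
```

The table is filled row by row using only the previous row, so memory use is proportional to the length of the second string.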
Summary form only given. The tremendous amount of data generated by large-scale, parallel scientific and engineering simulations makes the archiving and analysis of this data difficult. To address this problem, we developed in previous work an efficient archival scheme based on the functional representation of simulation data; this approximation scheme can reduce storage requirements significantly. However, common visualization tools such as the marching cubes algorithm for isosurface generation cannot be directly applied to this representation. Thus, we propose a new efficient isosurface visualization approach that takes full advantage of the functional approximation of simulation data. This method is fundamentally different from the marching cubes approach in that the visualization of isosurfaces is achieved through the solution of sets of ordinary differential equations. We present computational results detailing the effectiveness of this new approach for a simulation modeling the fluid dynamics of a turbulent reacting flow. The results demonstrate that the method is efficient in a parallel environment and represents a promising approach for the visualization of isosurfaces in simulation data from large-scale scientific applications.
Summary form only given. QR methods for solving Toeplitz tridiagonal systems are well developed, with applications in numerous interdisciplinary fields. There is a strong motivation to develop faster, more efficient and, more importantly, scalable algorithms to factor such systems due to their significance in many scientific applications. We present two parallel QR factorization algorithms for solving Toeplitz tridiagonal systems. QR factorization is accomplished using Householder reflections and Givens rotations. These parallel algorithms exhibit high scalability and near-linear to superlinear speedup on large system sizes when implemented on a distributed system.
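As background for this abstract, here is a minimal sequential sketch of solving a Toeplitz tridiagonal system by QR factorization with Givens rotations. It is not the authors' parallel algorithm, and it omits the Householder variant; the function name and the parameter names `a` (subdiagonal), `b` (diagonal) and `c` (superdiagonal) are assumptions made for illustration. The sketch assumes the matrix is nonsingular.

```python
import math


def solve_toeplitz_tridiagonal(a, b, c, rhs):
    """Solve T x = rhs where T has b on the diagonal, a below it, c above it.

    Sequential illustration using Givens rotations; assumes T is nonsingular.
    """
    n = len(rhs)
    # R is upper triangular with at most two superdiagonals, stored as bands.
    diag = [float(b)] * n
    up1 = [float(c)] * (n - 1) + [0.0]
    up2 = [0.0] * n
    y = [float(v) for v in rhs]

    for k in range(n - 1):
        # Rotation that zeroes the subdiagonal entry a in row k + 1.
        r = math.hypot(diag[k], a)
        cs, sn = diag[k] / r, a / r
        d_next, u_next = diag[k + 1], up1[k + 1]
        # Apply [[cs, sn], [-sn, cs]] to rows k and k + 1.
        diag[k] = r
        diag[k + 1] = -sn * up1[k] + cs * d_next
        up1[k] = cs * up1[k] + sn * d_next
        up2[k] = sn * u_next
        up1[k + 1] = cs * u_next
        y[k], y[k + 1] = cs * y[k] + sn * y[k + 1], -sn * y[k] + cs * y[k + 1]

    # Back substitution on the banded upper triangular factor.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = y[i]
        if i + 1 < n:
            s -= up1[i] * x[i + 1]
        if i + 2 < n:
            s -= up2[i] * x[i + 2]
        x[i] = s / diag[i]
    return x


if __name__ == "__main__":
    # Discrete Laplacian-style system; the right-hand side is chosen
    # so that the exact solution is [1, 2, 3, 4].
    print(solve_toeplitz_tridiagonal(-1.0, 2.0, -1.0, [0.0, 0.0, 0.0, 5.0]))
```

Each rotation touches only two adjacent rows, so the factorization costs O(n) operations for an n-by-n system.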
For scattering and antenna design problems, the electromagnetic field must be computed around and inside complex 3D targets composed of inhomogeneous media. To this end, "exact" numerical methods can be employed to solve Maxwell's equations in the frequency domain. Here, we consider the hybrid boundary integral equation and finite element technique. For some of our applications, this numerical approach tends to be insufficient in terms of the number of degrees of freedom. In order to solve these very large problems, we have combined an original numerical method based on domain decomposition with very efficient parallel algorithms.
ISBN: 9780780384309 (print)
Heterogeneous clusters call for new models and algorithms. In this paper, a new parallel computational model is presented. The model, based on the LogGP model, has been extended to deal with heterogeneous parallel systems. For that purpose, LogGP's scalar parameters have been replaced by vector and matrix parameters to take into account the features of the different nodes. The work presented here includes the parameterization of a real cluster, which illustrates the impact of node heterogeneity on the model's parameters. Finally, the paper presents some experiments performed on a real heterogeneous cluster that can be used for assessing the method's validity, together with the main conclusions and future work.
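To make the idea of vector and matrix parameters concrete, here is a minimal sketch. In standard LogGP, a single k-byte message takes o + (k - 1)G + L + o from start of send to end of receive. The heterogeneous version below is an assumption for illustration only: the abstract does not say which parameters become vectors and which become matrices, so the per-node overheads `o_send`, `o_recv` and the per-pair `L` and `G` are hypothetical choices, as are all numeric values.

```python
def message_time(src, dst, k, o_send, o_recv, L, G):
    """End-to-end time for one k-byte message from node src to node dst.

    Hypothetical heterogeneous reading of the LogGP formula
    o + (k - 1) * G + L + o, with per-node overheads (vectors)
    and per-pair latency and gap-per-byte (matrices).
    """
    return o_send[src] + (k - 1) * G[src][dst] + L[src][dst] + o_recv[dst]


if __name__ == "__main__":
    # Made-up parameters for a two-node cluster (arbitrary time units).
    o_send = [2.0, 5.0]
    o_recv = [3.0, 6.0]
    L = [[0.0, 10.0], [12.0, 0.0]]
    G = [[0.0, 0.5], [0.25, 0.0]]
    print(message_time(0, 1, 101, o_send, o_recv, L, G))
    print(message_time(1, 0, 101, o_send, o_recv, L, G))
```

With node-dependent parameters the cost of a message is no longer symmetric, which is the kind of effect a scalar-parameter model cannot express.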