Group behaviors, e.g. birds flocking, are widely used in virtual reality, computer games, robotics and artificial life. While many methods to simulate group behaviors have been proposed, these methods are usually appl...
详细信息
ISBN:
(纸本)9780780387867
Group behaviors, e.g. birds flocking, are widely used in virtual reality, computer games, robotics and artificial life. While many methods to simulate group behaviors have been proposed, these methods are usually applied to sequential computing. Since the computational load of these methods is exponential to the number of group members, it is difficult to simulate a large group in real-time using these methods. In this paper, we propose a parallel algorithm to simulate the flocking behavior of a large group. The new partitioning and communication mechanisms in the parallel algorithm make the flocking simulation more efficient. Experimental results show that the proposed parallel algorithm provides good speedup in generating flocking behaviors compared with the sequential simulation.
The longest common subsequence problem has been applied to network instruction detection system, bioinformatics and e-commerce, etc. This paper proposes an extended longest common subsequence problem called (K,1)-LCS ...
详细信息
ISBN:
(纸本)0780384032
The longest common subsequence problem has been applied to network instruction detection system, bioinformatics and e-commerce, etc. This paper proposes an extended longest common subsequence problem called (K,1)-LCS problem, designs a parallel algorithm to solve (K,1)-LCS problem on SMP machine by the divide and conquer strategy and the tournament tree, and then presents a parallel algorithm for solving (K,1)-LCS problem on SMP clusters by applying the k-selection technique based on mesh-connected network. The theoretical analysis and experiments on Dawning-2000 parallel computer show that the parallel algorithm obtains a linear speedup and has a very good scalability.
Summary form only given. The tremendous amount of data generated by large-scale, parallel scientific and engineering simulations make the archive and analysis of this data difficult. To address this problem we, in pre...
详细信息
Summary form only given. The tremendous amount of data generated by large-scale, parallel scientific and engineering simulations make the archive and analysis of this data difficult. To address this problem we, in previous work, developed an efficient archival scheme based on the functional representation of simulation data-this approximation scheme can reduce storage requirement significantly. However, common visualization tools such as the marching cubes algorithm for isosurface generation cannot be directly applied in this representation. Thus, we propose a new efficient isosurface visualization approach that takes full advantage of the functional approximation of simulation data. This method is fundamentally different from the marching cubes approach in that the visualization of isosurfaces is achieved through the solution of sets of ordinary differential equations. We present computational results detailing the effectiveness of this new approach for a simulation modeling the fluid dynamics of a turbulent reacting flow. The results demonstrate that the method is efficient in a parallel environment and represents a promising approach for the visualization of isosurfaces in simulation data from large-scale scientific applications.
Summary form only given. QR methods for solving Toeplitz tridiagonal systems are well developed with applications in numerous interdisciplinary fields. There is a strong motivation to develop faster, more efficient an...
详细信息
Summary form only given. QR methods for solving Toeplitz tridiagonal systems are well developed with applications in numerous interdisciplinary fields. There is a strong motivation to develop faster, more efficient and, more importantly, scalable algorithms to factor such systems due to their significance in many scientific applications. We present two parallel QR factorization algorithms used to solve Toeplitz tridiagonal systems. QR factorization is accomplished using Householder reflections and Givens rotations. These parallel algorithms exhibit high scalability and near linear to superlinear speedup on large system sizes when implemented on a distributed system.
For scattering and antenna design problems the electromagnetic field must be computed around and inside 3D complex targets, constituted with inhomogeneous media. To this end, "exact" numerical methods can be...
详细信息
For scattering and antenna design problems the electromagnetic field must be computed around and inside 3D complex targets, constituted with inhomogeneous media. To this end, "exact" numerical methods can be employed to solve Maxwell's equations in the frequency domain. Here, we consider the hybrid boundary integral equation and the finite element technique. For some of our applications, this numerical approach tends to be insufficient in terms of number of degrees of freedom. In order to solve these very large problems, we have combined an original numerical method based on a domain decomposition method and very efficient parallel algorithms.
Heterogeneous clusters claim for new models and algorithms. In this paper a new parallel computational model is presented. The model, based on the LogGP model, has been extended to be able to deal with heterogeneous p...
详细信息
ISBN:
(纸本)9780780384309
Heterogeneous clusters claim for new models and algorithms. In this paper a new parallel computational model is presented. The model, based on the LogGP model, has been extended to be able to deal with heterogeneous parallel systems. For that purpose, the LogGP's scalar parameters have been replaced by vector and matrix parameters to take into account the different node's features. The work presented here includes the parameterization of a real cluster which illustrates the impact of node heterogeneity over the model's parameters. Finally, the paper presents some experiments performed in a real heterogeneous cluster that can be used for assessing the method's validity, together with the main conclusions and future work.
A parallel algorithm for prefix computation was reported on a recently proposed interconnection network called optical multi-trees (OMULT). Using 2n/sup 3/-n/sup 2/ processors, the algorithm was shown to run in O(log ...
详细信息
A parallel algorithm for prefix computation was reported on a recently proposed interconnection network called optical multi-trees (OMULT). Using 2n/sup 3/-n/sup 2/ processors, the algorithm was shown to run in O(log n)/sup A/ electronic moves +5 optical moves for n/sup 2/ data points. In this paper we present a new and improved parallel algorithm for prefix computation on the same network. Although the algorithm requires O(log n) electronic moves +4 optical moves using the same number of processors, the number of data points involved in our algorithm is n/sup 3/ in contrast to n/sup 2/.
We study the problem of frequency offset due to the Doppler velocity of orthogonal frequency division multiplexing (OFDM) systems. A regular OFDM has an IFFT at the transmitter and an FFT at the receiver. The proposed...
详细信息
We study the problem of frequency offset due to the Doppler velocity of orthogonal frequency division multiplexing (OFDM) systems. A regular OFDM has an IFFT at the transmitter and an FFT at the receiver. The proposed parallel algorithm employs an FFT at the transmitter and an IFFT at the receiver. The regular OFDM samples, which are generated by the IFFT, are transmitted as the first block, and the FFT outputs are transmitted as the second block. At the receiver, the results of the first output generated by the FFT are combined with the second output generated by the IFFT. This combined operation forms a parallel intercarrier interference (ICI) cancellation scheme for mitigating frequency offsets of OFDM systems. It provides: (1) a robust OFDM system in the presence of relative velocity, also known as Doppler frequency offset; (2) high signal to ICI ratio (7 dB better than that of the linear self-cancellation algorithms at N=1024 and /spl Delta/fT/spl les/1% of subcarrier frequency spacing); (3) backward compatibility with the existing OFDM system; (4) better BER performance in both additive white Gaussian noise (AWGN) and fading channels.
Summary form only given. Smith normal form computation is important in many topics, e.g. group theory and number theory. For matrices over the rings Z and F/sub 2/[x], we introduce a new Smith normal form algorithm, c...
详细信息
Summary form only given. Smith normal form computation is important in many topics, e.g. group theory and number theory. For matrices over the rings Z and F/sub 2/[x], we introduce a new Smith normal form algorithm, called triangular band matrix algorithm, which first computes the Hermite normal form and then step by step the diagonal form and the Smith normal form. In comparison to the Kannan Bachem algorithm, which computes the Smith normal form by alternately computing the Hermite normal form and the left Hermite normal form, the theoretical advantage is, that we only once apply the expensive Hermite normal form step. We parallelize the triangular band matrix algorithm and get a better complexity analysis than for previous parallel algorithms, like the Kannan Bachem algorithm and the Hartley Hawkes algorithm. In the part, which is different to the Kannan Bachem algorithm, the triangular band matrix algorithm leads to a better efficiency and smaller execution times, even for large example matrices.
The aim of the paper is to introduce techniques in order to optimize the parallel execution time of sorting on heterogeneous platforms (processors speeds are related by a constant factor). We develop a constant time t...
详细信息
The aim of the paper is to introduce techniques in order to optimize the parallel execution time of sorting on heterogeneous platforms (processors speeds are related by a constant factor). We develop a constant time technique for mastering processor load balancing and execution time in an heterogeneous environment. We develop an analytical model for the parallel execution time, sustained by preliminary experimental results in the case of a 2-processors systems. The computation of the solution is independent of the problem size. Consequently, there is no overhead regarding the sorting problem.
暂无评论