ISBN: 9783642193279 (print)
In this paper, we propose a process-grid-free algorithm for a massively parallel dense symmetric eigensolver with a communication-splitting multicasting algorithm. The algorithm trades speed against the memory space needed to keep the Householder vectors. In a performance evaluation on the T2K Open Supercomputer (U. Tokyo) and the RX200S5, we obtain speed-downs of 0.86x and 0.95x with 1/2 the memory space compared to the conventional algorithm for a square process grid. We also show a new algorithm for small matrices in massively parallel processing that takes an appropriately small value of p for the process grid p × q. In this case, the execution time of the inverse transformation is negligible.
ISBN: 9783642193279 (print)
We investigate schemes to accelerate the decay of aircraft trailing vortices. These structures are susceptible to several instabilities that lead to their eventual destruction. We employ an Evolution Strategy to design a lift distribution and a lift perturbation scheme that minimize the wake hazard, as proposed in [6]. The performance of a scheme is measured as the reduction of the mean rolling moment that would be induced on a following aircraft; it is computed by means of a Direct Numerical Simulation using a parallel vortex particle code. We find a configuration and a perturbation scheme characterized by an intermediate wavelength λ ≈ 4.64, which is necessary to trigger medium-wavelength instabilities between tail and flap vortices and subsequently amplify long-wavelength modes.
ISBN: 9780769544151 (print)
In this paper, a coupling strategy of the Parareal algorithm with the Waveform Relaxation method is presented for the parallel solution of differential-algebraic equations. The classical Waveform Relaxation (in space) method and the Parareal (in time) method are first recalled, followed by a coupled Parareal-Waveform Relaxation method recently introduced for the solution of partial differential equations. Here, this coupled method is extended to the solution of differential-algebraic equations. Numerical experiments, performed on parallel multicore architectures, illustrate the performance of this new method.
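To make the time-parallel half of this coupling concrete, here is a minimal sketch of the plain Parareal iteration on a scalar test equation y' = lam * y. It is an illustration under stated assumptions, not the method of the paper: the test equation, the explicit Euler propagators, and all function and parameter names (`euler`, `parareal`, `fine_serial`, `fine_steps`) are hypothetical, and neither Waveform Relaxation nor differential-algebraic equations are covered.

```python
def euler(y, t0, t1, steps, lam):
    """Explicit Euler for y' = lam * y from t0 to t1 using `steps` substeps."""
    h = (t1 - t0) / steps
    for _ in range(steps):
        y = y + h * lam * y
    return y


def parareal(y0, t_end, slices, iterations, lam=-1.0, fine_steps=100):
    """Return the Parareal approximations at the slice boundaries."""
    dt = t_end / slices

    def coarse(y, n):
        # Cheap propagator: a single Euler step over slice n.
        return euler(y, n * dt, (n + 1) * dt, 1, lam)

    def fine(y, n):
        # Accurate propagator: many Euler substeps over slice n.
        return euler(y, n * dt, (n + 1) * dt, fine_steps, lam)

    # Initial guess: one sequential sweep with the coarse propagator.
    u = [y0]
    for n in range(slices):
        u.append(coarse(u[n], n))

    for _ in range(iterations):
        # The fine solves are independent of each other, so a parallel
        # implementation would give one slice to each worker.
        f_old = [fine(u[n], n) for n in range(slices)]
        g_old = [coarse(u[n], n) for n in range(slices)]
        # Sequential correction sweep with the coarse propagator.
        new = [y0]
        for n in range(slices):
            new.append(coarse(new[n], n) + f_old[n] - g_old[n])
        u = new
    return u


def fine_serial(y0, t_end, slices, lam=-1.0, fine_steps=100):
    """Reference: apply the fine propagator sequentially over all slices."""
    dt = t_end / slices
    u = [y0]
    for n in range(slices):
        u.append(euler(u[n], n * dt, (n + 1) * dt, fine_steps, lam))
    return u
```

With as many iterations as time slices, the iteration reproduces the sequential fine solution up to rounding, so the method only pays off when far fewer iterations are enough.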
ISBN: 9783642193279 (print)
We study the numerical behavior of heterogeneous systems such as a CPU with GPU or IBM Cell processors for some orthogonalization processes. We focus on the influence of the different floating-point arithmetic handling of these accelerators on Gram-Schmidt orthogonalization using single and double precision. For dense matrices, we observe a loss of at worst 1 digit for CUDA-enabled GPUs together with a speed-up of 20x, and a loss of 2 digits for the Cell processor with a 7x speed-up. For sparse matrices, the results of CPU and GPU are very close and the speed-up is 10x. We conclude that the Cell processor is a good accelerator for double precision because of its full IEEE compliance, but is not sufficient for single-precision applications. The GPU speed-up is better than that of the Cell, and its decent IEEE support delivers results close to the CPU ones for both precisions.
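The effect of working precision on Gram-Schmidt orthogonalization can be sketched on a CPU alone. The example below is a hypothetical illustration, not the experimental setup of the paper: it assumes the classical Gram-Schmidt variant, emulates IEEE single precision by rounding after every arithmetic operation with `struct`, and uses part of a Hilbert matrix as an arbitrary ill-conditioned test case. All names (`to_f32`, `gram_schmidt`, `orthogonality_loss`) are made up for this sketch.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def identity(x):
    """Leave the value in double precision."""
    return x


def dot(u, v, rnd):
    """Dot product with `rnd` applied after every multiply and every add."""
    s = 0.0
    for a, b in zip(u, v):
        s = rnd(s + rnd(a * b))
    return s


def gram_schmidt(columns, rnd):
    """Classical Gram-Schmidt; `rnd` models the working precision."""
    q = []
    for col in columns:
        a = [rnd(x) for x in col]
        v = list(a)
        for qi in q:
            r = dot(qi, a, rnd)  # classical variant: project the original column
            v = [rnd(x - rnd(r * y)) for x, y in zip(v, qi)]
        norm = rnd(math.sqrt(dot(v, v, rnd)))
        q.append([rnd(x / norm) for x in v])
    return q


def orthogonality_loss(q):
    """Largest |q_i . q_j - delta_ij|, evaluated in double precision."""
    worst = 0.0
    for i, qi in enumerate(q):
        for j, qj in enumerate(q):
            d = sum(x * y for x, y in zip(qi, qj))
            worst = max(worst, abs(d - (1.0 if i == j else 0.0)))
    return worst


# Assumed test matrix: the first 4 columns of the 8x8 Hilbert matrix.
columns = [[1.0 / (i + j + 1) for i in range(8)] for j in range(4)]
loss_double = orthogonality_loss(gram_schmidt(columns, identity))
loss_single = orthogonality_loss(gram_schmidt(columns, to_f32))
```

Comparing `loss_single` with `loss_double` gives a rough count of the digits of orthogonality lost when the same algorithm runs in the lower precision.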
ISBN: 9783642193279 (print)
Peer-to-Peer (P2P) computing, the harnessing of idle compute cycles over the Internet, offers new research challenges in the domain of distributed computing. In this paper, we propose an efficient computing resource discovery mechanism based on a balanced multi-way tree structure capable of supporting both exact and range queries efficiently. A rebalancing algorithm is also proposed. By means of simulation, we evaluated our proposal against other approaches from the literature. Our results reveal the good performance of our proposals.
ISBN: 9780878492411 (print)
Finite elements are used extensively to solve various problems in engineering fields, driven by the growth of computing technologies. However, there is a lack of methodology for the analysis of huge assembled structures. The mechanics at the interfaces between components, for instance contact, bolted joints, and welds in an assembly, is a key issue for important huge structures such as nuclear power plants. It is also well known that as finite element models become large and complex, the construction of a detailed mesh becomes a bottleneck in CAE procedures. To solve these problems, the authors introduce a component-wise meshing approach and a bonding strategy on the interfaces of components. To assemble component-wise meshes, the penalty method is introduced not only to constrain the displacements but also to introduce a classical spring connection on the joint interface, although the penalty method is often claimed to be unsuitable for iterative solvers. In this paper, the convergence performance of an iterative solver with the penalty method is investigated, and the detailed component-wise distributed computation scheme is described with numerical examples.
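The trade-off behind the penalty method can be seen on a two-degree-of-freedom toy model. The sketch below is an assumption-laden illustration rather than the scheme of the paper: component A is a single spring of stiffness `k1` between a wall and node u1, component B is a free node u2 carrying a load, and the interface constraint u1 = u2 is enforced by a penalty spring. All names are hypothetical.

```python
import math


def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return x0, x1


def penalty_stiffness(k1, penalty):
    """Assembled stiffness: component spring k1 plus the penalty spring."""
    return [[k1 + penalty, -penalty],
            [-penalty, penalty]]


def tie_with_penalty(k1, load, penalty):
    """Displacements (u1, u2) when the load acts on node u2."""
    return solve_2x2(penalty_stiffness(k1, penalty), [0.0, load])


def condition_number(a):
    """Eigenvalue ratio of a symmetric positive definite 2x2 matrix."""
    mean = 0.5 * (a[0][0] + a[1][1])
    radius = math.sqrt((0.5 * (a[0][0] - a[1][1])) ** 2 + a[0][1] ** 2)
    return (mean + radius) / (mean - radius)


# A larger penalty closes the interface gap but worsens the conditioning.
gaps = []
conds = []
for penalty in (1e1, 1e3, 1e5):
    u1, u2 = tie_with_penalty(1.0, 1.0, penalty)
    gaps.append(u2 - u1)
    conds.append(condition_number(penalty_stiffness(1.0, penalty)))
```

In this model the gap equals the load divided by the penalty value, so a finite penalty behaves like a spring connection, while the growing condition number is the usual reason given for slow convergence of iterative solvers.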
ISBN: 9780769544151 (print)
In this paper, an original parallel domain decomposition method for ray-tracing is proposed to solve numerical acoustic problems on multi-core and multi-processor computers. A hybrid method between the ray-tracing and the beam-tracing method is first introduced. Then, a new parallel method based on domain decomposition principles is proposed. This method handles large-scale open domains for parallel computing purposes better than other existing methods. Parallel numerical experiments, carried out on a real-world problem, namely the acoustic pollution analysis within a large city, illustrate the performance of this new domain decomposition method.