ISBN (digital): 9783319111940
ISBN (print): 9783319111940; 9783319111933
Matrix eigenvalue theory has become an important analysis tool in scientific computing. Often only the maximum eigenvalue is needed rather than all eigenvalues. Existing algorithms for finding the maximum eigenvalue of a matrix are implemented sequentially. As matrix orders increase, the computational workload grows heavier, so traditional sequential methods cannot meet the need for fast calculation on large matrices. This paper proposes a parallel algorithm named PA-ST that finds the maximum eigenvalue of positive matrices by similarity transformation and is implemented with CUDA (Compute Unified Device Architecture) on a GPU (Graphics Processing Unit). To the best of our knowledge, this is the first CUDA-based parallel algorithm for calculating the maximum eigenvalue of matrices. To improve performance, several optimization techniques are applied: using shared memory rather than global memory to speed up computation, avoiding bank conflicts by setting the span index, satisfying the principle of coalesced memory access, and using single-precision floating-point arithmetic together with pinned memory to reduce copy operations and obtain higher data transfer bandwidth between the host and the GPU device. The experimental results show that the similarity transformation technique significantly shortens the running time compared to the sequential algorithm, and that the speedup ratio is nearly stable as the number of iterations increases. As the matrix order increases, the running times of both the sequential algorithm and PA-ST increase correspondingly. Experiments also show that the speedup ratio of PA-ST is between 2.85 and 35.028.
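The abstract does not spell out the PA-ST iteration, so the following is only a minimal sequential sketch of one well-known diagonal similarity scheme for positive matrices, not the paper's CUDA implementation. It assumes the transformation B = D^{-1} A D with D = diag(row sums), which preserves eigenvalues while the minimum and maximum row sums bracket the maximum eigenvalue ever more tightly. The function name `perron_root_bounds` and its parameters `tol` and `max_iter` are hypothetical.

```python
def perron_root_bounds(a, tol=1e-10, max_iter=1000):
    """Bracket the maximum eigenvalue of a positive square matrix.

    Assumption: every entry of `a` is strictly positive. Each step applies
    the similarity transformation B <- D^{-1} B D with D = diag(row sums),
    which leaves the eigenvalues unchanged. For a positive matrix the
    maximum eigenvalue always lies between the smallest and largest row sum.
    """
    n = len(a)
    b = [row[:] for row in a]          # work on a copy of the input
    r = [sum(row) for row in b]        # current row sums
    for _ in range(max_iter):
        if max(r) - min(r) < tol:      # bounds have met: converged
            break
        # Entry (i, j) of D^{-1} B D is b[i][j] * r[j] / r[i].
        b = [[b[i][j] * r[j] / r[i] for j in range(n)] for i in range(n)]
        r = [sum(row) for row in b]
    return min(r), max(r)


lower, upper = perron_root_bounds([[1.0, 2.0], [3.0, 4.0]])
print(lower, upper)
```

In a GPU version, the per-row sums and the per-entry rescaling are the natural candidates for parallel execution, since each row sum is a reduction and each rescaled entry is independent of the others; how PA-ST actually maps these steps to threads and shared memory is not stated in the abstract.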
Many scientific applications are described through workflow structures. Due to the increasing level of parallelism offered by modern computing infrastructures, workflow applications now have to be composed not only of...
To provide timely results for ‘Big Data Analytics’, it is crucial to satisfy deadline requirements for MapReduce jobs in production environments. In this paper, we propose a deadline-oriented task scheduling approac...
The rapid growth in data size means that large-scale scientific databases often incur a long delay between loading data into the system and being ready to receive query requests. To solve this problem, we proposed an effic...
Compared with traditional disks, NAND Flash offers higher performance and better shock resistance. However, NAND Flash must erase old data before writing. That is why NAND Flash based Solid State Disks (SSDs) always u...