ISBN (digital): 9783031147883
ISBN (print): 9783031147883; 9783031147876
Subresultant chains over rings of multivariate polynomials are computed using a speculative approach based on the Bézout matrix. In our experiments, the proposed approach achieves significant speedup factors over comparable methods. The determinant computations rely on fraction-free Gaussian elimination with various pivoting strategies.
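As a rough illustration of fraction-free Gaussian elimination, the sketch below computes the determinant of an integer matrix with the Bareiss scheme, in which every division is exact and no fractions arise. It is not the paper's implementation: the function name `bareiss_determinant` is hypothetical, the entries are plain integers rather than multivariate polynomials, and the pivoting rule (take the first nonzero entry in the column) is an assumed placeholder for the strategies the abstract mentions but does not specify.

```python
def bareiss_determinant(matrix):
    """Determinant of a square integer matrix by fraction-free (Bareiss) elimination.

    Assumption: entries are Python ints. The pivoting rule used here
    (first nonzero entry in the column) is only the simplest possible choice.
    """
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    if n == 0:
        return 1
    sign = 1
    prev_pivot = 1  # pivot of the previous step; exact divisor in the update
    for k in range(n - 1):
        if a[k][k] == 0:
            # Look for a row below with a nonzero entry in column k.
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0  # the whole column is zero, so the matrix is singular
            a[k], a[swap] = a[swap], a[k]
            sign = -sign  # a row swap flips the sign of the determinant
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division by the previous pivot is exact for this scheme.
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev_pivot
        prev_pivot = a[k][k]
    return sign * a[n - 1][n - 1]


print(bareiss_determinant([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
```

Because intermediate entries stay in the coefficient ring, the same recurrence carries over to polynomial entries if the integer division is replaced by exact polynomial division.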
ISBN (print): 9781618397881
A multithreaded solution for the 2D Poisson equation is presented. The proposed algorithm divides the tasks among threads into Floating-Point Unit (FPU) intensive and non-FPU-intensive groups. This technique also allowed us to make the communication between nodes asynchronous. Decoupling communication from computation in this way allows for much greater scalability. The new multithreaded approach showed better performance on all multicore processors tested. On the distributed systems tested, the proposed method achieved greater speed-up than the classical scheme. Red/black ordering was found to be effective only if the data fit entirely in cache memory.
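For readers unfamiliar with red/black ordering, the following is a minimal serial sketch of one way it is commonly applied to the 2D Poisson equation: a Gauss-Seidel sweep that updates all "red" points (i + j even) before all "black" points (i + j odd). It does not reproduce the multithreaded or asynchronous communication scheme described in the abstract. The function names, the sign convention -Δu = f, the uniform grid spacing `h`, and the five-point stencil are all assumptions made for illustration.

```python
def red_black_sweep(u, f, h):
    """One red/black Gauss-Seidel sweep for -laplace(u) = f (assumed convention).

    u and f are square lists of lists; the outer ring of u holds fixed
    boundary values and is never modified.
    """
    n = len(u)
    for color in (0, 1):  # 0: points with i + j even, 1: points with i + j odd
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 == color:
                    # Five-point stencil: all four neighbours have the other color,
                    # so points of one color can be updated independently.
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1]
                                      + h * h * f[i][j])


def solve(u, f, h, sweeps):
    """Apply a fixed number of sweeps in place and return u."""
    for _ in range(sweeps):
        red_black_sweep(u, f, h)
    return u


# Hypothetical example: zero source term, boundary held at 1, interior starts at 0.
n = 5
u = [[1.0 if i in (0, n - 1) or j in (0, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]
f = [[0.0] * n for _ in range(n)]
solve(u, f, h=1.0 / (n - 1), sweeps=200)
```

Within one color no point depends on another, which is what makes the ordering attractive for parallel execution; the strided access pattern it implies is also a plausible reason for the cache sensitivity the abstract reports.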
ISBN (print): 9781605586069
This paper introduces a storage format for sparse matrices, called compressed sparse blocks (CSB), which allows both Ax and A^T x to be computed efficiently in parallel, where A is an n × n sparse matrix with nnz ≥ n nonzeros and x is a dense n-vector. Our algorithms use Θ(nnz) work (serial running time) and Θ(√n lg n) span (critical-path length), yielding a parallelism of Θ(nnz/√n lg n), which is amply high for virtually any large matrix. The storage requirement for CSB is essentially the same as that for the more standard compressed sparse rows (CSR) format, for which computing Ax in parallel is easy but computing A^T x is difficult. Benchmark results indicate that on one processor, the CSB algorithms for Ax and A^T x run just as fast as the CSR algorithm for Ax, but the CSB algorithms also scale up linearly with processors until limited by off-chip memory bandwidth.
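The asymmetry between Ax and A^T x in the CSR format can be seen in a small serial sketch. This is not the CSB format itself, whose layout the abstract does not describe; it only illustrates the standard CSR baseline. The helper names (`to_csr`, `csr_matvec`, `csr_rmatvec`) are hypothetical.

```python
def to_csr(dense):
    """Convert a dense list-of-lists matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's nonzeros
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """y = A x. Each output y[i] is written by exactly one row loop, so rows
    could be handed to different threads without any write conflicts."""
    n_rows = len(row_ptr) - 1
    y = [0] * n_rows
    for i in range(n_rows):
        total = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * x[col_idx[k]]
        y[i] = total
    return y


def csr_rmatvec(values, col_idx, row_ptr, x, n_cols):
    """y = A^T x. Each row scatters into y[col_idx[k]], so different rows may
    update the same output entry; run in parallel, these updates would conflict."""
    y = [0] * n_cols
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[col_idx[k]] += values[k] * x[i]
    return y


A = [[1, 0, 2],
     [0, 3, 0],
     [4, 0, 5]]
vals, cols, ptr = to_csr(A)
x = [1, 2, 3]
print(csr_matvec(vals, cols, ptr, x))
print(csr_rmatvec(vals, cols, ptr, x, n_cols=3))
```

The row-wise gather in `csr_matvec` parallelizes trivially, while the column-wise scatter in `csr_rmatvec` would need atomics, per-thread buffers, or an explicit transpose, which is the difficulty the abstract attributes to CSR.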