Compensated algorithms consist in computing the rounding errors of individual operations and then adding them to the computed result later on. This makes it possible to increase the accuracy of the computed result efficiently. Computing the rounding error of an individual operation is possible through the use of a so-called error-free transformation. In this article, we show that it is possible to validate the results of compensated algorithms using stochastic arithmetic. We study compensated algorithms for summation, dot product, and polynomial evaluation. We prove that the random rounding mode inherent to stochastic arithmetic does not significantly change the accuracy of compensated methods. This is because, under random rounding, error-free transformations are no longer exact but remain sufficiently accurate to improve the numerical quality of results. (C) 2018 Elsevier Inc. All rights reserved.
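As background for the compensated methods studied in these papers, here is a minimal sketch of an error-free transformation and of compensated summation in the style of Ogita, Rump, and Oishi (Python floats are IEEE 754 binary64 with rounding to nearest; the function names are illustrative, not taken from the papers):

```python
def two_sum(a, b):
    """Error-free transformation (Knuth): returns (s, e) with
    s = fl(a + b) and s + e = a + b exactly, for any a, b."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e

def sum2(p):
    """Compensated summation: accumulate the rounding error of every
    addition and add the total correction to the result at the end."""
    s, e = p[0], 0.0
    for x in p[1:]:
        s, q = two_sum(s, x)
        e += q
    return s + e
```

For example, `sum2([1e16, 1.0, -1e16])` returns the exact value 1.0, whereas naive left-to-right summation returns 0.0 because the 1.0 is absorbed by 1e16.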
Compensated algorithms consist in computing the rounding errors of individual operations and then adding them to the computed result later on. This makes it possible to increase the accuracy of the computed result efficiently. Computing the rounding error of an individual operation is possible through the use of a so-called error-free transformation. In this article, we show that compensated algorithms can be used to obtain tight interval inclusions. We study compensated algorithms for summation, dot product, and polynomial evaluation. We prove that the use of directed rounding makes it possible to obtain narrow inclusions with compensated algorithms. This is because, under directed rounding, error-free transformations are no longer exact but remain sufficiently accurate to improve the numerical quality of results.
Compensated summation algorithms are designed to improve the accuracy of ill-conditioned sums. They are based on algorithms, such as FastTwoSum, which are proved to provide, with rounding to nearest, the sum of two floating-point numbers together with the associated rounding error. Discrete stochastic arithmetic enables one to estimate rounding error propagation in numerical codes. It requires a random rounding mode, which consists in rounding each computed result toward -infinity or +infinity with the same probability. In this paper we analyse the impact of this random rounding mode on compensated summations based on the FastTwoSum algorithm. We show the accuracy improvement obtained with such compensated summations in numerical simulations controlled with discrete stochastic arithmetic.
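The FastTwoSum algorithm referred to above can be sketched in a few lines (Python floats are IEEE 754 binary64 with rounding to nearest; enforcing the precondition |a| >= |b| is left to the caller here):

```python
def fast_two_sum(a, b):
    """Dekker's FastTwoSum: assuming |a| >= |b| and rounding to nearest,
    returns (s, e) with s = fl(a + b) and s + e = a + b exactly."""
    s = a + b
    e = b - (s - a)  # the exact rounding error of the addition
    return s, e
```

For instance, `fast_two_sum(1.0, 2**-60)` returns `(1.0, 2**-60)`: the small term is entirely lost in the rounded sum, and `e` recovers it exactly. With a directed (or random) rounding mode, `e` is no longer exactly the rounding error, which is precisely the situation this paper analyses.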
The aim of this paper is to improve parallel algorithms that obtain higher precision in floating-point reduction-type operations while working within the basic floating-point type. The compensated parallel variants of summation and dot product operations for floating-point vectors are considered (level-1 BLAS operations). The methods are based on the work of Rump, Ogita and Oishi. Parallel implementations in block and pairwise reduction variants are considered. Analytical error bounds are obtained for all variants of the parallel algorithms, for real and complex-valued vectors represented by floating-point numbers according to the IEEE 754 (IEC 60559) standard. The algorithms are written in C++ Compute Unified Device Architecture (CUDA) for Graphics Processing Units (GPUs) and their accuracy is tested for different vector sizes and different condition numbers. The suggested compensated variant is compared to a multiple-precision library for GPUs in terms of efficiency. The designed algorithms are tested in Krylov-type matrix-based methods with preconditioners originating from different challenging computational problems. It is shown that the compensated variant of the algorithms accelerates convergence and yields more accurate results even when the matrix operations are in base precision. (C) 2022 Elsevier B.V. All rights reserved.
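A serial sketch of the compensated dot product underlying these parallel variants (Rump-Ogita-Oishi style; Python, IEEE 754 binary64). The error-free product is built with Dekker's splitting rather than an FMA so the example stays self-contained; the names are illustrative, not the paper's CUDA implementation:

```python
_SPLITTER = 134217729.0  # 2**27 + 1, Dekker's splitter for binary64

def _split(a):
    """Split a into hi + lo, each exactly representable in ~26 bits."""
    c = _SPLITTER * a
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    """Error-free product: returns (p, e) with p = fl(a*b), p + e = a*b."""
    p = a * b
    a_hi, a_lo = _split(a)
    b_hi, b_lo = _split(b)
    e = ((a_hi * b_hi - p) + a_hi * b_lo + a_lo * b_hi) + a_lo * b_lo
    return p, e

def two_sum(a, b):
    """Error-free sum (Knuth): s = fl(a + b), s + e = a + b exactly."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e

def dot2(x, y):
    """Compensated dot product: result as accurate as if computed in
    doubled working precision and rounded back once at the end."""
    p, s = two_prod(x[0], y[0])
    for xi, yi in zip(x[1:], y[1:]):
        h, r = two_prod(xi, yi)
        p, q = two_sum(p, h)
        s += q + r
    return p + s
```

For example, `dot2([1e8 + 1.0, 1.0], [1e8 - 1.0, -1e16])` returns the exact value -1.0, which is lost by a naive dot product. The parallel block and pairwise variants studied in the paper distribute this loop across threads and combine the partial `(p, s)` pairs.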
ISBN (print): 9798350319224
A parallel algorithm for accurate polynomial evaluation is proposed for SIMD architectures. It is a parallelized version of the compensated Horner scheme using error-free transformations. The proposed parallel algorithm is fast and is designed to achieve a result as if computed in twice the working precision and then rounded to the working precision. Numerical results are presented showing the performance of this new parallel algorithm.
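A serial sketch of the compensated Horner scheme being parallelized (in the style of Graillat, Langlois, and Louvet; Python, IEEE 754 binary64). The error-free transformations are inlined with Dekker splitting so the example is self-contained; this is an illustration of the technique, not the paper's SIMD code:

```python
_SPLITTER = 134217729.0  # 2**27 + 1, Dekker's splitter for binary64

def _split(a):
    c = _SPLITTER * a
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    """Error-free product: p = fl(a*b), p + e = a*b exactly."""
    p = a * b
    a_hi, a_lo = _split(a)
    b_hi, b_lo = _split(b)
    e = ((a_hi * b_hi - p) + a_hi * b_lo + a_lo * b_hi) + a_lo * b_lo
    return p, e

def two_sum(a, b):
    """Error-free sum: s = fl(a + b), s + e = a + b exactly."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e

def comp_horner(coeffs, x):
    """Compensated Horner scheme: evaluates the polynomial with
    coefficients coeffs (highest degree first) at x, as if computed in
    twice the working precision and rounded once at the end."""
    s = coeffs[0]
    c = 0.0  # Horner evaluation of the error polynomial
    for a in coeffs[1:]:
        p, pi = two_prod(s, x)    # s*x and its rounding error
        s, sigma = two_sum(p, a)  # add coefficient, keep its error
        c = c * x + (pi + sigma)
    return s + c
```

For example, evaluating (x - 1)^3 = x^3 - 3x^2 + 3x - 1 at x = 1 + 2^-20 is severely ill-conditioned near the root; `comp_horner` recovers the exact value 2^-60 to high relative accuracy, while plain Horner loses all significant digits.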
ISBN (print): 9781509016150
HPC simulations suffer from failures of numerical reproducibility because of the peculiarities of floating-point arithmetic: different computing distributions of a parallel computation may yield different numerical results. We are interested in a finite element computation of hydrodynamic simulations within the openTelemac software, where parallelism is provided by domain decomposition. One main task in a finite element simulation consists in building one large linear system and solving it. Here the building step relies on an element-by-element storage mode, and the solving step applies the conjugate gradient algorithm; the subdomain parallelism is merged within these steps. We study why reproducibility fails in this process and which operations have to be corrected. We detail how to use compensation techniques to compute a numerically reproducible resolution. We illustrate this approach with a reproducible version of one test case provided by the openTelemac software suite.
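The effect being corrected can be seen on a toy example: summing the same terms in two orders, as two different domain decompositions would, gives two different floating-point results, while a compensated sum (Sum2 built on the TwoSum error-free transformation; a minimal Python sketch, not the openTelemac implementation) absorbs the lost low-order terms. Compensation alone does not guarantee reproducibility in general, but on data like this it recovers the same, exact value for both orders:

```python
def two_sum(a, b):
    """Error-free sum (Knuth): s = fl(a + b), s + e = a + b exactly."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e

def sum2(p):
    """Compensated summation: add the accumulated rounding errors back."""
    s, e = p[0], 0.0
    for x in p[1:]:
        s, q = two_sum(s, x)
        e += q
    return s + e

data = [1e16, 1.0, 1.0]       # one ordering of the terms
reordered = [1.0, 1.0, 1e16]  # another ordering, e.g. another domain split

# Naive left-to-right summation is order-dependent:
# sum(data) gives 1e16, sum(reordered) gives 1e16 + 2.
# sum2 returns the exact value 1e16 + 2 for both orders.
```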