ISBN (print): 9781479948338
In this paper, we present a new (4,2)-way Toom-Cook algorithm using finite field interpolation. The proposed algorithm uses a multi-evaluation scheme to construct a digit-serial multiplier over GF(2^m) with subquadratic space complexity. Theoretical analysis shows that the proposed architecture has O(m^(log_4 5)) space complexity and O(m^(log_4 2)) latency, significantly less than that of traditional digit-serial multipliers.
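As a concrete illustration of the split/evaluate/recombine idea behind such (k,2)-way multipliers, the sketch below implements carry-less polynomial multiplication over GF(2) with a simple 2-way (Karatsuba) split. It is not the paper's (4,2)-way construction or its digit-serial architecture; the function names and the bit-packed polynomial representation are choices made only for this example.

# Minimal sketch (not the paper's architecture): carry-less polynomial
# multiplication over GF(2), with a 2-way (Karatsuba) split to illustrate
# the split/evaluate/recombine idea behind (k,2)-way Toom-Cook multipliers.
# Polynomials are bit-packed into Python ints: bit i is the coefficient of x^i.
import random

def clmul(a: int, b: int) -> int:
    """Schoolbook carry-less multiplication in GF(2)[x]."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def karatsuba_gf2(a: int, b: int, threshold: int = 64) -> int:
    """2-way split a = a1*x^k + a0, b = b1*x^k + b0: three half-size products."""
    n = max(a.bit_length(), b.bit_length())
    if n <= threshold:
        return clmul(a, b)
    k = n // 2
    mask = (1 << k) - 1
    a0, a1 = a & mask, a >> k
    b0, b1 = b & mask, b >> k
    lo = karatsuba_gf2(a0, b0)             # a0*b0
    hi = karatsuba_gf2(a1, b1)             # a1*b1
    mid = karatsuba_gf2(a0 ^ a1, b0 ^ b1)  # (a0+a1)*(b0+b1); '+' is XOR in GF(2)
    return (hi << (2 * k)) ^ ((mid ^ lo ^ hi) << k) ^ lo

# Sanity check against schoolbook multiplication.
for _ in range(100):
    a, b = random.getrandbits(300), random.getrandbits(300)
    assert karatsuba_gf2(a, b) == clmul(a, b)

In the same spirit, a 4-way split that can be recombined from five sub-products is what yields the m^(log_4 5) growth quoted in the abstract.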
This article proposes a solution to the factorization problem in cryptographic systems by leveraging the steps of the Toom-Cook algorithm for large-number multiplication. The approach can factor a 200-bit number, with performance varying with available memory and processing power. Experiments demonstrate that the factorization problem in cryptography can be solved more efficiently by employing algorithms designed for fast, straightforward multiplication of large numbers. Examples include the Schönhage-Strassen algorithm, which is based on polynomials and Fourier transforms, the Fürer algorithm, the second Schönhage-Strassen algorithm using modular arithmetic, and Karatsuba's algorithm. This advancement significantly impacts modern computing and cryptography, enhancing both security and reliability. The proposed technique was extensively tested through MATLAB simulations. Experimental results indicate improvements of 91% in efficiency and 95% in accuracy compared to state-of-the-art techniques.
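The core of Toom-Cook large-number multiplication that the article builds on is the evaluate / pointwise-multiply / interpolate pattern. The sketch below shows one level of Toom-3 on nonnegative Python integers, using the common evaluation points 0, 1, -1, -2 and infinity with Bodrato's interpolation sequence; the pointwise products fall back to Python's built-in multiplication rather than recursing, and the factorization procedure itself is not reproduced here.

# One-level Toom-3 sketch of the evaluate / pointwise-multiply / interpolate
# pattern used for fast large-number multiplication. Only the structure is
# illustrated; a real implementation would recurse on the pointwise products.
import random

def toom3_once(x: int, y: int) -> int:
    if x.bit_length() < 96 or y.bit_length() < 96:
        return x * y                       # too small to be worth splitting
    k = (max(x.bit_length(), y.bit_length()) + 2) // 3
    mask = (1 << k) - 1
    x0, x1, x2 = x & mask, (x >> k) & mask, x >> (2 * k)
    y0, y1, y2 = y & mask, (y >> k) & mask, y >> (2 * k)
    # Evaluate both 3-limb operands at 0, 1, -1, -2 and infinity.
    p1, q1 = x0 + x1 + x2, y0 + y1 + y2                    # at  1
    pm1, qm1 = x0 - x1 + x2, y0 - y1 + y2                  # at -1
    pm2, qm2 = (pm1 + x2) * 2 - x0, (qm1 + y2) * 2 - y0    # at -2
    # Pointwise products of the evaluated operands.
    w0, w1, wm1, wm2, winf = x0 * y0, p1 * q1, pm1 * qm1, pm2 * qm2, x2 * y2
    # Interpolate the five product-polynomial coefficients (divisions are exact).
    r3 = (wm2 - w1) // 3
    r1 = (w1 - wm1) // 2
    r2 = wm1 - w0
    r3 = (r2 - r3) // 2 + 2 * winf
    r2 = r2 + r1 - winf
    r1 = r1 - r3
    # Recompose: result = sum_i r_i * 2^(i*k).
    return w0 + (r1 << k) + (r2 << (2 * k)) + (r3 << (3 * k)) + (winf << (4 * k))

# Sanity check against built-in multiplication.
a, b = random.getrandbits(600), random.getrandbits(600)
assert toom3_once(a, b) == a * b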
Popular deep neural networks (DNNs) spend the majority of their execution time computing convolutions. The Winograd family of algorithms can greatly reduce the number of arithmetic operations required and is used in many DNN software frameworks. However, the performance gain comes at the expense of a reduction in floating point (FP) numerical accuracy. In this article, we analyse the worst-case FP error and derive an estimate of the norm and conditioning of the algorithm. We show that the error bound grows exponentially with the size of the convolution. Further, the error bound of the modified algorithm is slightly lower but still exponential. We propose several methods for reducing FP error, including a canonical evaluation ordering based on Huffman coding that reduces summation error. We study the selection of sampling "points" experimentally, find empirically good points for the most important sizes, and identify the main factors associated with good points. In addition, we explore other methods to reduce FP error, including mixed-precision convolution and pairwise summation across DNN channels. Using our methods, we can significantly reduce FP error for a given block size, which allows larger block sizes to be used and reduces computation.
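One of the error-reduction techniques listed above, pairwise summation, can be shown in isolation: splitting a long sum into a balanced tree keeps the depth of the accumulation at O(log n) instead of O(n), which bounds the worst-case rounding error more tightly. The sketch below is a generic standalone version, not the paper's per-channel implementation.

# Minimal illustration of pairwise summation, one of the floating-point
# error-reduction techniques mentioned above (applied here to a plain list
# of doubles rather than across DNN channels).
import math
import random

def pairwise_sum(xs):
    """Recursively split the operands so the summation tree has O(log n) depth."""
    n = len(xs)
    if n <= 2:
        return sum(xs)
    mid = n // 2
    return pairwise_sum(xs[:mid]) + pairwise_sum(xs[mid:])

random.seed(0)
data = [random.uniform(-1.0, 1.0) for _ in range(1 << 16)]
exact = math.fsum(data)                  # correctly rounded reference
naive = sum(data)                        # left-to-right accumulation
paired = pairwise_sum(data)
print(f"naive error:    {abs(naive - exact):.3e}")
print(f"pairwise error: {abs(paired - exact):.3e}")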
Convolutional neural networks (CNNs) have dramatically improved the accuracy of image, video, and audio processing for tasks such as object recognition, image segmentation, and interactive speech systems. CNNs require large amounts of computing resources for both training and inference, primarily because the convolution layers are computationally intensive. Fast convolution algorithms such as Winograd convolution can greatly reduce the computational cost of these layers. However, Winograd convolution has poor numeric properties: greater savings in computation cause exponentially increasing floating point errors. A defining feature of each Winograd convolution algorithm is the set of real-valued points where polynomials are sampled. The choice of points impacts the numeric accuracy of the algorithm, but the optimal set of points for small convolutions remains unknown. Existing work considers only small integers and simple fractions as candidate points. In this work, we propose a novel approach to point selection using points of the form {-1/c, -c, c, 1/c}, drawing c from the full range of real-valued numbers. We show that groups of this form cause cancellations in the Winograd transform matrices that reduce numeric error. We find empirically that the error for different values of c forms a rough curve across the range of real-valued numbers, so it is possible to localize the values of c that lead to lower error. We show that it is not necessary to choose integers or simple fractions as evaluation points, and that lower errors can be achieved with non-obvious real-valued points. We study a range of sizes for small convolutions and achieve reductions in error ranging from 2% to around 59% for both 1D and 2D convolution, compared to the state of the art. Furthermore, we identify patterns in which selecting a subset of our proposed points always leads to lower error. Finally, we implement a complete Winograd convolution layer and use it to run st...
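For reference, the sketch below runs the classic 1D Winograd F(2,3) algorithm with the transform matrices derived from the conventional points {0, 1, -1, inf} and checks it against direct correlation. The point sets proposed above, such as {-1/c, -c, c, 1/c}, would change the entries of B^T, G and A^T (via the usual Toom-Cook construction) but not the structure of the computation shown here.

# Sketch of 1D Winograd convolution F(2,3) with the classic transform matrices
# (derived from the conventional points {0, 1, -1, inf}); choosing other
# evaluation points changes B, G and A but not the structure of the computation.
import numpy as np

BT = np.array([[1, 0, -1, 0],
               [0, 1,  1, 0],
               [0, -1, 1, 0],
               [0, 1,  0, -1]], dtype=np.float32)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]], dtype=np.float32)
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_f23(d, g):
    """Two outputs of the correlation of a 4-element tile d with a 3-tap filter g."""
    return AT @ ((G @ g) * (BT @ d))     # elementwise product of transformed tiles

rng = np.random.default_rng(0)
d = rng.standard_normal(4).astype(np.float32)
g = rng.standard_normal(3).astype(np.float32)
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
print(winograd_f23(d, g), direct)        # should agree up to float32 rounding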