Search Results - Inner Mongolia University Library

An Efficient im2row-Based Fast Convolution Algorithm for ARM Cortex-M MCUs

IEEE ACCESS, 2021, vol. 9, pp. 124384-124395

Authors: Wang, Peng; Wang, Xiaoqin; Luo, Rui; Wang, Dingyi; Luo, Mengjie; Qiao, Shushan; Zhou, Yumei. Affiliations: Chinese Acad Sci, Inst Microelect, Beijing 100029, Peoples R China; Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China

With the rise of IoT and edge computing, deploying neural networks (NNs) on low-power edge computing devices is drawing more and more attention. In NNs, convolutional layers take up the majority of the computing cycles, especially when NNs are implemented on ARM processors. Therefore, it is necessary to optimize the convolutional implementation on ARM Cortex-M MCUs. This paper proposes an efficient im2row-based fast convolution algorithm with two innovations. First, a novel im2row method for reusing the data of adjacent convolutional windows is presented. This method utilizes a reusable im2row buffer for data reuse, significantly reducing the amount of data copied during im2row and improving efficiency. Second, in algorithm implementation, a q7_t to q15_t data type extension technique that avoids data reordering is employed. This technique eliminates data reordering instructions, thus reducing the runtime of the algorithm. We evaluate our algorithm in separate convolutional layers and NNs. The results for convolutional layers show that, compared to baseline, the proposed algorithm speeds up the convolutional layer by an average of 1.42x, and the maximum speedup is up to 2.9x. Experiments on different NNs demonstrate that our algorithm can speed up the overall NN by up to 2.15x.
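A minimal sketch of plain im2row-based convolution may help make the abstract concrete. It is not the paper's algorithm: the function names `im2row` and `conv2d_im2row` are hypothetical, and the sketch assumes a single-channel image, stride 1, and no padding. It copies every window in full, which is the redundant copying that the reusable buffer described above is meant to reduce.

```python
def im2row(image, kh, kw):
    """Copy every kh x kw window of a 2D image into one flat row.

    Assumes stride 1 and no padding (illustrative choice, not from the paper).
    """
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def conv2d_im2row(image, kernel):
    """Convolution (cross-correlation, as in NN layers) as row-by-kernel dot products."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [v for row in kernel for v in row]
    out_w = len(image[0]) - kw + 1
    dots = [sum(a * b for a, b in zip(row, flat_kernel))
            for row in im2row(image, kh, kw)]
    # Reshape the flat list of dot products back into a 2D output map.
    return [dots[k:k + out_w] for k in range(0, len(dots), out_w)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
result = conv2d_im2row(image, kernel)
```

Adjacent rows produced by `im2row` share most of their entries because neighbouring windows overlap; that overlap is what a data-reuse scheme can exploit.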

Keywords: im2row (or im2col); fast convolution algorithm; edge AI; embedded software; ARM Cortex-M

A second-order energy stable and nonuniform time-stepping scheme for time fractional Burgers' equation

COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2022, vol. 123, no. 0, pp. 227-240

Authors: Shen, Jin-ye; Ren, Jincheng; Chen, Shanzhen. Affiliations: Southwestern Univ Finance & Econ, Sch Math, Chengdu 611130, Sichuan, Peoples R China; Henan Univ Econ & Law, Coll Math & Informat Sci, Zhengzhou 450046, Henan, Peoples R China

In this paper, an effective finite difference scheme of high-order accuracy is proposed for the nonlinear time fractional Burgers' equation. Specifically, we apply Alikhanov's scheme on a graded mesh in the temporal direction and a novel fourth-order compact scheme in the spatial discretization. The proposed scheme resolves the initial weak singularity of the solution and preserves high resolution in the space direction. It is rigorously proved that the finite difference scheme is uniquely solvable, satisfies a discrete variational energy dissipation law, and is unconditionally stable and convergent in the sense of the discrete $L^2$-norm. With an appropriate choice of the grading parameter, the convergence accuracy is of order $\min\{r\alpha, 2\}$ in time and fourth order in space, where $r$ is the mesh grading. In the numerical implementation, a fast convolution technique and an adaptive time-stepping strategy are adopted to accelerate the presented solver and to capture the evolution of the solution. Numerical experiments are carried out to verify the validity and effectiveness of the proposed scheme for solving the nonlinear time-fractional Burgers' equation.
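As an illustration of the graded mesh and the stated temporal order, the sketch below assumes the commonly used form t_k = T (k/N)^r, which the abstract does not spell out. The names `graded_mesh` and `temporal_order` are hypothetical, and `alpha` is taken to be the symbol appearing in the abstract's order expression.

```python
def graded_mesh(T, N, r):
    """Time points t_k = T * (k / N) ** r for k = 0..N (assumed mesh form).

    r = 1 gives a uniform mesh; r > 1 clusters points near t = 0,
    where the solution has its initial weak singularity.
    """
    return [T * (k / N) ** r for k in range(N + 1)]


def temporal_order(r, alpha):
    """Temporal convergence order min{r * alpha, 2} quoted in the abstract."""
    return min(r * alpha, 2)


mesh = graded_mesh(1.0, 4, 2.0)
steps = [b - a for a, b in zip(mesh, mesh[1:])]
```

With this form, the step sizes grow away from t = 0, and choosing r of at least 2/alpha is what makes `temporal_order` reach 2.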

Keywords: Fractional Burgers' equation; Weak singularity; A second-order scheme; Graded mesh; fast convolution algorithm

High accuracy error estimates of a Galerkin finite element method for nonlinear time fractional diffusion equation

NUMERICAL METHODS FOR PARTIAL DIFFERENTIAL EQUATIONS, 2020, vol. 36, no. 2, pp. 284-301

Authors: Ren, Jincheng; Shi, Dongyang; Vong, Seakweng. Affiliations: Henan Univ Econ & Law, Coll Math & Informat Sci, Zhengzhou 450045, Henan, Peoples R China; Zhengzhou Univ, Sch Math & Stat, Zhengzhou, Peoples R China; Univ Macau, Dept Math, Macau, Peoples R China

In this work, an effective and fast finite element numerical method with high-order accuracy is discussed for solving a nonlinear time fractional diffusion equation. A two-level linearized finite element scheme is constructed, and a temporal-spatial error splitting argument is established to split the error into two parts: the temporal error and the spatial error. Based on the regularity of the time-discrete system, the temporal error estimate is derived. Using the property of the Ritz projection operator, the spatial error is deduced. An unconditional superclose result in the $H^1$-norm is obtained, with no additional regularity assumption on the exact solution of the problem considered. The global superconvergence error estimate is then obtained through the interpolated postprocessing technique. To reduce storage and computation time, a fast finite element evaluation scheme for solving the nonlinear time fractional diffusion equation is developed. To confirm the theoretical error analysis, some numerical results are provided.

Keywords: fast convolution algorithm; Galerkin finite element method; nonlinear time fractional diffusion equation; superconvergent result

An efficient FDTD algorithm for 2D/3D time fractional Maxwell's system

APPLIED MATHEMATICS LETTERS, 2021, vol. 116, p. 106992

Authors: Bai, Xixian; Rui, Hongxing. Affiliation: Shandong Univ, Sch Math, Jinan 250100, Peoples R China

A fast, easily implemented, and highly efficient algorithm for the time fractional Maxwell's system is constructed. The algorithm is based on the recently developed sum-of-exponentials (SOE) approximation and the Finite-Difference Time-Domain (FDTD) method. A particular feature of the proposed algorithm is that it achieves high efficiency with no loss in accuracy. The computing process of the algorithm is derived in detail. Numerical experiments in 2D and 3D are presented to verify the efficiency and correctness of the proposed algorithm.
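The reason a sum-of-exponentials kernel allows fast evaluation can be sketched as follows. This is a generic illustration, not the paper's scheme: the weights and rates are arbitrary values chosen for the example rather than an actual SOE approximation of a fractional kernel, and both function names are hypothetical. When the kernel is a sum of terms w_j exp(-s_j t), each term's history can be updated by one multiplication per step instead of re-summing all past values.

```python
import math


def direct_convolution(weights, rates, f, dt):
    """Reference O(N^2) evaluation of y_n = sum_{m<=n} k((n - m) dt) f_m."""
    def kernel(t):
        return sum(w * math.exp(-s * t) for w, s in zip(weights, rates))
    return [sum(kernel((n - m) * dt) * f[m] for m in range(n + 1))
            for n in range(len(f))]


def soe_convolution(weights, rates, f, dt):
    """O(N * J) evaluation using one running history value per exponential."""
    decay = [math.exp(-s * dt) for s in rates]
    history = [0.0] * len(rates)
    out = []
    for value in f:
        # H_j^n = exp(-s_j dt) * H_j^{n-1} + f_n: old samples decay, new one enters.
        history = [d * h + value for d, h in zip(decay, history)]
        out.append(sum(w * h for w, h in zip(weights, history)))
    return out


# Illustrative data only (not taken from the paper).
weights, rates = [0.5, 0.3], [1.0, 4.0]
f = [1.0, 2.0, 0.5, -1.0, 3.0]
slow = direct_convolution(weights, rates, f, 0.1)
fast = soe_convolution(weights, rates, f, 0.1)
```

Both functions return the same values up to rounding, while the recurrence stores only one number per exponential term rather than the whole history.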

Keywords: Maxwell's equations; Dispersive medium; Cole-Cole model; Finite-Difference Time-Domain Method; fast convolution algorithm

Fast evaluation and high accuracy finite element approximation for the time fractional subdiffusion equation

NUMERICAL METHODS FOR PARTIAL DIFFERENTIAL EQUATIONS, 2018, vol. 34, no. 2, pp. 705-730

Authors: Ren, Jincheng; Mao, Shipeng; Zhang, Jiwei. Affiliations: Henan Univ Econ & Law, Coll Math & Informat Sci, Zhengzhou, Henan, Peoples R China; Chinese Acad Sci, Inst Computat Math & Sci Engn Comp, LSEC, AMSS, Beijing 100190, Peoples R China; Univ Chinese Acad Sci, Sch Math Sci, Beijing, Peoples R China; Beijing Computat Sci Res Ctr, Beijing, Peoples R China

In this article, an efficient algorithm for the evaluation of the Caputo fractional derivative and the superconvergence property of the fully discrete finite element approximation for the time fractional subdiffusion equation are considered. First, the space semidiscrete finite element approximation scheme for the constant coefficient problem is derived and a supercloseness result is proved. The time discretization is based on the L1-type formula, whereas the space discretization is done using, the fully discrete scheme is developed. Under some regularity assumptions, the superconvergence estimate is proposed and analyzed. Then, the extension to the case of variable coefficients is discussed. To reduce the computational cost, a fast evaluation scheme of the Caputo fractional derivative for solving the fractional diffusion equations is designed. Finally, numerical experiments are presented to support the theoretical results.
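For orientation, the standard L1 approximation of the Caputo derivative on a uniform mesh can be sketched as below. This is the textbook form and is assumed here; the abstract only says "L1-type formula", so the paper's exact variant may differ. The function name `caputo_l1` is hypothetical.

```python
import math


def caputo_l1(u, tau, alpha):
    """L1 approximation of the Caputo derivative of order alpha in (0, 1).

    u holds samples u(0), u(tau), ..., u(n * tau); the value returned
    approximates the derivative at the last time point t_n = n * tau.
    """
    n = len(u) - 1
    total = 0.0
    for k in range(n):
        # Weight b_k = (k + 1)^(1 - alpha) - k^(1 - alpha) on the k-th backward difference.
        b_k = (k + 1) ** (1 - alpha) - k ** (1 - alpha)
        total += b_k * (u[n - k] - u[n - k - 1])
    return total * tau ** (-alpha) / math.gamma(2 - alpha)


# Check on u(t) = t, whose Caputo derivative is t^(1 - alpha) / Gamma(2 - alpha).
tau, alpha, n = 0.1, 0.5, 10
samples = [k * tau for k in range(n + 1)]
approx = caputo_l1(samples, tau, alpha)
exact = (n * tau) ** (1 - alpha) / math.gamma(2 - alpha)
```

The loop touches every earlier time level, so each step costs O(n) work and the full history must be stored; that cost is what a fast evaluation scheme aims to reduce.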

Keywords: fast convolution algorithm; finite element method; fractional subdiffusion equation; fully discrete scheme; superconvergence estimate

Fast convolution for nonreflecting boundary conditions

SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2002, vol. 24, no. 1, pp. 161-182

Authors: Lubich, C; Schädle, A. Affiliation: Univ Tubingen, Math Inst, D-72076 Tubingen, Germany

Nonreflecting boundary conditions for problems of wave propagation are nonlocal in space and time. While the nonlocality in space can be efficiently handled by Fourier or spherical expansions in special geometries, the arising temporal convolutions still form a computational bottleneck. In the present article, a new algorithm for the evaluation of these convolution integrals is proposed. To compute a temporal convolution over $N_t$ successive time steps, the algorithm requires $O(N_t \log N_t)$ operations and $O(\log N_t)$ memory. In the numerical examples, this algorithm is used to discretize the Neumann-to-Dirichlet operators arising from the formulation of nonreflecting boundary conditions in rectangular geometries for Schrödinger and wave equations.

Keywords: transparent boundary conditions; radiation boundary conditions; fast convolution algorithm; wave equation; Schrödinger equation

A fast convolution algorithm for biorthogonal wavelet image compression

JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 1999, vol. 22, no. 2, pp. 179-192

Authors: Wu, BF; Su, CY. Affiliations: Natl Chiao Tung Univ, Dept Elect & Control Engn, Hsinchu 300, Taiwan; Natl Taiwan Normal Univ, Dept Ind Educ, Taipei 106, Taiwan

Symmetric filters and symmetric extension of image edges have been widely used in wavelet image compression. Since the filters are symmetric, it is possible to take advantage of the symmetry to reduce the computational complexity of the filtering. In this paper, we present a fast convolution algorithm for the discrete wavelet transform (DWT) and the inverse DWT (IDWT) such that the transform time can be greatly reduced. Compared with regular convolution, the new algorithm can decrease the number of multiplications by nearly one half. When implemented in software, it sped up the DWT and IDWT in our experiments by at least 12% and 55%, respectively. Incorporated with enhancing zerotree coding, the proposed algorithm results in a rapid and efficient coder. Experimental results showed that the coder is competitive with other high-performance coders. The proposed convolution algorithm is also suitable for many types of wavelet-based coding, including wavelet video coding.
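The saving from filter symmetry can be illustrated with a small sketch. It shows only the generic folding idea, in which the two samples that meet equal taps are added before multiplying; it is not the paper's algorithm, which also covers symmetric edge extension and the DWT structure. The function names and test data are hypothetical.

```python
def convolve_direct(x, h):
    """Plain 'valid' filtering: len(h) multiplications per output sample."""
    n_out = len(x) - len(h) + 1
    return [sum(h[k] * x[n + k] for k in range(len(h))) for n in range(n_out)]


def convolve_symmetric(x, h):
    """Same result for a symmetric filter (h[k] == h[len(h) - 1 - k]).

    Samples paired with equal taps are added first, so each output needs
    about len(h) / 2 multiplications instead of len(h).
    """
    length = len(h)
    half = length // 2
    out = []
    for n in range(len(x) - length + 1):
        acc = sum(h[k] * (x[n + k] + x[n + length - 1 - k]) for k in range(half))
        if length % 2:
            # Odd-length filters have one unpaired centre tap.
            acc += h[half] * x[n + half]
        out.append(acc)
    return out


h = [1, 2, 3, 2, 1]            # symmetric 5-tap filter (illustrative values)
x = [1, 4, 2, 7, 3, 5, 8]
folded = convolve_symmetric(x, h)
reference = convolve_direct(x, h)
```

For this 5-tap filter the folded version uses 3 multiplications per output instead of 5, consistent with the "nearly one half" reduction quoted in the abstract.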

Keywords: fast convolution algorithm; wavelet image compression; symmetric extension; zerotree coding

A new efficient quadratic filter based on the Chen's LMS linear algorithm and its performance analysis

IEEE International Conference on Acoustics, Speech, and Signal Processing

Authors: Sayadi, M; Fnaiech, F; Sakrani, S; Najim, M. Affiliation: CEREP, ESSTT, Tunis 1008, Tunisia

ISBN: (print) 0780374029

In this paper, we propose an efficient approach based on a fast convolution algorithm to reduce the computational complexity of the Least Mean Square (LMS) adaptive algorithm for the quadratic filter, i.e. the quadratic part of the second order Volterra filter (SOVF). Previous work using fast convolution in adaptive LMS filtering is limited to the linear case. We show that this approach reduces the number of multiplications by close to 25%, at the expense of only 25% more additions. The steady-state performance of this algorithm is studied for Gaussian inputs in a stationary setting. The steady-state excess mean-square error is evaluated. The theoretical performance predictions are shown to be in good agreement with simulation results, especially for small step sizes.

Keywords: adaptive filters; least mean squares methods; convolution; Volterra series; Gaussian noise; computational complexity; quadratic filter; LMS linear algorithm; performance analysis; fast convolution algorithm; least mean square adaptive filter; second order Volterra filter; steady-state performance; Gaussian inputs; excess mean square error; SOVF
