IEEE Transactions on Artificial Intelligence

Variance-Guided Structured Sparsity in Deep Neural Networks

Authors: Pandit, Mohammad Khalid; Banday, Mahroosh

Affiliations: Department of Computer Science and Engineering, Indian Institute of Technology Delhi, New Delhi 110016, India; Samsung Innovation Lab, Bharti School of Telecommunication Technology and Management, Indian Institute of Technology Delhi, New Delhi 110016, India

Publication: IEEE Transactions on Artificial Intelligence (IEEE Trans. Artif. Intell.)

Year/Volume/Issue: 2023, Vol. 4, No. 6

Pages: 1714-1723

Core Indexing:

Subject: Deep neural networks

Abstract: The success of deep neural networks, especially convolutional neural networks, in various applications has been made possible largely by the presence of an enormous number of learnable parameters. These parameters increase the learning capacity of the model, but at the same time they also significantly increase the computational and memory costs. This severely hinders the scalability of these models to resource-limited environments, such as IoT devices. The majority of the network weights are known to be redundant and can be removed from the network. This article introduces a regularization scheme that combines structured sparsity regularization with variance regularization. It simultaneously helps to obtain computationally sparse models by driving the majority of parameter groups to zero and increasing the variance of nonzero groups to compensate for the accuracy drop. We use sparse group lasso, the group-sparsity variant of ℓ1 (lasso) regularization, which removes redundant connections and unnecessary neurons from the network. For variance regularization, the KL divergence between the current parameter distribution and the target distribution is minimized, which aims to concentrate weights toward zero while keeping a high variance among nonzero weights (a skewed distribution). To check the effectiveness of the proposed regularizer, experiments are performed on various benchmark datasets, and it is observed that variance regularization helps to reduce the accuracy drop caused by sparsity regularization. On MNIST, the trainable parameters are reduced from 331 984 (baseline model) to 57 327 while obtaining better accuracy than the baseline (99.6%). Also, on Fashion-MNIST, CIFAR-10, and ImageNet, the proposed scheme achieves state-of-the-art sparsity with almost no drop in accuracy. © 2020 IEEE.
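To make the sparsity part of the scheme concrete, the following is a minimal sketch of a sparse group lasso penalty for a PyTorch model. It is an assumption-based illustration, not the authors' implementation: it assumes each output neuron or convolutional filter forms one parameter group, uses hypothetical hyperparameter names lambda_l1 and lambda_group, and omits the KL-based variance term.

# Illustrative sketch only; not the authors' released code.
import torch
import torch.nn as nn

def sparse_group_lasso(model: nn.Module, lambda_l1: float = 1e-4,
                       lambda_group: float = 1e-3):
    """Combined l1 (lasso) + group-lasso penalty over Linear/Conv2d weights."""
    penalty = 0.0
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            w = module.weight
            # Element-wise l1 term: pushes individual weights toward zero.
            penalty = penalty + lambda_l1 * w.abs().sum()
            # Group term: l2 norm of each output unit's incoming weights,
            # so whole neurons/filters can be zeroed out together.
            groups = w.view(w.size(0), -1)
            penalty = penalty + lambda_group * groups.norm(p=2, dim=1).sum()
    return penalty

# Usage: add the penalty to the task loss at each training step, e.g.
#   loss = criterion(outputs, targets) + sparse_group_lasso(model)

In this sketch the group term encourages entire rows (neurons or filters) to become zero, which is what yields structured, hardware-friendly sparsity, while the element-wise term removes individual redundant connections within surviving groups.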
