Entropic gradient descent algorithms and wide flat minima*

Authors: Pittorino, Fabrizio; Lucibello, Carlo; Feinauer, Christoph; Perugini, Gabriele; Baldassi, Carlo; Demyanenko, Elizaveta; Zecchina, Riccardo

Affiliations: Bocconi University, Institute for Data Science and Analytics, AI Lab, I-20136 Milan, Italy; Politecnico di Torino, Department of Applied Science and Technology, I-10129 Turin, Italy

Published in: Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech. Theory Exp.)

Year/Volume/Issue: 2021, Volume 2021, Issue 12

Subject classification: 07 [Science]; 070201 [Science - Theoretical Physics]; 0702 [Science - Physics]; 0801 [Engineering - Mechanics (degree may be conferred in engineering or science)]

Funding: European Research Council (ERC)

Keywords: deep learning; machine learning; message-passing algorithms

Abstract: The properties of flat minima in the empirical risk landscape of neural networks have been debated for some time. Increasing evidence suggests they possess better generalization capabilities than sharp minima. In this work we first discuss the relationship between two alternative measures of flatness: the local entropy, which is useful for analysis and algorithm development, and the local energy, which is easier to compute and was shown empirically, in extensive tests on state-of-the-art networks, to be the best predictor of generalization capabilities. We show semi-analytically, in simple controlled scenarios, that these two measures correlate strongly with each other and with generalization. We then extend the analysis to the deep learning setting through extensive numerical validation. We study two algorithms, entropy-stochastic gradient descent and replicated-stochastic gradient descent, that explicitly include the local entropy in the optimization objective. We devise a training schedule by which we consistently find flatter minima (under both flatness measures) and improve the generalization error of common architectures (e.g. ResNet, EfficientNet).
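
Note: the algorithms named in the abstract augment the loss with a local-entropy term, which in practice is typically optimized with a nested loop: an inner stochastic (Langevin-style) exploration of the neighborhood of the current weights, and an outer update that moves the weights toward the average of the explored points. The sketch below is a minimal, hypothetical NumPy rendering of one such Entropy-SGD-style step under assumed hyperparameters (eta, gamma, inner_steps, etc.); it is not the authors' implementation.

import numpy as np

def entropy_sgd_step(w, grad_loss, eta=0.5, gamma=1.0, inner_steps=10,
                     inner_eta=0.1, noise=1e-3, alpha=0.75, rng=None):
    # Hypothetical Entropy-SGD-style update (illustrative sketch only).
    # Inner loop: Langevin exploration of the coupled objective
    #   L(w') + (gamma / 2) * ||w' - w||^2,
    # whose samples' running mean mu indicates where the wide, low-loss
    # region around w lies.
    rng = rng if rng is not None else np.random.default_rng(0)
    w_prime = w.copy()
    mu = w.copy()
    for _ in range(inner_steps):
        g = grad_loss(w_prime) + gamma * (w_prime - w)
        w_prime = w_prime - inner_eta * g + noise * rng.standard_normal(w.shape)
        mu = alpha * mu + (1.0 - alpha) * w_prime  # exponential running average
    # Outer step: move w toward mu, i.e. descend the negative local entropy.
    return w - eta * gamma * (w - mu)

# Toy usage with a quadratic loss L(w) = 0.5 * ||w||^2 (so grad_loss(w) = w).
w = np.ones(3)
for _ in range(200):
    w = entropy_sgd_step(w, grad_loss=lambda v: v)
print(w)  # ends up close to the minimum at the origin (up to the noise floor)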
