In numerous tasks, deep networks are state of the art. However, they are still not well understood from a statistical point of view. In this article, we try to contribute to filling this gap, and we consider regression models involving deep multilayer perceptrons (MLPs) with rectified linear unit (ReLU) activation functions. Studying the statistical properties of such models is difficult, mainly because in practice they may be heavily overparameterized. For the sake of simplicity, we focus here on the sum of squared errors (SSE) cost function, which is the standard cost function for regression purposes. In this framework, we study the asymptotic behavior of the difference between the SSE of estimated models and the SSE of the theoretical best model. This behavior gives us information on the overfitting properties of such models. We use new methodology introduced to deal with models with a loss of identifiability, i.e., the case where the true parameter cannot be identified uniquely. Hence, we do not have to assume that a unique parameter vector realizes the best regression function, an assumption that seems too strong for heavily overparameterized models. Our results shed new light on the overfitting behavior of MLP models. (C) 2019 Elsevier B.V. All rights reserved.
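To make the setting concrete, here is a minimal sketch, not the paper's estimator or proof framework, of the kind of model described: a one-hidden-layer ReLU multilayer perceptron fit by gradient descent on the SSE cost. The layer width, learning rate and synthetic data are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = f(x) + noise (illustrative choice of f).
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X) + 0.1 * rng.normal(size=(200, 1))

hidden = 50                                   # deliberately overparameterized
W1 = rng.normal(scale=0.5, size=(1, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.5, size=(hidden, 1))
b2 = np.zeros(1)

lr = 1e-2
for step in range(5000):
    # Forward pass with ReLU activations.
    pre = X @ W1 + b1
    h = np.maximum(pre, 0.0)
    pred = h @ W2 + b2
    residual = pred - y                       # SSE = sum(residual ** 2)

    # Backward pass: gradients of the SSE cost, averaged over the sample.
    grad_pred = 2.0 * residual / len(X)
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ W2.T
    grad_pre = grad_h * (pre > 0.0)           # ReLU derivative
    grad_W1 = X.T @ grad_pre
    grad_b1 = grad_pre.sum(axis=0)

    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

print("final SSE:", float((residual ** 2).sum()))

In the paper's framework, the quantity of interest would be the gap between the SSE of such a fitted model and the SSE of the theoretical best model, studied asymptotically.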
The authors present a novel gradient descent algorithm for deep learning, called RAPIDO. It adapts over time and performs optimisation using current, past and future information, similarly to a PID controller. The proposed method is suited to optimising deep neural networks built from activation functions such as the sigmoid, hyperbolic tangent and ReLU, because it can adapt appropriately to sudden changes in gradients. The authors study their method experimentally and report comparisons with other methods on a quadratic objective function and the MNIST classification task, where the proposed method outperforms the other methods.
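The abstract does not give RAPIDO's actual update rule, so the following is only an illustrative sketch of the PID analogy it draws: the proportional term uses the current gradient, the integral term accumulates past gradients, and the derivative term uses the change in the gradient as a rough proxy for anticipating future behaviour. The gains kp, ki, kd, the step size and the quadratic test problem are all hypothetical choices.

import numpy as np

def pid_step(theta, grad, state, lr=0.05, kp=1.0, ki=0.01, kd=0.1):
    """One PID-style update; `state` carries (integral, previous gradient)."""
    integral, prev_grad = state
    integral = integral + grad            # past information
    derivative = grad - prev_grad         # gradient trend, a crude look-ahead
    update = kp * grad + ki * integral + kd * derivative
    return theta - lr * update, (integral, grad)

# Minimise an ill-conditioned quadratic f(theta) = 0.5 * theta^T A theta,
# in the spirit of the quadratic test objective mentioned above.
A = np.diag([1.0, 10.0])
theta = np.array([1.0, 1.0])
state = (np.zeros_like(theta), np.zeros_like(theta))
for _ in range(2000):
    theta, state = pid_step(theta, A @ theta, state)
print("theta:", theta, "f(theta):", 0.5 * theta @ A @ theta)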
ISBN (Print): 9783031209734; 9783031209741
Comparisons, or inequality tests, are an essential building block of rectified linear unit (ReLU) functions, which are ever more present in machine learning, specifically in neural networks. Motivated by the increasing interest in privacy-preserving artificial intelligence, we explore the current state of the art of privacy-preserving comparisons over multiparty computation (MPC). We then introduce constant-round variations and combinations that are compatible with customary fixed-point arithmetic over MPC. Our main focus is implementation and benchmarking; hence, we showcase our contributions via an open-source library compatible with current MPC software tools. Furthermore, we include a comprehensive comparative analysis of various adversarial settings. Our results improve running times in practical scenarios. Finally, we offer conclusions about the viability of these protocols when adopted for privacy-preserving machine learning.
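As a cleartext illustration only, not one of the paper's MPC protocols, the sketch below shows why ReLU reduces to a single inequality test when values use the fixed-point encoding customary in MPC frameworks; in an actual protocol the value would be secret-shared and the comparison bit produced by a secure comparison sub-protocol. The ring size and fractional precision are illustrative assumptions.

FRACTIONAL_BITS = 16          # fixed-point precision (illustrative)
MODULUS = 2 ** 64             # ring in which shares would live (illustrative)

def encode(value: float) -> int:
    """Encode a real number as a fixed-point ring element."""
    return int(round(value * (1 << FRACTIONAL_BITS))) % MODULUS

def decode(element: int) -> float:
    """Decode, interpreting the upper half of the ring as negative numbers."""
    if element >= MODULUS // 2:
        element -= MODULUS
    return element / (1 << FRACTIONAL_BITS)

def relu_via_comparison(x_enc: int) -> int:
    # The only non-linear step: an inequality test against zero. This is the
    # operation that a secure comparison protocol would compute over shares.
    ge_zero = 1 if x_enc < MODULUS // 2 else 0
    # Multiplying by the comparison bit yields max(x, 0).
    return (ge_zero * x_enc) % MODULUS

for v in (-2.5, 0.0, 3.25):
    print(v, "->", decode(relu_via_comparison(encode(v))))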