This paper investigates the distributed convex optimization problem over a multi-agent system with Markovian switching communication networks. The objective function is the sum of each agent's local nonsmooth objective function, which cannot be known by other agents. The communication network is assumed to switch over a set of weight-balanced directed graphs according to a Markovian process. The authors propose a consensus sub-gradient algorithm with two-time-scale step-sizes to handle the Markovian switching topologies and the absence of global gradient information. With proper selection of step-sizes, the authors prove the almost sure convergence of all agents' local estimates to the same optimal solution when the union graph of the Markovian network's states is strongly connected and the Markovian chain is ergodic. A convergence rate analysis is also given for specific cases. Simulations are given to demonstrate the results.
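As a rough illustration of the kind of update this abstract describes, below is a minimal sketch of a two-time-scale consensus sub-gradient iteration. The setup (scalar decision variable, absolute-value local costs, two weight-balanced digraphs, a two-state Markov chain, and the specific step-size exponents) is an illustrative assumption, not taken from the paper; only the structure of the consensus-plus-sub-gradient step with separate step-sizes is the point.

```python
import numpy as np

# Sketch of a consensus sub-gradient method with two-time-scale step-sizes
# over Markovian switching, weight-balanced digraphs (toy setup, see lead-in).
rng = np.random.default_rng(0)
n = 4
c = np.array([1.0, 2.0, 3.0, 4.0])        # local targets; any median of c minimizes sum_i |x - c_i|
W = [np.array([[0.5, 0.5, 0.0, 0.0],      # weight matrix of graph state 0 (weight-balanced)
               [0.0, 0.5, 0.5, 0.0],
               [0.0, 0.0, 0.5, 0.5],
               [0.5, 0.0, 0.0, 0.5]]),
     np.array([[0.5, 0.0, 0.0, 0.5],      # weight matrix of graph state 1 (weight-balanced)
               [0.5, 0.5, 0.0, 0.0],
               [0.0, 0.5, 0.5, 0.0],
               [0.0, 0.0, 0.5, 0.5]])]
P = np.array([[0.7, 0.3], [0.4, 0.6]])    # transition matrix of the Markov chain over graph states

def subgrad(x):
    # sub-gradient of the nonsmooth local costs |x_i - c_i|, taken componentwise
    return np.sign(x - c)

x = rng.normal(size=n)                    # agents' local estimates
state = 0
for k in range(1, 20000):
    beta = 1.0 / k**0.6                   # consensus step-size (fast timescale)
    alpha = 1.0 / k                       # sub-gradient step-size (slow timescale), alpha/beta -> 0
    x = x + beta * (W[state] @ x - x) - alpha * subgrad(x)
    state = rng.choice(2, p=P[state])     # Markovian switch of the communication graph

print(x)  # all entries drift toward a common minimizer of sum_i |x - c_i|
```

The defining design choice is that the consensus step-size decays more slowly than the sub-gradient step-size, so agreement among agents happens on the fast timescale while optimization proceeds on the slow one.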
Greedy-GQ with linear function approximation, originally proposed in Maei et al. (in: Proceedings of the international conference on machine learning (ICML), 2010), is a value-based off-policy algorithm for optimal control in reinforcement learning, and it has a non-linear two-time-scale structure with a non-convex objective function. This paper develops its tightest finite-time error bounds. We show that the Greedy-GQ algorithm converges as fast as $\mathcal{O}(1/\sqrt{T})$ under the i.i.d. setting and $\mathcal{O}(\log T/\sqrt{T})$ under the Markovian setting. We further design a variant of the vanilla Greedy-GQ algorithm using the nested-loop approach, and show that its sample complexity is $\mathcal{O}(\log(1/\epsilon)\epsilon^{-2})$, which matches that of the vanilla Greedy-GQ. Our finite-time error bounds match those of the stochastic gradient descent algorithm for general smooth non-convex optimization problems, despite the additional challenge of the two-time-scale updates. Our finite-sample analysis provides theoretical guidance on choosing step-sizes for faster convergence in practice, and suggests a trade-off between the convergence rate and the quality of the obtained policy.
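The two-time-scale structure mentioned above can be made concrete with a short sketch of the Greedy-GQ updates: the main parameter theta is updated on the slow timescale and the auxiliary parameter w on the fast one. The toy MDP, random features, behavior policy, and step-size exponents below are illustrative assumptions, not the paper's experimental setting.

```python
import numpy as np

# Sketch of Greedy-GQ with linear function approximation on a randomly
# generated toy MDP (assumed setup, see lead-in).
rng = np.random.default_rng(1)
n_states, n_actions, d = 5, 2, 3
phi = rng.normal(size=(n_states, n_actions, d))                    # features phi(s, a)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # transition kernel P(s' | s, a)
R = rng.normal(size=(n_states, n_actions))                          # rewards r(s, a)
gamma = 0.9

theta = np.zeros(d)   # main parameter (slow timescale)
w = np.zeros(d)       # auxiliary parameter (fast timescale)
s = 0
for t in range(1, 50001):
    alpha = 0.5 / t**0.9                               # slow step-size for theta
    beta = 0.5 / t**0.6                                # fast step-size for w (alpha/beta -> 0)
    a = rng.integers(n_actions)                        # behavior policy: uniform (off-policy)
    s_next = rng.choice(n_states, p=P[s, a])
    a_greedy = int(np.argmax(phi[s_next] @ theta))     # greedy action at s_next under theta
    delta = R[s, a] + gamma * phi[s_next, a_greedy] @ theta - phi[s, a] @ theta
    theta += alpha * (delta * phi[s, a] - gamma * (phi[s, a] @ w) * phi[s_next, a_greedy])
    w += beta * (delta - phi[s, a] @ w) * phi[s, a]
    s = s_next

print(theta)
```

The finite-time analysis in the abstract concerns exactly how fast such coupled theta/w iterates approach a stationary point when the step-sizes are chosen with this timescale separation.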
The implementation of distributed network utility maximization (NUM) algorithms hinges heavily on information feedback through message passing among network elements. In practical systems, the feedback is often obtained using error-prone measurement mechanisms and suffers from random errors. In this paper, we investigate the impact of noisy feedback on distributed NUM. We first study the distributed NUM algorithms based on the Lagrangian dual method, and focus on the primal-dual (P-D) algorithm, which is a single time-scale algorithm in the sense that the primal and dual parameters are updated simultaneously. Assuming strong duality, we study both cases where the stochastic gradients are unbiased or biased, and develop a general theory on the stochastic stability of the P-D algorithms in the presence of noisy feedback. When the gradient estimators are unbiased, we establish, via a combination of tools from martingale theory and convex analysis, that the iterates generated by distributed P-D algorithms converge with probability one to the optimal point, under standard technical conditions. In contrast, when the gradient estimators are biased, we show that the iterates converge to a contraction region around the optimal point, provided that the bias terms are asymptotically bounded by a scaled version of the true gradients. We also investigate the rate of convergence for the unbiased case, and find that, in general, the limit process of the interpolated process corresponding to the normalized iterate sequence is a stationary reflected linear diffusion process, not necessarily a Gaussian diffusion process. We apply the above general theory to investigate the stability of cross-layer rate control for joint congestion control and random access. Next, we study the impact of noisy feedback on distributed two-time-scale NUM algorithms based on primal decomposition. We establish, via the mean ODE method, the convergence of the stochastic two-time-scale algorithm under mild conditions.
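For the single time-scale P-D part of this abstract, the following is a minimal sketch of a primal-dual NUM iteration with noisy but unbiased gradient feedback. The log utilities, the tiny two-source/two-link topology, the additive zero-mean noise, and the step-size schedule are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Sketch of a single time-scale primal-dual NUM iteration with noisy,
# unbiased gradient feedback (assumed toy topology, see lead-in).
# Problem: maximize sum_s log(x_s) subject to R x <= c.
rng = np.random.default_rng(2)
R = np.array([[1.0, 1.0],          # routing matrix: rows are links, columns are sources
              [0.0, 1.0]])
c = np.array([1.0, 0.6])           # link capacities
x = np.full(2, 0.1)                # source rates (primal variables)
lam = np.zeros(2)                  # link prices (dual variables)

for k in range(1, 100001):
    step = 1.0 / k**0.7                                  # diminishing step-size
    noise_x = 0.05 * rng.normal(size=2)                  # noisy price feedback to sources
    noise_l = 0.05 * rng.normal(size=2)                  # noisy load measurement at links
    grad_x = 1.0 / x - R.T @ lam + noise_x               # d/dx [sum log(x_s) - lam^T (R x - c)]
    load = R @ x - c + noise_l                           # measured constraint violation
    x = np.clip(x + step * grad_x, 1e-3, 2.0)            # primal ascent, projected onto a box
    lam = np.maximum(lam + step * load, 0.0)             # dual ascent, projected onto lam >= 0

print(x, lam)  # iterates settle near the optimal rates and prices
```

Both variables are updated from the same previous iterate, which is what the abstract means by the primal and dual parameters being updated simultaneously on a single time scale; the noise terms play the role of the error-prone measurement feedback.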