This paper aims to present improvable and computable approximation approaches for solving the two-stage robust optimization problem, which arises in various applications such as optimal energy management and production planning. Based on sampling a finite number of scenarios of the uncertainty, we obtain a lower-bound approximation and show that the corresponding solution is at least ε-level feasible. Moreover, piecewise linear decision rules (PLDRs) are introduced to improve the upper bound obtained by the widely used linear decision rule. Furthermore, we show that both the lower-bound and upper-bound approximation problems can be reformulated into solvable saddle point problems and consequently solved by the mirror descent method.
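To make the saddle point reformulation concrete, here is a minimal sketch (our own toy instance, not the paper's formulation) of mirror descent-ascent on a bilinear saddle point problem min_x max_y x^T A y over two probability simplices, using entropic mirror steps:

```python
# Illustrative sketch only: entropic mirror descent-ascent for a bilinear
# saddle point problem over two simplices, the kind of reformulation the
# abstract says the bound problems admit. A, sizes, and schedules are assumed.
import numpy as np

def mirror_descent_saddle(A, n_iter=2000, step=0.1):
    m, n = A.shape
    x = np.full(m, 1.0 / m)          # primal iterate on the simplex
    y = np.full(n, 1.0 / n)          # dual iterate on the simplex
    x_avg, y_avg = np.zeros(m), np.zeros(n)
    for _ in range(n_iter):
        gx = A @ y                   # partial gradient w.r.t. x
        gy = A.T @ x                 # partial gradient w.r.t. y
        # Entropic mirror steps: multiplicative updates, renormalized.
        x = x * np.exp(-step * gx); x /= x.sum()   # descent in x
        y = y * np.exp(step * gy);  y /= y.sum()   # ascent in y
        x_avg += x; y_avg += y
    return x_avg / n_iter, y_avg / n_iter          # averaged iterates

x_bar, y_bar = mirror_descent_saddle(np.random.default_rng(0).standard_normal((5, 4)))
```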
This paper uses the mirror descent algorithm with periodic dynamic quantization to solve constrained distributed optimization problems over limited communication channels. Because the network environment is imperfect, obtaining accurate information is impractical, and thus a communication scheme under quantization needs to be considered. A periodic dynamic quantizer with finitely many quantization levels is proposed in this paper to achieve exact optimization. Moreover, a time-varying control parameter in the mirror descent algorithm is designed to control the quantization error. A comprehensive analysis shows that the proposed algorithm converges to an optimal value, with convergence rate O(1/T^0.25).
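As a rough illustration of the two ingredients named above, the sketch below implements a finite-level uniform quantizer whose cell width shrinks over time and a time-varying step size that damps the quantization error; the function names and schedules are our assumptions, not the paper's exact design:

```python
import numpy as np

def dynamic_quantize(x, center, delta, levels=16):
    """Quantize x onto `levels` uniform cells of width `delta` around `center`."""
    idx = np.clip(np.round((x - center) / delta), -(levels // 2), levels // 2)
    return center + idx * delta

# One agent's update using a neighbor's quantized state (Euclidean mirror map);
# the neighbor is held fixed here purely to keep the toy self-contained.
x, x_nb = np.zeros(3), np.ones(3)
for t in range(1, 1001):
    delta = 1.0 / t                      # cell width rescaled as time advances
    q_nb = dynamic_quantize(x_nb, center=x, delta=delta)
    step = 1.0 / t ** 0.75               # time-varying control parameter (assumed)
    x = 0.5 * (x + q_nb) - step * 2 * x  # consensus + gradient step on ||x||^2
```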
In this paper we consider optimization problems where the objective function is given in the form of an expectation. A basic difficulty of solving such stochastic optimization problems is that the involved multidimensional integrals (expectations) cannot be computed with high accuracy. The aim of this paper is to compare two computational approaches based on Monte Carlo sampling techniques, namely, the stochastic approximation (SA) and the sample average approximation (SAA) methods. Both approaches have a long history. Current opinion is that the SAA method can efficiently exploit a specific (say, linear) structure of the considered problem, while the SA approach is a crude subgradient method that often performs poorly in practice. We intend to demonstrate that a properly modified SA approach can be competitive with, and even significantly outperform, the SAA method for a certain class of convex stochastic problems. We extend the analysis to the case of convex-concave stochastic saddle point problems and present (in our opinion highly encouraging) results of numerical experiments.
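The contrast between the two methods can be sketched on a toy problem min_x E[(x − ξ)^2] with ξ ~ N(μ, 1), whose solution is x* = μ; the constant ~1/√N step with iterate averaging follows the robust-SA policy discussed in this line of work, while everything else is our illustrative scaffolding:

```python
import numpy as np

rng = np.random.default_rng(1)
mu = 2.0

# Robust SA: one stochastic subgradient per sample, plus iterate averaging.
x, x_bar, N = 0.0, 0.0, 5000
for t in range(1, N + 1):
    xi = rng.normal(mu)
    g = 2.0 * (x - xi)            # stochastic gradient of (x - xi)^2
    x -= g / np.sqrt(N)           # constant step ~ 1/sqrt(N)
    x_bar += (x - x_bar) / t      # running average of iterates

# SAA: draw the whole sample first, then minimize the deterministic average.
sample = rng.normal(mu, size=N)
x_saa = sample.mean()             # closed-form minimizer of the sample average

print(x_bar, x_saa)               # both should be close to mu = 2.0
```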
This paper is concerned with the constrained distributed multi-agent convex optimization problem over a time-varying network. We assume that the bit rate of the considered communication is limited, such that a uniform quantizer is applied in the process of exchanging information over the multi-agent network. Then a quantizer-based distributed mirror-descent (QDMD) algorithm, which utilizes the Bregman divergence as the distance-measuring function, is developed for such optimization problems. The convergence result of the developed algorithm is also provided. By choosing the iteration step-size ■ and the quantization interval v_t = λ/t with a prescribed parameter λ, it is shown that the convergence rate of the QDMD algorithm can achieve ■, where T is the number of iterations.
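As a hedged illustration of the Bregman divergence acting as the distance-measuring function, the sketch below combines a uniform quantizer with an entropic mirror step on the simplex, where the Bregman projection reduces to a multiplicative update; all names and constants are ours, not the QDMD specification:

```python
import numpy as np

def uniform_quantize(x, v):
    """Uniform quantizer with cell width v (the quantization interval)."""
    return v * np.round(x / v)

def entropic_mirror_step(x, grad, step):
    """argmin_z <grad, z> + D_h(z, x) / step with h = negative entropy."""
    z = x * np.exp(-step * grad)   # multiplicative update from the Bregman step
    return z / z.sum()             # renormalize back onto the simplex

x = np.array([0.25, 0.25, 0.5])
neighbor = uniform_quantize(np.array([0.4, 0.3, 0.3]), v=0.05)
mix = 0.5 * x + 0.5 * neighbor     # consensus with the quantized neighbor state
x = entropic_mirror_step(mix, grad=np.array([1.0, -0.5, 0.2]), step=0.1)
```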
ISBN (print): 9781510601123
Stochastic optimization is a fundamental problem that finds applications in many areas, including the biological and cognitive sciences. The classical stochastic approximation algorithm for iterative stochastic optimization requires gradient information about the sample objective function, which is typically difficult to obtain in practice. Recently there has been renewed interest in derivative-free approaches to stochastic optimization. In this paper, we examine the rates of convergence of the Kiefer-Wolfowitz algorithm and the mirror descent algorithm under various updating schemes that use finite differences as gradient approximations. The analysis is carried out under a general framework covering a wide range of updating scenarios. It is shown that the convergence of these algorithms can be accelerated by controlling the implementation of the finite differences.
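A minimal sketch of the derivative-free idea, assuming Kiefer-Wolfowitz-style schedules a_t = 1/t and c_t = 1/t^(1/4): noisy function evaluations replace the unavailable gradient via central finite differences, and the decay of the difference width c_t controls the bias:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_f(x):
    """Toy noisy objective; only noisy evaluations are available."""
    return np.sum((x - 1.0) ** 2) + rng.normal(scale=0.01)

def fd_gradient(f, x, c):
    """Central finite-difference estimate of grad f at x with width c."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = c
        g[i] = (f(x + e) - f(x - e)) / (2 * c)
    return g

x = np.zeros(3)
for t in range(1, 500):
    a_t = 1.0 / t            # step-size schedule
    c_t = 1.0 / t ** 0.25    # difference width; its decay controls the bias
    x -= a_t * fd_gradient(noisy_f, x, c_t)
# x should approach (1, 1, 1), the minimizer of the noiseless objective.
```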
ISBN (print): 9783031222153; 9783031222160
In reinforcement learning, an agent in an environment improves its skill based on a reward, the feedback from the environment. In practice, reinforcement learning faces several important challenges. First, reinforcement learning algorithms often rely on assumptions about the environment, such as Markov decision processes; however, real-world environments often cannot be represented under these assumptions. In particular, we focus on environments with non-Markovian rewards, which allow the reward to depend on past experiences. To handle non-Markovian rewards, researchers have used a reward machine, which decomposes the original task into sub-tasks. In those works, the sub-tasks are usually assumed to be representable by a Markov decision process. Second, safety is another challenge in reinforcement learning. G-CoMDS is a safe reinforcement learning algorithm based on the CoMirror algorithm, an algorithm for constrained optimization problems. We developed the G-CoMDS algorithm to learn safely in environments that are not Markov decision processes. A promising approach in complex situations would therefore be to decompose the original task as a reward machine does and then solve the sub-tasks with G-CoMDS. In this paper, we provide additional experimental results and discussion of G-CoMDS as a preliminary step toward combining G-CoMDS with a reward machine. We evaluate G-CoMDS and an existing reinforcement learning algorithm in a mobile robot simulation with a kind of non-Markovian reward. The experimental results show that G-CoMDS suppresses cost spikes and slightly exceeds the performance of the existing safe reinforcement learning algorithm.
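For context, the CoMirror-style switching rule that such constrained algorithms build on can be sketched as follows (a Euclidean toy instance under our own assumptions, not the G-CoMDS implementation): take a step along the objective's subgradient while the constraint is approximately satisfied, and along the constraint's subgradient otherwise:

```python
import numpy as np

def comirror_step(x, f_grad, g_val, g_grad, step, tol=1e-3):
    """One switching (Euclidean) mirror step for min f(x) s.t. g(x) <= 0."""
    direction = f_grad if g_val <= tol else g_grad   # switch on feasibility
    return x - step * direction

# Toy problem: minimize f(x) = ||x||^2 subject to g(x) = 1 - x[0] <= 0.
x = np.array([3.0, 3.0])
for t in range(1, 2000):
    g_val = 1.0 - x[0]
    x = comirror_step(x, f_grad=2 * x, g_val=g_val,
                      g_grad=np.array([-1.0, 0.0]), step=1.0 / np.sqrt(t))
# x should approach (1, 0), the constrained minimizer.
```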
This paper is concerned with an online distributed convex-constrained optimization problem over a multi-agent network, where the limited network bandwidth and the potential feedback delay caused by network communication are considered. To cope with the limited network bandwidth, an event-triggered communication scheme is introduced for information exchange. Then, based on the delayed (i.e., single-point and two-point) bandit feedback, two event-triggered distributed online convex optimization algorithms are developed that utilize the Bregman divergence in the projection step. The convergence of the two algorithms is analyzed through the static regret bounds they achieve. The results show that a sublinear static regret with respect to the time horizon T can be ensured if the triggering threshold gradually approaches zero; in this case, the order of the regret bounds is determined by choosing suitable triggering thresholds. Finally, a distributed online regularized linear regression problem is provided as an example to illustrate the effectiveness of the two proposed algorithms.
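Two of the ingredients named above admit a short sketch: an event-triggered broadcast that communicates only when the local state drifts past a (vanishing) threshold, and a two-point bandit gradient estimate built from function values; all names, the toy loss, and the schedules are our illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def should_broadcast(x, last_sent, threshold):
    """Trigger communication only when the deviation exceeds the threshold."""
    return np.linalg.norm(x - last_sent) > threshold

def two_point_bandit_grad(f, x, delta):
    """Gradient estimate from two function queries along a random direction."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    return len(x) * (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u

f = lambda x: np.sum((x - 1.0) ** 2)      # toy local loss
x = np.zeros(4); last_sent = x.copy()
for t in range(1, 500):
    g = two_point_bandit_grad(f, x, delta=1.0 / t)
    x = x - g / np.sqrt(t)                # Bregman step with Euclidean divergence
    if should_broadcast(x, last_sent, threshold=1.0 / t):  # vanishing threshold
        last_sent = x.copy()              # stand-in for broadcasting to neighbors
```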