In this paper, an approximate dynamic programming (ADP) based strategy is applied to the dual adaptive control problem. The ADP strategy provides a computationally amenable way to build a significantly improved policy by solving the dynamic programming recursion on only those points of the hyper-state space sampled during closed-loop Monte Carlo simulations performed under known suboptimal control policies. The potential of the ADP approach for generating a significantly improved policy is illustrated on an ARX process with unknown/varying parameters. (C) 2009 Elsevier Ltd. All rights reserved.
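A minimal sketch of the sampled-state idea, assuming a toy scalar stochastic process, a hypothetical suboptimal feedback law, and a quadratic stage cost in place of the paper's ARX example: states visited in closed-loop Monte Carlo simulation are collected, and value iteration is then run only on those points, with successor states mapped to their nearest sampled neighbor.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(x, u):
    # hypothetical scalar stochastic process (stand-in for the ARX example)
    return 0.8 * x + u + 0.1 * rng.standard_normal()

def suboptimal_policy(x):
    return -0.5 * x          # a known, suboptimal feedback law

def stage_cost(x, u):
    return x ** 2 + 0.1 * u ** 2

# 1) Sample the (hyper-)state space with closed-loop Monte Carlo simulations
#    performed under the known suboptimal policy.
states = []
for _ in range(20):                      # 20 trajectories
    x = rng.uniform(-2.0, 2.0)
    for _ in range(20):                  # 20 steps each
        states.append(x)
        x = step(x, suboptimal_policy(x))
states = np.array(states)

# 2) Run value iteration only on the sampled points; each successor state is
#    mapped to its nearest sampled neighbor (a single noise draw stands in
#    for the expectation, which is enough for a sketch).
actions = np.linspace(-1.0, 1.0, 11)
gamma, V = 0.95, np.zeros(len(states))
for _ in range(30):
    V = np.array([min(stage_cost(x, u)
                      + gamma * V[np.argmin(np.abs(states - step(x, u)))]
                      for u in actions)
                  for x in states])
```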
This paper examines approximate dynamic programming algorithms for the single-vehicle routing problem with stochastic demands from a dynamic or reoptimization perspective. The methods extend the rollout algorithm by implementing different base sequences (i.e., a priori solutions), look-ahead policies, and pruning schemes. The paper also considers computing the cost-to-go with Monte Carlo simulation in addition to direct approaches. The best new method found is a two-step look-ahead rollout started with a stochastic base sequence; its routing cost is about 4.8% less than that of the one-step rollout algorithm started with a deterministic sequence. Results also show that Monte Carlo cost-to-go estimation reduces computation time by 65% in large instances with little or no loss in solution quality. Moreover, the paper compares results to the perfect-information case obtained by solving exact a posteriori solutions for sampled vehicle routing problems; the confidence interval for the overall mean difference is (3.56%, 4.11%). (C) 2008 Elsevier B.V. All rights reserved.
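A minimal sketch of a one-step rollout with Monte Carlo cost-to-go estimation on a toy single-vehicle problem with stochastic demands. The distances, the demand distribution, and the sorted-order base sequence are illustrative assumptions, not the instances or base sequences studied in the paper.

```python
import random
random.seed(0)

N, CAP = 5, 10                              # customers 1..N, vehicle capacity
dist = [[abs(i - j) for j in range(N + 1)]  # toy metric; depot is node 0
        for i in range(N + 1)]

def simulate(route, pos, load):
    """Follow a fixed route; detour to the depot on a stockout."""
    cost = 0.0
    for c in route:
        demand = random.randint(1, 4)       # demand revealed only on arrival
        cost += dist[pos][c]
        if demand > load:                   # restocking detour to the depot
            cost += 2 * dist[c][0]
            load = CAP
        load -= demand
        pos = c
    return cost + dist[pos][0]              # return to the depot

def rollout_step(pos, load, unvisited, samples=200):
    """One-step rollout: pick the next customer by Monte Carlo cost-to-go."""
    best, best_cost = None, float("inf")
    for c in sorted(unvisited):
        base = [c] + sorted(unvisited - {c})   # base sequence: a priori order
        est = sum(simulate(base, pos, load) for _ in range(samples)) / samples
        if est < best_cost:
            best, best_cost = c, est
    return best

# Drive the vehicle with the rollout policy, realizing demands along the way.
pos, load, unvisited = 0, CAP, set(range(1, N + 1))
while unvisited:
    nxt = rollout_step(pos, load, unvisited)
    demand = random.randint(1, 4)
    if demand > load:
        load = CAP
    load -= demand
    pos = nxt
    unvisited.remove(nxt)
```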
We consider a multistage asset acquisition problem where assets are purchased now, at a price that varies randomly over time, to be used to satisfy a random demand at a particular point in time in the future. We provide a rare proof of convergence for an approximate dynamic programming algorithm using pure exploitation, where the states we visit depend on the decisions produced by solving the approximate problem. The resulting algorithm does not require knowing the probability distributions of prices or demands, nor does it require any assumptions about their functional form. The algorithm and its proof rely on the fact that the true value function is a family of piecewise linear concave functions.
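A minimal sketch in the spirit of the piecewise linear concave structure: the value of holding assets is represented by per-unit marginal values (slopes), each sampled marginal observation is smoothed in, and concavity is restored by a simple leveling projection. The asset limit, stepsize, and sample updates below are illustrative assumptions, not the paper's algorithm or proof.

```python
import numpy as np

MAX_ASSETS = 20
slopes = np.zeros(MAX_ASSETS)    # slopes[r] ~ marginal value of the (r+1)-th asset

def update_slope(r, observed_marginal, stepsize=0.1):
    """Smooth a sampled marginal value into slope r, then restore concavity."""
    slopes[r] = (1 - stepsize) * slopes[r] + stepsize * observed_marginal
    # Concavity means the slopes are non-increasing in r; level any violators
    # against the freshly updated slope (a simple projection step).
    slopes[:r] = np.maximum(slopes[:r], slopes[r])
    slopes[r + 1:] = np.minimum(slopes[r + 1:], slopes[r])

def value(r):
    """Concave piecewise linear value of holding r assets."""
    return float(np.sum(slopes[:r]))

# Pure exploitation: the asset level sampled next comes from maximizing the
# current approximation, so the states visited depend on the decisions made.
update_slope(5, observed_marginal=2.0)
update_slope(3, observed_marginal=3.5)
print([value(r) for r in (3, 5, 10)])
```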
In this paper, a novel iterative adaptive dynamic programming (ADP)-based infinite-horizon self-learning optimal control algorithm, called the generalized policy iteration algorithm, is developed for nonaffine discrete-time (DT) nonlinear systems. Generalized policy iteration is a general scheme that interleaves the policy iteration and value iteration algorithms of ADP. The developed algorithm permits an arbitrary positive semidefinite function to initialize it, and it uses two iteration indices, one for policy improvement and one for policy evaluation. This is the first time that the convergence, admissibility, and optimality properties of the generalized policy iteration algorithm for DT nonlinear systems have been analyzed. Neural networks are used to implement the developed algorithm. Finally, numerical examples are presented to illustrate its performance.
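A minimal sketch of the two-index structure on a small random finite MDP: the outer index improves the policy, while the inner index performs a fixed number of evaluation sweeps (one sweep behaves like value iteration, many sweeps like full policy iteration). The random MDP is an illustrative assumption, not the paper's neural-network implementation for nonaffine nonlinear systems.

```python
import numpy as np

rng = np.random.default_rng(1)
S, A, gamma = 6, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] = next-state distribution
R = rng.uniform(0, 1, size=(S, A))           # stage reward

V = np.zeros(S)                              # arbitrary initialization
for i in range(50):                          # outer index: policy improvement
    Q = R + gamma * P @ V                    # Q[s, a] under the current V
    policy = Q.argmax(axis=1)
    for j in range(5):                       # inner index: partial evaluation
        V = np.array([R[s, policy[s]] + gamma * P[s, policy[s]] @ V
                      for s in range(S)])
```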
We model a multiperiod, single-resource capacity reservation problem as a dynamic, stochastic, multiple knapsack problem and formulate it with stochastic dynamic programming. As the state space grows exponentially in the number of knapsacks and the decision set grows exponentially in the number of order arrivals per period, the recursion is computationally intractable for large-scale problems, including those with long horizons. Our goal is to ensure optimal, or near-optimal, decisions at time zero when maximizing the net present value of returns from accepted orders, but solving problems with short horizons introduces end-of-study effects that may prohibit finding good solutions at time zero. Thus, we propose an approximation approach which utilizes simulation and deterministic dynamic programming in order to allow for the solution of longer-horizon problems and ensure good time-zero decisions. Our computational results illustrate the effectiveness of the approximation scheme.
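A minimal sketch of the deterministic-DP flavor of such an approximation: stochastic future arrivals are replaced by expected (size, revenue) pairs, and a deterministic recursion over remaining capacity and time supplies discounted continuation values. The horizon, capacity, discount factor, and at-most-one-acceptance-per-period simplification are all illustrative assumptions.

```python
from functools import lru_cache

T, CAP = 12, 8                        # horizon (periods) and knapsack capacity
disc = 0.99                           # per-period discount factor (for NPV)
exp_orders = [(2, 5.0), (3, 6.5)]     # expected (size, revenue) arrivals

@lru_cache(maxsize=None)
def value(t, cap):
    """Deterministic DP: best discounted return from period t with cap left."""
    if t == T:
        return 0.0
    best = value(t + 1, cap)          # accept nothing this period
    for size, revenue in exp_orders:  # or accept one expected order
        if size <= cap:
            best = max(best, revenue + disc * value(t + 1, cap - size))
    return best

print(value(0, CAP))                  # time-zero value under the approximation
```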
ISBN (print): 9780769537528
Multi-stage decision problems under uncertainty are abundant in the process industries. The Markov decision process (MDP) is a general mathematical formulation of such problems. Whereas stochastic programming and dynamic programming are the standard methods to solve MDPs, their unwieldy computational requirements limit their usefulness in real applications. Approximate dynamic programming (ADP) combines simulation and function approximation to alleviate the "curse of dimensionality" associated with the traditional dynamic programming approach. In this paper, the method of ADP, which abates the curse of dimensionality by solving the DP within a carefully chosen, small subset of the state space, is introduced, and a survey of recent research directions within the field of ADP is provided.
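A minimal sketch of the simulation-plus-function-approximation combination this survey describes: sample a subset of the state space, back up Bellman targets on the samples, and fit a parametric value function by least squares (fitted value iteration). The polynomial basis, toy dynamics, and uniform sampling are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def features(x):
    return np.array([1.0, x, x ** 2])       # small polynomial basis

def step(x, u):
    return 0.9 * x + u                       # toy deterministic dynamics

states = rng.uniform(-3, 3, size=200)        # the chosen subset of the state space
actions = np.linspace(-1, 1, 9)
theta, gamma = np.zeros(3), 0.95

for _ in range(50):                          # fitted value iteration sweeps
    targets = []
    for x in states:
        q = [x ** 2 + 0.1 * u ** 2 + gamma * features(step(x, u)) @ theta
             for u in actions]
        targets.append(min(q))
    Phi = np.array([features(x) for x in states])
    theta, *_ = np.linalg.lstsq(Phi, np.array(targets), rcond=None)
```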
As an important class of approximate dynamic programming, the direct heuristic dynamic programming (DHDP) approach is discussed in this paper. DHDP performs well due to its model-free online learning capability. While the classical DHDP is implemented with a gradient-based adaptation learning algorithm for neural networks, in this paper we present a design strategy for DHDP with a novel hybrid estimation of distribution algorithm for online learning and control; the proposed design optimization method achieves the weight training of the neural networks with faster convergence. The proposed approach can thus be viewed as an improvement over classical DHDP. A simulation is conducted on a practical system plant to test the online learning performance of our method, and the simulation results show the effectiveness of our approach.
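A minimal sketch of the core substitution: training neural-network weights with an estimation of distribution algorithm (EDA) instead of gradient descent. Weight vectors are sampled from a Gaussian, the elites are kept, and the Gaussian is refit. The tiny network, regression loss, and population sizes are illustrative assumptions, not the paper's hybrid EDA or its DHDP critic.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(64, 1))
Y = np.sin(3 * X)                             # toy target for the network

def forward(w, x):
    """1-4-1 tanh network; w packs all 13 weights."""
    W1, b1 = w[:4].reshape(1, 4), w[4:8]
    W2, b2 = w[8:12].reshape(4, 1), w[12]
    return np.tanh(x @ W1 + b1) @ W2 + b2

def loss(w):
    return float(np.mean((forward(w, X) - Y) ** 2))

mu, sigma = np.zeros(13), np.ones(13)         # Gaussian over weight vectors
for gen in range(100):                        # EDA generations
    pop = mu + sigma * rng.standard_normal((50, 13))
    elites = pop[np.argsort([loss(w) for w in pop])[:10]]
    mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-3
```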
ISBN (print): 9781424438723
In this paper, a novel online approximate dynamic programming (ADP) technique for completely unknown continuous-time linear systems is proposed to solve the infinite-horizon linear quadratic (LQ) optimal control problem. To relax the assumption of a known input coupling matrix, the conventional LQ optimal control problem is converted into a cheap control problem. The ADP agent then iteratively solves this cheap optimal control problem in an online fashion to obtain a near-optimal solution of the conventional LQ optimal control problem. In addition, we mathematically prove the approximation property of the cheap optimal control problem with respect to the conventional LQ optimal control problem. A numerical simulation for an ideal DC motor shows the applicability of the proposed ADP algorithm.
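A minimal sketch of the cheap-control LQ idea: solve the LQ problem with a very small input weight eps*I via Kleinman-style policy iteration, where each step solves a Lyapunov equation. The paper does this online without a model; here, purely for illustration, the system matrices are assumed known, and the plant and initial gain are made-up examples.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-1.0, -0.5]])    # illustrative system
B = np.array([[0.0], [1.0]])
Q, eps = np.eye(2), 1e-3                     # cheap control: R = eps * I

K = np.array([[0.0, 1.0]])                   # an initial stabilizing gain
for _ in range(20):                          # Kleinman policy iteration
    Ac = A - B @ K
    # Policy evaluation: solve Ac^T P + P Ac + Q + K^T (eps I) K = 0
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ (eps * np.eye(1)) @ K))
    # Policy improvement: K = R^{-1} B^T P
    K = (1.0 / eps) * (B.T @ P)
print(K)                                      # near-optimal cheap-control gain
```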
The research objective of the dissertation is to develop methods that address the curse of dimensionality in the field of approximate dynamic programming and enhance the scalability of these methods to large-scale problems. Many problems, including those faced in day-to-day life, involve sequential decision making in the presence of uncertainty. These problems can often be modeled as Markov decision processes using Bellman's optimality equation. Attempts to solve even reasonably complex problems through stochastic dynamic programming are faced with the curse of modeling and the curse of dimensionality. The curse of modeling has been addressed in the literature through the introduction of reinforcement learning strategies, a strand of approximate dynamic programming (ADP). In spite of considerable research efforts, the curse of dimensionality, which affects the scalability of ADP for large-scale applications, still remains a challenge. In this research, a value function approximation method based on the theory of diffusion wavelets is investigated to address the scalability of ADP methods. The first contribution of this dissertation is an advancement of the state of the art in stochastic dynamic programming methods that are solved using ADP approaches. An important intellectual merit is the diffusion wavelet based value function approximation method, which is integrated with ADP to address the curse of dimensionality; the innovation lies in an integration that exploits the structure of the problem to achieve computational feasibility. The ADP method with diffusion wavelet based value function approximation is tested on the problem of taxi-out time estimation of aircraft (the time between gate pushback and wheels-off) to establish a proof of concept for the research objective. The second contribution of this dissertation is the modeling of the taxi-out time estimation of flights as a stochastic dynamic programming problem with t...
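A crude, minimal stand-in for the diffusion-based value function approximation idea: build basis functions from powers of a diffusion operator on the state graph (the multiscale spirit of diffusion wavelets, without the full wavelet orthogonalization across scales) and fit the value function by least squares. The chain graph, scale choices, and target function are illustrative assumptions, not the dissertation's construction or its taxi-out application.

```python
import numpy as np

n = 30                                        # states on a chain graph
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0           # chain adjacency
T = W / W.sum(axis=1, keepdims=True)          # random-walk diffusion operator

# Multiscale dictionary: dyadic powers T, T^2, T^4, T^8, subsampled columns
scales = [np.linalg.matrix_power(T, 2 ** k) for k in range(4)]
basis = np.hstack([S[:, ::6] for S in scales])

V_true = np.sin(np.linspace(0, np.pi, n))     # stand-in target value function
coef, *_ = np.linalg.lstsq(basis, V_true, rcond=None)
V_hat = basis @ coef
print(float(np.max(np.abs(V_hat - V_true))))  # approximation error on the chain
```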
This paper proposes a novel finite-time optimal control method based on input-output data for unknown nonlinear systems using an adaptive dynamic programming (ADP) algorithm. In this method, a single-hidden-layer feed-forward network (SLFN) with extreme learning machine (ELM) is used to construct a data-based identifier of the unknown system dynamics. Based on the data-based identifier, the finite-time optimal control method is established by the ADP algorithm. Two other SLFNs with ELM are used in the ADP method to facilitate the implementation of the iterative algorithm; they approximate the performance index function and the optimal control law at each iteration, respectively. A simulation example is provided to demonstrate the effectiveness of the proposed control scheme.
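A minimal sketch of the ELM training step behind such an identifier: the hidden-layer weights of the SLFN are drawn at random and frozen, so only the output weights need to be solved, in closed form, by least squares. The toy identification data and network size below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(200, 2))           # [x_k, u_k] input samples
y = 0.5 * X[:, 0] + np.sin(X[:, 1])             # stand-in for the unknown dynamics

H_UNITS = 40
W_in = rng.standard_normal((2, H_UNITS))        # random hidden weights, never trained
b = rng.standard_normal(H_UNITS)

H = np.tanh(X @ W_in + b)                       # hidden-layer activations
beta, *_ = np.linalg.lstsq(H, y, rcond=None)    # output weights in one shot

def identifier(x, u):
    """SLFN identifier prediction of the next state."""
    h = np.tanh(np.array([x, u]) @ W_in + b)
    return float(h @ beta)

print(identifier(0.3, -0.2))
```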