ISBN (print): 9783642210891
Using the neural-network-based iterative adaptive dynamic programming (ADP) algorithm, an optimal control scheme for a class of unknown discrete-time nonlinear systems with a discount factor in the cost function is proposed in this paper. The optimal controller is designed with convergence analysis in terms of the cost function and control law. In order to implement the algorithm via the globalized dual heuristic programming (GDHP) technique, a neural network is constructed first to identify the unknown nonlinear system, and then two other neural networks are used to approximate the cost function and the control law, respectively. An example is provided to verify the effectiveness of the present approach.
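A minimal sketch of the discounted iterative value update that this kind of scheme rests on, V_{i+1}(x) = min_u [U(x, u) + γ V_i(F(x, u))]. The scalar dynamics F, the utility U, and the polynomial least-squares critic standing in for the paper's three neural networks are illustrative assumptions, not the paper's implementation:

```python
# Hedged sketch of iterative ADP with a discount factor in the cost.
# A polynomial least-squares fit plays the role of the critic network.
import numpy as np

gamma = 0.95                          # discount factor in the cost function
F = lambda x, u: 0.8 * np.sin(x) + u  # assumed (illustrative) system dynamics
U = lambda x, u: x**2 + u**2          # stage cost / utility

xs = np.linspace(-2.0, 2.0, 101)      # sampled states
us = np.linspace(-1.0, 1.0, 41)       # candidate controls
phi = lambda x: np.stack([np.ones_like(x), x, x**2, np.abs(x)**3], axis=-1)

w = np.zeros(4)                       # critic weights, V_0 = 0
for i in range(100):
    # Bellman backup V_{i+1}(x) = min_u [U(x,u) + gamma * V_i(F(x,u))]
    q = U(xs[:, None], us[None, :]) + gamma * (phi(F(xs[:, None], us[None, :])) @ w)
    targets = q.min(axis=1)
    w_new, *_ = np.linalg.lstsq(phi(xs), targets, rcond=None)
    if np.max(np.abs(w_new - w)) < 1e-8:
        break
    w = w_new

# Greedy control law induced by the converged critic
u_star = us[np.argmin(U(xs[:, None], us[None, :])
                      + gamma * (phi(F(xs[:, None], us[None, :])) @ w), axis=1)]
```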
ISBN (print): 9788993215038
In order to improve the performance of nonlinear model predictive control (NMPC) in the presence of disturbances or model uncertainties, an approximate dynamic programming (ADP) control scheme is proposed. Namely, Bellman's optimality principle is employed to determine the input based on an approximate value function constructed from historical operation data. In addition, support vector data description is applied in the state space to determine whether the ADP control is suitable for the current state. The proposed control strategy is illustrated on a CSTR example to show its effectiveness.
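A hedged sketch of the applicability check described here: fit a one-class boundary around the historical operating states and use the data-driven ADP input only when the current state lies inside it, otherwise fall back to the nominal NMPC input. `OneClassSVM` is used as a stand-in for SVDD, and the value function, model, stage cost, and `nmpc_input` are hypothetical placeholders:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_hist = rng.normal(size=(500, 2))          # historical states (placeholder data)
svdd = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.5).fit(X_hist)

def adp_input(x, V, candidates, stage_cost, model):
    """One-step Bellman minimisation over candidate inputs."""
    costs = [stage_cost(x, u) + V(model(x, u)) for u in candidates]
    return candidates[int(np.argmin(costs))]

def control(x, V, candidates, stage_cost, model, nmpc_input):
    # Use the ADP input only where historical data supports the value approximation.
    if svdd.predict(x.reshape(1, -1))[0] == 1:
        return adp_input(x, V, candidates, stage_cost, model)
    return nmpc_input(x)
```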
Multi-stage decision problems under uncertainty are abundant in process industries. The Markov decision process (MDP) is a general mathematical formulation of such problems. Whereas stochastic programming and dynamic programming are the standard methods to solve MDPs, their unwieldy computational requirements limit their usefulness in real applications. Approximate dynamic programming (ADP) combines simulation and function approximation to alleviate the 'curse of dimensionality' associated with the traditional dynamic programming approach. In this paper, we present ADP as a viable way to solve MDPs for process control and scheduling problems. We bring forth some key issues for its successful application in these types of problems, including the choice of function approximator and the use of a penalty function to guard against over-extending the value function approximation in the value iteration. Application studies involving a number of well-known control and scheduling problems, including dual control, multiple controller scheduling, and resource-constrained project scheduling problems, point to the promising potential of ADP. (c) 2006 Elsevier Ltd. All rights reserved.
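A hedged sketch of the penalty idea mentioned in this abstract: the Bellman backup is augmented with a term that grows with the distance of the successor state from the simulated data, discouraging the minimisation from leaning on the value approximation where it was never trained. The helper names, toy distance penalty, and signatures are illustrative assumptions:

```python
import numpy as np

def penalty(x_next, data_states, rho=10.0):
    """Large when x_next is far from every visited state (data_states is 2-D)."""
    d = np.min(np.linalg.norm(data_states - x_next, axis=1))
    return rho * d

def bellman_backup(x, actions, V, step, stage_cost, data_states, gamma=0.98):
    """One penalised backup: min_u [cost + gamma*V(next) + penalty(next)]."""
    values = []
    for u in actions:
        x_next = step(x, u)
        values.append(stage_cost(x, u) + gamma * V(x_next)
                      + penalty(x_next, data_states))
    return min(values), actions[int(np.argmin(values))]
```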
We propose two approximate dynamic programming methods to optimize the distribution operations of a company manufacturing a certain product at multiple production plants and shipping it to different customer locations for sale. We begin by formulating the problem as a dynamic program. Our first approximate dynamic programming method uses a linear approximation of the value function and computes the parameters of this approximation by using the linear programming representation of the dynamic program. Our second method relaxes the constraints that link the decisions for different production plants. Consequently, the dynamic program decomposes by the production plants. Computational experiments show that the proposed methods are computationally attractive, and in particular, the second method performs significantly better than standard benchmarks. (C) 2006 Wiley Periodicals, Inc.
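A hedged sketch of the first method's idea: a linear value approximation V ≈ Φr whose weights r are obtained from the linear-programming representation of the dynamic program, max c'Φr subject to Φr ≤ g(·,a) + γ P_a Φ r for every action a. The tiny random MDP, the basis, and the state-relevance weights c are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
nS, nA, nK, gamma = 20, 3, 4, 0.95
P = rng.dirichlet(np.ones(nS), size=(nA, nS))   # P[a, s, s'] (random toy MDP)
g = rng.uniform(0.0, 1.0, size=(nA, nS))        # stage costs
Phi = rng.normal(size=(nS, nK))                 # basis functions
c = np.ones(nS) / nS                            # state-relevance weights

# linprog minimises, so use -c'Phi r; constraints (Phi - gamma*P_a Phi) r <= g_a
A_ub = np.vstack([Phi - gamma * P[a] @ Phi for a in range(nA)])
b_ub = np.concatenate([g[a] for a in range(nA)])
res = linprog(-(c @ Phi), A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * nK)
r = res.x                                       # approximation weights
V_approx = Phi @ r                              # linear value approximation
```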
In this paper, we propose a novel policy iteration method, called dynamic policy programming (DPP), to estimate the optimal policy in infinite-horizon Markov decision processes. DPP is an incremental algorithm that forces a gradual change in the policy update. This allows us to prove finite-iteration and asymptotic ℓ∞-norm performance-loss bounds in the presence of approximation/estimation error which depend on the average accumulated error, as opposed to the standard bounds, which are expressed in terms of the supremum of the errors. The dependency on the average error is important in problems with a limited number of samples per iteration, for which the average of the errors can be significantly smaller than their supremum. Based on these theoretical results, we prove that a sampling-based variant of DPP (DPP-RL) asymptotically converges to the optimal policy. Finally, we numerically illustrate the applicability of these results on some benchmark problems and compare the performance of the approximate variants of DPP with some existing reinforcement learning (RL) methods.
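A hedged sketch of an incremental action-preference update in the spirit of DPP: preferences Ψ are averaged with a Boltzmann (softmax) operator and changed gradually each iteration, so the induced policy π(a|s) ∝ exp(η Ψ(s,a)) moves slowly. The random MDP, the temperature η, and the exact form of the update are illustrative, not the paper's definitive algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)
nS, nA, gamma, eta = 10, 3, 0.95, 5.0
P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a, s'] (toy MDP)
R = rng.uniform(size=(nS, nA))                  # rewards

def boltzmann_avg(psi, eta):
    """Softmax-weighted average of preferences over actions, per state."""
    w = np.exp(eta * (psi - psi.max(axis=1, keepdims=True)))
    w /= w.sum(axis=1, keepdims=True)
    return (w * psi).sum(axis=1)                # shape (nS,)

psi = np.zeros((nS, nA))
for _ in range(500):
    m = boltzmann_avg(psi, eta)                 # soft state values
    psi = psi - m[:, None] + R + gamma * (P @ m)  # incremental preference update

# Policy induced by the final preferences
policy = np.exp(eta * (psi - psi.max(axis=1, keepdims=True)))
policy /= policy.sum(axis=1, keepdims=True)
```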
There are a number of sources of randomness that arise in military airlift operations. However, the cost of uncertainty can be difficult to estimate, and is easy to overestimate if we use simplistic decision rules. Using data from Canadian military airlift operations, we study the effect of uncertainty in customer demands, as well as aircraft failures, on the overall cost. The system is first analyzed using the types of myopic decision rules widely used in the research literature. The performance of the myopic policy is then compared to the results obtained using robust decisions that account for the uncertainty of future events. These are obtained by modeling the problem as a dynamic program and solving Bellman's equations using approximate dynamic programming. The experiments show that even approximate solutions to Bellman's equations produce decisions that reduce the cost of uncertainty.
Dynamic programming is an effective optimal control method for multi-stage decision-making problems. However, it cannot be used to solve complex problems because of the curse of dimensionality. After analyzing this limitation of dynamic programming, this article elaborates in detail the theory and method by which approximate dynamic programming overcomes it. A second-order training algorithm is also given to improve the convergence performance of the iteration and the stability performance of the network. This method was then applied to idle-speed fluctuation control of a four-cylinder diesel engine to verify its correctness and effectiveness. Although illustrated for an engine, the control framework should also be applicable to general-purpose nonlinear systems, and it does not need a model of the controlled object.
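A hedged sketch of what a "second-order training" step for a linear-in-features critic V(x) = φ(x)·w could look like; the abstract does not specify the algorithm, so a damped Gauss-Newton (Levenberg-Marquardt style) update on the squared critic error is used purely as an illustration, with placeholder features and targets:

```python
import numpy as np

def lm_step(w, phi, targets, mu=1e-2):
    """One damped Gauss-Newton step on 0.5 * ||phi @ w - targets||^2."""
    r = phi @ w - targets                  # residuals at the current weights
    J = phi                                # Jacobian of residuals w.r.t. w
    H = J.T @ J + mu * np.eye(w.size)      # damped approximate Hessian
    return w - np.linalg.solve(H, J.T @ r)
```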
We introduce a new algorithm based on linear programming for optimization of average-cost Markov decision processes (MDPs). The algorithm approximates the differential cost function of a perturbed MDP via a linear combination of basis functions. We establish a bound on the performance of the resulting policy that scales gracefully with the number of states without imposing the strong Lyapunov condition required by its counterpart in de Farias and Van Roy [de Farias, D. P., B. Van Roy. 2003. The linear programming approach to approximate dynamic programming. Oper. Res. 51(6) 850-865]. We investigate implications of this result in the context of a queueing control problem.
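A hedged sketch of the general idea: perturb each transition kernel with a restart distribution, then approximate the differential cost h ≈ Φr by a linear program built from the average-cost Bellman inequality with h replaced by Φr. The restart probability η, the restart distribution ν, the basis, and the toy MDP are illustrative assumptions, and the LP below is a generic approximate formulation rather than the paper's exact program:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
nS, nA, nK, eta = 15, 2, 4, 0.05
P = rng.dirichlet(np.ones(nS), size=(nA, nS))   # P[a, s, s'] (toy MDP)
g = rng.uniform(size=(nA, nS))                  # stage costs
nu = np.ones(nS) / nS                           # restart distribution
Phi = np.column_stack([np.ones(nS), rng.normal(size=(nS, nK - 1))])

# Perturbed kernels: with probability eta the state restarts according to nu.
P_pert = (1.0 - eta) * P + eta * nu[None, None, :]

# Decision variables z = (r, lam).  Maximise lam subject to
#   (Phi r)(s) + lam <= g(s, a) + (P_pert[a] Phi r)(s)   for all s, a.
A_ub = np.vstack([np.hstack([Phi - P_pert[a] @ Phi, np.ones((nS, 1))])
                  for a in range(nA)])
b_ub = np.concatenate(list(g))
c_obj = np.zeros(nK + 1); c_obj[-1] = -1.0      # linprog minimises, so -lam
res = linprog(c_obj, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (nK + 1))
r, lam = res.x[:-1], res.x[-1]
h_approx = Phi @ r                              # approximate differential cost
```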
This paper deals with the finite-horizon optimal tracking control for a class of discrete-time nonlinear systems using the iterative adaptive dynamic programming (ADP) algorithm. First, the optimal tracking problem is converted into designing a finite-horizon optimal regulator for the tracking error system. Then, with convergence analysis in terms of cost function and control law, the iterative ADP algorithm via the heuristic dynamic programming (HDP) technique is introduced to obtain the finite-horizon optimal tracking controller, which makes the cost function close to its optimal value within an ε-error bound. Next, three neural networks are used to implement the algorithm, which aims at approximating the cost function, the control law, and the error dynamics, respectively. At last, an example is included to demonstrate the effectiveness of the proposed approach.
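A hedged sketch of the conversion step: a tracking problem x_{k+1} = f(x_k) + u_k with reference r_k is rewritten as regulation of the error e_k = x_k - r_k, and a finite-horizon cost is minimised by a backward recursion. The plant, reference, horizon, and grid interpolation (standing in for the paper's three neural networks) are illustrative assumptions:

```python
import numpy as np

N = 20                                          # horizon length
f = lambda x: 0.9 * x + 0.1 * np.sin(x)         # assumed plant (illustrative)
r = np.cos(0.2 * np.arange(N + 1))              # reference trajectory
ue = lambda k: r[k + 1] - f(r[k])               # feedforward input keeping x on r;
                                                # the applied input is u_k = ue(k) + v_k

es = np.linspace(-2, 2, 201)                    # sampled tracking errors
vs = np.linspace(-1, 1, 81)                     # sampled error-feedback inputs
V = np.zeros((N + 1, es.size))                  # terminal cost V_N = 0

for k in range(N - 1, -1, -1):
    # error dynamics: e_{k+1} = f(e_k + r_k) - f(r_k) + v_k
    e_next = f(es[:, None] + r[k]) - f(r[k]) + vs[None, :]
    cost = es[:, None] ** 2 + vs[None, :] ** 2 \
         + np.interp(e_next, es, V[k + 1])      # interpolate V_{k+1} on the grid
    V[k] = cost.min(axis=1)                     # cost-to-go for stage k
```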
This paper investigates the choice of function approximator for an approximate dynamic programming (ADP) based control strategy. The ADP strategy allows the user to derive an improved control policy given a simulation model and some starting control policy (or, alternatively, closed-loop identification data), while circumventing the 'curse of dimensionality' of the traditional dynamic programming approach. In ADP, one fits a function approximator to state vs. 'cost-to-go' data and solves the Bellman equation with the approximator in an iterative manner. A proper choice and design of the function approximator is critical for convergence of the iteration and the quality of the final learned control policy, because an approximation error can grow quickly in the loop of optimization and function approximation. Typical classes of approximators used in related approaches are parameterized global approximators (e.g., artificial neural networks) and nonparametric local averagers (e.g., k-nearest neighbor). In this paper, we assert, on the basis of some case studies and a theoretical result, that a certain type of local averager should be preferred over global approximators, as the former ensures monotonic convergence of the iteration. However, a converged cost-to-go function does not necessarily lead to a stable control policy on-line due to the problem of over-extrapolation. To cope with this difficulty, we propose that a penalty term be included in the objective function in each minimization to discourage the optimizer from finding a solution in the regions of the state space where the local data density is too low. A nonparametric density estimator, which can be naturally combined with a local averager, is employed for this purpose. (c) 2005 Elsevier Ltd. All rights reserved.
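A hedged sketch of the two ingredients discussed in this abstract: a k-nearest-neighbor local averager for the cost-to-go, and a kernel density estimate over the same data used to penalise candidate successor states in sparsely visited regions. The class name, bandwidths, and penalty weight are illustrative assumptions:

```python
import numpy as np

class KNNCostToGo:
    def __init__(self, states, costs, k=5, bandwidth=0.3, rho=50.0):
        self.X, self.J = np.asarray(states), np.asarray(costs)
        self.k, self.h, self.rho = k, bandwidth, rho

    def value(self, x):
        """Local average of the stored cost-to-go over the k nearest states."""
        d = np.linalg.norm(self.X - x, axis=1)
        idx = np.argsort(d)[: self.k]
        return self.J[idx].mean()

    def density(self, x):
        """Unnormalised Gaussian kernel density estimate at x."""
        d2 = np.sum((self.X - x) ** 2, axis=1)
        return np.mean(np.exp(-0.5 * d2 / self.h ** 2))

    def penalized_value(self, x):
        # Penalty grows as the local data density shrinks.
        return self.value(x) + self.rho / (self.density(x) + 1e-8)
```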