This paper reviews dynamic programming (DP), surveys approximate solution methods for it, and considers their applicability to process control problems. Reinforcement learning (RL) and neuro-dynamic programming (NDP), which can be viewed as approximate DP techniques, are already established techniques for solving difficult multi-stage decision problems in the fields of operations research, computer science, and robotics. Owing to the significant disparity in problem formulations and objectives, however, the algorithms and techniques available from these fields are not directly applicable to process control problems, and reformulations based on an accurate understanding of these techniques are needed. We categorize the currently available approximate solution techniques for dynamic programming and identify those most suitable for process control problems. Several open issues are also identified and discussed.
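For reference, the object all of these approximate methods target is the Bellman optimality equation of DP. In a standard discounted-cost formulation (generic notation, not taken from the paper above):

\[
V^*(x) \;=\; \min_{u \in U(x)} \Big[ c(x,u) \;+\; \gamma\, \mathbb{E}\big[ V^*(x') \mid x, u \big] \Big],
\]

where \(V^*\) is the optimal cost-to-go, \(c\) the stage cost, \(\gamma \in (0,1)\) the discount factor, and \(x'\) the successor state. RL and NDP differ mainly in how they approximate \(V^*\) and the expectation.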
The global containerised trade heavily relies on liner shipping services, which facilitate the worldwide movement of large cargo volumes along fixed routes and schedules. The profitability of shipping companies hinges on how efficiently they design their shipping network, a complex optimisation problem known as the liner shipping network design problem (LSNDP). In recent years, approximate dynamic programming (ADP), also known as reinforcement learning, has emerged as a promising approach for large-scale optimisation. This paper introduces a novel Markov decision process for the LSNDP and investigates the potential of ADP. We show that ADP methods based on value iteration produce optimal solutions to small instances, but their scalability is hindered by high memory demands. An ADP method based on a deep neural network requires less memory and successfully obtains feasible solutions. The quality of the solutions, however, declines for larger instances, possibly due to the discrete nature of the high-dimensional state and action spaces.
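As a rough illustration of the value-iteration flavour of ADP referred to above, here is a minimal tabular sketch in Python; the MDP it operates on is a generic placeholder, not the paper's LSNDP formulation:

```python
import numpy as np

def value_iteration(P, c, gamma=0.95, tol=1e-8):
    """Tabular value iteration for a finite cost-minimising MDP.

    P : array (A, S, S), P[a, s, s'] = transition probability
    c : array (A, S),    c[a, s]     = expected stage cost
    Returns the optimal cost-to-go V and a greedy policy.
    """
    V = np.zeros(P.shape[1])
    while True:
        Q = c + gamma * P @ V          # Bellman backup, shape (A, S)
        V_new = Q.min(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V, Q.argmin(axis=0)
```

The memory bottleneck the authors report is visible even in this sketch: the tabular V and the transition tensor P grow with the state space, which is what the deep-network variant avoids storing explicitly.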
An approximate dynamic programming (ADP) formulation implemented with an adaptive critic (AC)-based neural network (NN) structure has evolved as a powerful technique for solving the Hamilton-Jacobi-Bellman (HJB) equations. As interest in ADP and the AC solutions escalates with time, there is a dire need to consider possible enabling factors for their implementation. A typical AC structure consists of two interacting NNs, which is computationally expensive. In this paper, a new architecture, called the 'cost-function-based single network adaptive critic (J-SNAC)', is presented, which eliminates one of the networks in a typical AC structure. This approach is applicable to a wide class of nonlinear systems in engineering. In order to demonstrate the benefits and the control synthesis with the J-SNAC, two problems have been solved with the AC and the J-SNAC approaches. Results are presented which show savings of about 50% of the computational costs by J-SNAC while having the same accuracy as the dual-network structure in solving for optimal control. Furthermore, convergence of the J-SNAC iterations, which reduces to a least-squares problem, is discussed; for linear systems, the iterative process is shown to reduce to solving the familiar algebraic Riccati equation.
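To make the linear-systems remark concrete: for a linear-quadratic problem the optimal cost-to-go is quadratic, J*(x) = x'Px, with P solving the algebraic Riccati equation that the J-SNAC iteration is said to recover. A small numerical check (toy system of my choosing, not from the paper):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Toy linear system dx/dt = A x + B u with cost integrand x'Qx + u'Ru
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Algebraic Riccati equation: A'P + PA - P B R^{-1} B' P + Q = 0
P = solve_continuous_are(A, B, Q, R)

# Optimal cost-to-go J*(x) = x'Px and feedback u = -R^{-1} B' P x
K = np.linalg.solve(R, B.T @ P)
x0 = np.array([1.0, 0.0])
print("J*(x0) =", x0 @ P @ x0, "\ngain K =", K)
```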
We assess the potential of the approximate dynamic programming (ADP) approach for process control, especially as a method to complement the model predictive control (MPC) approach. In the artificial intelligence (AI) and operations research (OR) research communities, ADP has recently seen significant activity as an effective method for solving Markov decision processes (MDPs), which represent a type of multi-stage decision problem under uncertainty. Process control problems are similar to MDPs, with the key difference being the continuous state and action spaces as opposed to discrete ones. In addition, unlike in other popular ADP application areas like robotics or games, in process control applications the first and foremost concern should be the safety and economics of the ongoing operation rather than efficient learning. We explore different options within ADP design, such as the pre-decision state vs. post-decision state value function, parametric vs. nonparametric value function approximators, batch-mode vs. continuous-mode learning, and exploration vs. robustness. We argue that ADP possesses great potential, especially for obtaining effective control policies for stochastic constrained nonlinear or linear systems and continually improving them towards optimality. (C) 2010 Elsevier Ltd. All rights reserved.
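The pre-decision vs. post-decision distinction mentioned above can be stated compactly (standard ADP notation in the style of the OR literature, not reproduced from the paper). Writing \(x_t^u\) for the post-decision state, i.e. the state after the action is applied but before the disturbance arrives:

\[
V(x_t) = \min_{u}\Big[c(x_t,u) + \gamma\,\mathbb{E}\big[V(x_{t+1}) \mid x_t,u\big]\Big],
\qquad
V^{x}(x_t^{u}) = \mathbb{E}\Big[\min_{u'}\big[c(x_{t+1},u') + \gamma\,V^{x}(x_{t+1}^{u'})\big] \,\Big|\, x_t^{u}\Big].
\]

The attraction of the post-decision form is that the inner minimization is deterministic, so no expectation has to be evaluated inside the optimization at decision time.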
The objective of this work is to extend the approximate dynamic programming (ADP) framework to online control of distributed parameter systems. The ADP framework involves using suboptimal control policies to identify the relevant regions of the state space and to generate a cost-to-go function approximation applicable in those regions. We present model-based value iteration and model-free Q-learning approaches for feedback control of an adiabatic plug flow reactor. The state dimension is reduced using appropriate model reduction and sensor placement techniques. We show that both approaches provide better performance than the initial model predictive control and proportional-integral-derivative (PID) controllers. Finally, an extension of ADP to the stochastic case with full state feedback is presented. (C) 2011 Curtin University of Technology and John Wiley & Sons, Ltd.
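The model-free side of that comparison rests on the standard Q-learning update; a minimal tabular sketch in its cost-minimising form (a generic placeholder, not the reduced-order reactor controller itself):

```python
import numpy as np

def q_learning_step(Q, s, a, cost, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning update for a cost-minimisation problem.

    Q      : array (S, A) of state-action cost-to-go estimates
    (s, a) : visited state-action pair; cost = observed stage cost
    """
    target = cost + gamma * Q[s_next].min()   # greedy (min-cost) bootstrap
    Q[s, a] += alpha * (target - Q[s, a])     # move estimate toward target
    return Q
```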
ISBN: (Print) 9781479929849
The paper addresses the energy management of a building cooling system comprising a chiller plant with two chillers, a thermal storage unit, and a cooling load representing a building. Uncertainty affects the system since the cooling load depends on the building occupancy. The goal is to minimize the energy consumption of the cooling system while preserving comfort in the building. This is achieved by optimally distributing the cooling load demand among the chillers and the thermal storage unit, and by modulating the building temperature set-point to some (limited) extent. The problem can be decomposed into a static optimization problem and a dynamic programming problem, the latter solved on a Markov chain abstraction of the stochastic hybrid system modeling the cooling system.
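The dynamic programming half of that decomposition amounts to a finite-horizon backward recursion over the abstracted Markov chain. A generic sketch (dimensions, costs, and transition matrices are placeholders, not the paper's model):

```python
import numpy as np

def backward_dp(P, c, c_T):
    """Finite-horizon DP over a controlled Markov chain.

    P   : array (T, A, S, S), time-varying transition matrices
    c   : array (T, A, S),    stage costs (e.g. energy consumption)
    c_T : array (S,),         terminal cost
    Returns value functions V and greedy decisions mu per stage.
    """
    T, A, S, _ = P.shape
    V = np.empty((T + 1, S))
    mu = np.empty((T, S), dtype=int)
    V[T] = c_T
    for t in range(T - 1, -1, -1):
        Q = c[t] + P[t] @ V[t + 1]   # expected cost-to-go, shape (A, S)
        V[t] = Q.min(axis=0)
        mu[t] = Q.argmin(axis=0)
    return V, mu
```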
In this paper, we propose a novel formulation for encoding state constraints into the linear programming approach to approximate dynamic programming via the use of penalty functions. To maintain tractability of the resulting optimization problem, we suggest a penalty function constructed as a point-wise maximum taken over a family of low-order polynomials. Once the penalty functions are designed, no additional approximations are introduced by the proposed formulation. The effectiveness and numerical stability of the formulation are demonstrated through examples. (C) 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
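In the standard LP approach to ADP, one maximizes a weighted value function subject to the Bellman inequality over a span of basis functions; as I read the abstract, the state constraints are folded into this program through the polynomial penalty. Schematically (generic notation, not copied from the paper):

\[
\max_{V \in \operatorname{span}\{\phi_1,\dots,\phi_K\}} \int V(x)\,\nu(dx)
\quad \text{s.t.} \quad
V(x) \le c(x,u) + p(x) + \gamma\,\mathbb{E}\big[V(x^{+}) \mid x,u\big] \quad \forall (x,u),
\]

with the penalty \(p(x) = \max_{i=1,\dots,m} p_i(x)\) and each \(p_i\) a low-order polynomial, chosen so that the resulting optimization remains tractable.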
ISBN: (Print) 9783642384608; 9783642384592
The strategy of approximate/adaptive dynamic programming (ADP) has been widely used in recent years to design learning controllers for complex, high-dimensional systems. This paper addresses an important problem in the design of ADP learning controllers: improving the convergence performance of the learning algorithm. We analyze the ADP controller implementation framework according to the requirements of the tracking control task, with emphasis on an improved weight-updating gradient descent approach for optimizing the connection weights of the network structures. A comparison of the proposed method and a classic ADP design for tracking and controlling the pitch angle of an aircraft is presented, verifying the feasibility of the proposed ADP-based controller design.
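A weight update of the gradient-descent type discussed above commonly minimizes the squared temporal-difference error; a minimal sketch for a linear-in-weights critic (the paper's improved variant is not reproduced here):

```python
import numpy as np

def critic_update(w, phi, phi_next, cost, lr=0.01, gamma=0.95):
    """One semi-gradient descent step on the squared TD error for a
    linear-in-weights critic J(x) = w . phi(x).

    phi, phi_next : feature vectors of the current and next state
    cost          : observed stage cost
    """
    td_error = cost + gamma * (w @ phi_next) - (w @ phi)
    w = w + lr * td_error * phi   # bootstrapped target treated as constant
    return w
```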
In this paper, a new algorithm for realizing approximate dynamic programming (ADP) with Gaussian processes (GPs) is proposed for infinite-horizon optimal control of continuous-time (CT) nonlinear input-affine systems. Convergence of the ADP algorithm is proven under the assumption of exact approximation, whereby both the cost function and the control input converge to their optimal values, that is, to the solution of the Hamilton-Jacobi-Bellman (HJB) equation. Approximation errors, however, are unavoidable in almost every application. To tackle this problem, the proposed algorithm is derived with a proof of convergence, whereby the cost function and the control input, both approximated, converge to those of the ADP as the number of data points for the GPs approaches infinity. A numerical simulation demonstrates the effectiveness of the proposed algorithm. Copyright (C) 2020 The Authors.
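To illustrate the role the GPs play: the cost function is fit by GP regression from sampled data, and the fit tightens as the number of data points grows, which is the regime the convergence result above speaks to. A sketch using scikit-learn (kernel, data, and targets are placeholders; the paper's own construction is not reproduced):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Placeholder 1-D state samples with noisy cost-to-go targets
X = np.linspace(-2.0, 2.0, 25).reshape(-1, 1)
y = X.ravel() ** 2 + 0.05 * np.random.default_rng(0).normal(size=25)

# GP regression approximates the cost function from data
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-4)
gp.fit(X, y)

mean, std = gp.predict(np.array([[0.5]]), return_std=True)
print(f"V_hat(0.5) = {mean[0]:.3f} +/- {std[0]:.3f}")
```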