Search results - Inner Mongolia University Library

Sequential learning based re-optimization approaches for less model-based dynamic pick-up routing problem

INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE-OPERATIONS & LOGISTICS, 2024, Vol. 11, No. 1

Authors: Yu, Wu Southwest Jiaotong Univ Sch Transportat & Logist Chengdu 611756 Sichuan Peoples R China Natl United Engn Lab Integrated & Intelligent Tran Chengdu Peoples R China Natl Engn Lab Big Data Applicat Integrated Transpo Chengdu Peoples R China

We address a less model-based dynamic routing problem arising from home parcel pick-up service, where "less model-based" means that customers request service dynamically and independently according to a Poisson process whose rate is stochastic. Through an extended application of the re-optimization (RO) strategy, a Markov decision process formulation and solution approaches based on approximate dynamic programming and Bayes' theorem are proposed. Specifically, first, a pool of basic policies corresponding to all possible values of the rate is developed offline via approximate value iteration. Second, Bayes' theorem-based sequential learning is designed to sequentially update the belief about the rate's probability distribution over its possible values. Third, coupled with the updated belief, the basic policies are collectively implemented in two different ways, resulting in RO approaches that construct two different online policies, i.e. a belief-weighted deterministic policy and a belief-based random policy, and re-optimize them at decision epochs. In the numerical study, our approaches are examined through comparison with model-based (i.e. using full knowledge of the rate) and model-free heuristics. Important insights include that (i) the belief-weighted deterministic policy outperforms the belief-based random policy, and (ii) the former is better than the latter at preserving the improvement resulting from the improved model-based policy.

Keywords: Vehicle routing less model-based approximate dynamic programming Bayes' theorem sequential learning
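
The sequential learning step described in the abstract above can be illustrated with a minimal Python sketch. It assumes a small, hypothetical set of candidate rates, a uniform prior, and request counts observed over unit-length intervals. None of these numbers or function names come from the paper, and the paper's basic policies and re-optimization are not reproduced.

```python
import math


def poisson_likelihood(rate, count, duration):
    """Probability of seeing `count` requests in `duration` under a Poisson rate."""
    mean = rate * duration
    return math.exp(-mean) * mean ** count / math.factorial(count)


def update_belief(belief, rates, count, duration):
    """One Bayes' theorem update of the belief over the candidate rates."""
    unnormalized = [
        b * poisson_likelihood(r, count, duration)
        for b, r in zip(belief, rates)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical candidate rates and a uniform prior (assumptions for illustration).
rates = [0.5, 1.0, 2.0]
belief = [1 / 3, 1 / 3, 1 / 3]

# Hypothetical request counts observed in successive unit-length intervals.
for count in [2, 1, 3, 2]:
    belief = update_belief(belief, rates, count, duration=1.0)
```

With these hypothetical counts (8 requests over 4 unit intervals), the belief concentrates on the largest candidate rate. A belief-based random policy could, for instance, draw a rate index with probabilities equal to `belief`; that step is omitted here.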

Multi-sourcing under supply uncertainty and Buyer's risk aversion

EURO JOURNAL ON COMPUTATIONAL OPTIMIZATION, 2021, Vol. 9

Authors: Chintapalli, Prashant Western Univ Ivey Business Sch London ON N6A 3K7 Canada

We address the combined problem of supplier (or vendor) selection and ordering decision when a buyer can choose to procure from multiple suppliers whose yields are uncertain and potentially correlated. We model this problem as a stochastic program with recourse in which the buyer purchases from the suppliers in the first period and, if needed, chooses to purchase from the spot market or from the suppliers with excess supply, whichever is beneficial, in the second period in order to meet the target procurement quantity. We solve this problem using the sample average approximation (SAA) technique, which makes the problem easy to solve in practice. To evaluate the efficacy of our approach, we compare the performance of our solution with that of the certainty equivalent problem, which is widely practiced and which we use as the benchmark. Next, we extend our model to incorporate the buyer's risk aversion with respect to the quantity procured. We reformulate the multi-sourcing problem as a mixed integer linear program (MILP) and adopt a statistical approach to account for the buyer's risk aversion. Thus, we design a simple computational technique that provides an optimal sourcing policy from a set of suppliers when each supplier's yield is uncertain with a generic probability distribution.

Keywords: Purchasing Supplier/vendor selection Supply/yield uncertainty approximate dynamic programming Stochastic programming with recourse
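
The sample average approximation idea in the abstract above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's model: two hypothetical suppliers, made-up prices and yield distributions, payment for the ordered quantity, recourse only through the spot market (the option of buying suppliers' excess supply is left out), and a coarse grid search in place of a proper solver.

```python
import random

# Hypothetical problem data (assumptions for illustration only).
UNIT_COST = (5.0, 4.0)   # price per unit ordered from supplier 1 and 2
SPOT_PRICE = 10.0        # price per unit bought on the spot market
TARGET = 100.0           # target procurement quantity


def sample_yields(n_scenarios, seed=0):
    """Draw yield-fraction scenarios; a shared shock makes them correlated."""
    rng = random.Random(seed)
    scenarios = []
    for _ in range(n_scenarios):
        common = rng.uniform(-0.1, 0.1)
        y1 = min(1.0, max(0.0, 0.9 + common + rng.uniform(-0.05, 0.05)))
        y2 = min(1.0, max(0.0, 0.7 + common + rng.uniform(-0.2, 0.2)))
        scenarios.append((y1, y2))
    return scenarios


def saa_cost(order, scenarios):
    """First-stage cost plus the sample average of the second-stage cost."""
    first_stage = sum(c * q for c, q in zip(UNIT_COST, order))
    recourse = 0.0
    for yields in scenarios:
        delivered = sum(y * q for y, q in zip(yields, order))
        recourse += SPOT_PRICE * max(0.0, TARGET - delivered)
    return first_stage + recourse / len(scenarios)


def solve_saa(scenarios, step=10, max_order=150):
    """Pick the order pair on a coarse grid that minimizes the SAA cost."""
    grid = range(0, max_order + 1, step)
    candidates = ((q1, q2) for q1 in grid for q2 in grid)
    return min(candidates, key=lambda order: saa_cost(order, scenarios))


scenarios = sample_yields(200)
best_order = solve_saa(scenarios)
```

The expectation over yields is replaced by an average over the sampled scenarios, so the first-stage decision can be optimized deterministically. Ordering nothing and buying the whole target on the spot market is one of the grid points, so the returned order can never cost more than that fallback on the sampled scenarios.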

Virtual Generators: Simplified Online Power System Representations for Wide-Area Damping Control

IEEE Power and Energy Society General Meeting

Authors: Diogenes Molina Jiaqi Liang Ronald G. Harley Ganesh Kumar Venayagamoorthy Intelligent Power Infrastructure Consortium Department of Electrical and Computer Engineering Georgia Institute of Technology Atlanta GA 30332 USA Holcombe Department of Electrical and Computer Engineering Clemson University Clemson SC 29634 USA

ISBN: (print) 9781467327275

This paper introduces a new concept called a Virtual Generator (VG). VGs are simplified representations of groups of coherent synchronous generators in a power system. They resemble commonly used power system dynamic equivalents obtained via generator aggregation techniques. Traditionally power system dynamic equivalents are developed offline, fixed, and used to replace large portions of the system that are considered external to the portion of the system being analyzed in detail. In contrast, VGs are calculated online, are not limited to representing external areas of the system being analyzed/controlled, and do not replace any portion of the power system. Instead, they allow wide-area damping controllers (WADCs) to exploit the realization that a group of coherent synchronous generators in a power system can be controlled as a single generating unit for achieving wide-area damping control objectives. The implementation of VGs is made possible by the availability of Wide-Area Measurements (WAMs) from Phasor Measurement Units (PMUs). To the authors' knowledge, this is the first time that the use of power system equivalencing techniques has been extended to real-time WADC. Simulation studies carried out on the 68-bus New England/New York power system demonstrate that intelligent controllers developed using VGs can significantly improve the stability of a power system by effectively damping low-frequency interarea oscillations.

Keywords: virtual generator power system stabilizer wide-area control power system equivalents intelligent control approximate dynamic programming adaptive critic designs generator coherency interarea oscillations power systems damped control dynamos intelligent controller Phasor measurement units Power system dynamics Generating sets Synchronous generators representations Power system stability

A UNIFIED FRAMEWORK FOR LINEAR FUNCTION APPROXIMATION OF VALUE FUNCTIONS IN STOCHASTIC CONTROL

European Signal Processing Conference

Authors: Matilde Sanchez-Fernandez Sergio Valcarcel Santiago Zazoy Universidad Carlos III de Madrid Signal Theory & Communications Dept. Universidad Politecnica de Madrid Signals Systems & Radiocommunications Dept. Av. Complutense Universidad Politecnica de Madrid Signals Systems & Radiocommunications Dept. Av. Complutense

This paper contributes a unified formulation that merges previous analyses of how to predict the performance (value function) of a given sequence of actions (policy) when an agent operates in a Markov decision process with a large state space. When the states are represented by features and the value function is linearly approximated, our analysis reveals a new relationship between two common cost functions used to obtain the optimal approximation. In addition, this analysis allows us to propose an efficient adaptive algorithm that provides an unbiased linear estimate. The performance of the proposed algorithm is illustrated by simulation, showing competitive results when compared with state-of-the-art solutions.

Keywords: approximate dynamic programming Linear value function approximation Mean squared Bellman Error Mean squared projected Bellman Error Reinforcement Learning
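
The two cost functions named in the keywords above, the mean squared Bellman error (MSBE) and the mean squared projected Bellman error (MSPBE), can be computed directly on a toy problem. The Python sketch below assumes a hypothetical two-state Markov chain under a fixed policy and a single feature, so the projection reduces to a scalar ratio. It only illustrates the standard definitions; it does not reproduce the relationship or the algorithm derived in the paper.

```python
# Hypothetical two-state chain under a fixed policy (assumptions for illustration).
P = [[0.5, 0.5], [0.5, 0.5]]   # transition probabilities
REWARD = [1.0, 0.0]            # expected reward in each state
GAMMA = 0.5                    # discount factor
D = [0.5, 0.5]                 # stationary distribution of P
PHI = [1.0, 0.0]               # one feature per state; value estimate is w * PHI[s]


def bellman_residual(w):
    """(T v_w - v_w)(s) for the linear value estimate v_w = w * PHI."""
    v = [w * f for f in PHI]
    return [
        REWARD[s] + GAMMA * sum(P[s][j] * v[j] for j in range(2)) - v[s]
        for s in range(2)
    ]


def msbe(w):
    """D-weighted squared norm of the Bellman residual."""
    return sum(d * e * e for d, e in zip(D, bellman_residual(w)))


def mspbe(w):
    """D-weighted squared norm of the residual projected onto span(PHI)."""
    residual = bellman_residual(w)
    numerator = sum(d * f * e for d, f, e in zip(D, PHI, residual))
    denominator = sum(d * f * f for d, f in zip(D, PHI))
    coefficient = numerator / denominator
    return sum(d * (coefficient * f) ** 2 for d, f in zip(D, PHI))
```

On this toy chain the two criteria have different minimizers: working through the algebra, the MSBE is smallest at `w = 1.2`, while the MSPBE vanishes at `w = 4/3`, which is why the choice of cost function matters when the features cannot represent the true value function.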

Satisficing vs exploring when learning a constrained environment

International Conference on Soft Computing and Intelligent Systems

Authors: Stephen Shervais Thaddeus T. Shannon College of Business and Public Administration Eastern Washington University Systems Science Program Portland State University

ISBN: (print) 9781467327428

Satisficing is an efficient strategy for applying existing knowledge in a complex, constrained environment. We present a set of agent-based simulations that demonstrate a higher payoff for satisficing strategies than for exploring strategies when using approximate dynamic programming methods for learning complex environments. In our constrained learning environment, satisficing agents outperformed exploring agents by approximately six percent in terms of the number of tasks completed.

Keywords: Component Satisficing approximate dynamic programming Q learning Agent-based simulation

Optimal Control of Unknown Discrete-Time Nonlinear Systems with Constrained Inputs Using GDHP Technique

31st Chinese Control Conference

Authors: LIU Derong, WANG Ding, LI Hongliang State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, P.R. China

The adaptive dynamic programming (ADP) approach is employed to design an optimal controller for unknown discrete-time nonlinear systems with control ***, a neural network is constructed to identify the unknown dynamical system with stability ***, the iterative ADP algorithm is developed to solve the optimal control problem with convergence ***, two other neural networks are introduced to approximate the cost function and its derivative and the control law, under the framework of globalized dual heuristic programming ***, two simulation examples are included to verify the theoretical results.

Keywords: Adaptive dynamic programming approximate dynamic programming Control constraints Neural networks Optimal control System identification

Economic Dispatch of an Integrated Microgrid Based on the Dynamic Process of CCGT Plant

33rd Chinese Control and Decision Conference

Authors: Zhiyi Lin Chunyue Song Jun Zhao Chao Yang Huan Yin College of Control Science and Engineering Zhejiang University

Intra-day economic dispatch of an integrated microgrid is a fundamental requirement to integrate distributed *** dynamic energy flows in cogeneration units present challenges to the energy management of the *** this paper, a novel approximate dynamic programming (ADP) approach based on value function approximation is proposed to solve this problem, distinguished by its consideration of the dynamic process constraints of the combined-cycle gas turbine (CCGT) ***, we mathematically formulate the multi-period decision problem as a finite-horizon Markov decision *** deal with the thermodynamic process, an augmented state vector of CCGT is ***, the proposed VFA-ADP algorithm is employed to derive the near-optimal real-time operation *** addition, to guarantee the monotonicity of the piecewise linear function, we apply the SPAR algorithm in the update *** validate the effectiveness of the proposed method, we conduct experiments with comparisons to some traditional optimization *** results indicate that our proposed ADP method achieves better performance on the economic dispatch of the microgrid.

Keywords: Microgrid Dynamic Process Combined-Cycle Gas Turbine approximate dynamic programming

Design Optimization of Direct Heuristic Dynamic Programming Based on Hybrid Estimation of Distribution Algorithm

2015 National Conference of Theoretical Computer Science

Authors: Xiong LUO Mi ZHOU Yixuan LV School of Computer and Communication Engineering University of Science and Technology Beijing Beijing 100083 China Beijing Key Laboratory of Knowledge Engineering for Materials Science Beijing 100083 China

As an important class of approximate dynamic programming, the direct heuristic dynamic programming (DHDP) is discussed in this *** performs well due to its model-free online learning *** the classical DHDP is implemented with a gradient-based adaptation learning algorithm for neural networks, in this paper we present a design strategy of DHDP with a novel hybrid estimation of distribution algorithm for online learning and control, and the proposed design optimization method achieves the weight training of neural networks with faster convergence *** proposed approach can be viewed as an improvement for *** simulation is conducted on a practical system plant to test the online learning performance by using our ***, the simulation results show the effectiveness of our approach.

Keywords: approximate dynamic programming Direct Heuristic Dynamic Programming Estimation of Distribution Algorithm Online Learning

Online optimal auto-tuning of PID controllers for tracking in a special class of linear systems

American Control Conference

Authors: Marcio F. Miranda Kyriakos G. Vamvoudakis COLTEC Universidade Federal de Minas Gerais Belo Horizonte - MG CEP 31270-901 Brazil Center for Control Dynamical-systems and Computation (CCDC) University of California Santa Barbara 93106-9560 USA

ISBN: (print) 9781467386838

This paper proposes a reinforcement learning (RL) algorithm based on approximate dynamic programming to optimally auto-tune a Proportional Integral Derivative (PID) controller by solving an infinite-horizon optimal tracking control problem for a special class of linear systems. The algorithm is based on an actor/critic framework where a critic approximator is used to learn the optimal cost and an actor approximator is used to learn the optimal PID gains. The adaptive control nature of the algorithm requires a persistence of excitation condition to be validated a priori, but this can be relaxed by using previously stored data concurrently with current data in the tuning of the critic approximator. Simulation results show the effectiveness of the proposed approach for a stirred-tank plant reactor.

Keywords: optimal tracking control Auto-tuning PID approximate dynamic programming

On Integral Value Iteration for Continuous-Time Linear Systems

American Control Conference

Authors: Jae Young Lee Jin Bae Park Yoon Ho Choi Department of Electrical and Electronic Engineering Yonsei University Shinchon-Dong Seodaemum-Gu Seoul 120-749 Korea Department of Electronic Engineering Kyonggi University Suwon Kyonggi-Do 443-760 Korea

ISBN: (print) 9781479901777

This paper investigates the properties of integral value iteration (I-VI), one of the reinforcement learning (RL) techniques for solving continuous-time (CT) optimal control problems online without using the system drift dynamics. The I-VI considered here is the one applied to CT linear quadratic regulation problems. As a result, two modes of global monotone convergence of I-VI are presented. One behaves like policy iteration (PI) (PI-mode of convergence) and the other is named VI-mode of convergence. All of the other properties (positive definiteness, stability, and the relation between I-VI and integral PI) are presented within these two frameworks. Finally, numerical simulations are carried out to verify and further investigate these properties.

Keywords: value iteration LQR reinforcement learning monotone convergence approximate dynamic programming monotone convergence linear quadratic regulators learning (artificial intelligence) iterative methods dynamic programming integration Converge Learning
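
Monotone convergence of value iteration for linear quadratic regulation can be seen on a scalar toy problem. The Python sketch below is a swapped-in, simpler technique: model-based value iteration for a discrete-time scalar LQR, not the continuous-time, drift-free I-VI analyzed in the paper. The system and cost parameters are hypothetical.

```python
def riccati_value_iteration(a, b, q, r, iterations=30, p0=0.0):
    """Value iteration for x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2.

    The value function is p * x^2; each sweep applies the Riccati recursion
    and the returned list holds every iterate of p, starting from p0.
    """
    history = [p0]
    p = p0
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        history.append(p)
    return history


# Hypothetical parameters chosen so the fixed point is easy to check by hand.
history = riccati_value_iteration(a=1.0, b=1.0, q=1.0, r=1.0)
```

For these parameters the recursion simplifies to `p = 1 + p / (1 + p)`, whose positive fixed point solves `p^2 - p - 1 = 0`. Started from `p0 = 0`, the iterates increase toward that fixed point, which mirrors in a much simpler setting the kind of monotone convergence the abstract refers to.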
