We describe a mathematical decision model for identifying dynamic health policies for controlling epidemics. These dynamic policies aim to select the best current intervention based on accumulating epidemic data and the availability of resources at each decision point. We propose an algorithm to approximate dynamic policies that optimize the population's net health benefit, a performance measure that accounts for both health and monetary outcomes. We further illustrate how dynamic policies can be defined and optimized for the control of a novel viral pathogen, where a policy maker must decide (i) when to employ or lift a transmission-reducing intervention (e.g. school closure) and (ii) how to prioritize population members for vaccination when a limited quantity of vaccines first becomes available. Within the context of this application, we demonstrate that dynamic policies can produce higher net health benefit than the more commonly described static policies, which specify a pre-determined sequence of interventions to employ throughout an epidemic. Copyright (c) 2016 John Wiley & Sons, Ltd.
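The decision rule can be made concrete with a small sketch. Below is a minimal, hypothetical illustration (not the authors' model) of a dynamic policy that, at each decision point, picks the affordable intervention with the highest estimated net health benefit, computed as health gain minus cost divided by a willingness-to-pay threshold; all intervention names, effect sizes, and costs are invented for illustration.

```python
# Minimal sketch of a dynamic epidemic-control policy: at each decision
# epoch, choose the intervention maximizing estimated net health benefit,
# NHB = QALYs gained - cost / wtp, subject to remaining resources.
# The one-step outcome model and all numbers are hypothetical.

def net_health_benefit(qalys_gained, cost, wtp=50_000.0):
    """NHB in QALY units: health gain minus monetized cost."""
    return qalys_gained - cost / wtp

def choose_intervention(observed_prevalence, budget_left, interventions):
    """Greedy dynamic rule: re-optimize at each decision point using the
    epidemic data accumulated so far (summarized here by prevalence)."""
    best, best_nhb = None, float("-inf")
    for name, (effect, cost) in interventions.items():
        if cost > budget_left:
            continue  # respect resource availability at this decision point
        # Hypothetical outcome model: health gain scales with prevalence.
        nhb = net_health_benefit(effect * observed_prevalence, cost)
        if nhb > best_nhb:
            best, best_nhb = name, nhb
    return best

interventions = {
    "do_nothing": (0.0, 0.0),
    "school_closure": (50_000.0, 40e6),       # (QALYs per unit prevalence, cost)
    "vaccinate_high_risk": (100_000.0, 60e6),
}
# One decision epoch at 3% prevalence; picks vaccinate_high_risk here.
print(choose_intervention(0.03, 100e6, interventions))
```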
In many applications, decision making under uncertainty involves two steps: prediction of a certain quality parameter or indicator of the system under study, and the subsequent use of that prediction in choosing actions. The prediction process is severely challenged by highly dynamic environments that involve sequential decision making, such as air traffic control at airports, where congestion prediction is critical for smooth departure operations. The taxi-out time of a flight is an excellent indicator of surface congestion and is a quality parameter used in the assessment of airport delays. Regression, queueing, and moving-average models have been shown to perform poorly in predicting taxi-out times because they are slow to adapt to changing airport dynamics. This paper presents an approximate dynamic programming approach (reinforcement learning, RL) to taxi-out time prediction. The taxi-out prediction performance was tested on flight data obtained from the Federal Aviation Administration's (FAA) Aviation System Performance Metrics (ASPM) database for Detroit International (DTW), Washington Reagan National (DCA), Boston (BOS), New York John F. Kennedy (JFK), and Tampa International (TPA) airports. For example, at the Boston airport (presented in detail), the prediction accuracy of the RL model was 14% higher than that of the queueing model and 39% higher than that of a running-average model. In general, the RL model was 35-50% more accurate than the regression model for all of the above airports. Copyright (c) 2010 John Wiley & Sons, Ltd.
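A minimal sketch of the prediction idea follows, assuming a reinforcement-learning-style incremental estimator rather than the authors' exact RL formulation: a table of expected taxi-out minutes per congestion state is updated with a constant learning rate, which lets it track changing airport dynamics faster than a long running average. The state binning and data are invented for illustration and do not reflect the ASPM schema.

```python
# Sketch of an RL-style taxi-out-time predictor. The airport surface state
# is reduced to a congestion bin, and a value table of expected taxi-out
# minutes is updated by stochastic approximation with a constant step size.

from collections import defaultdict

class TaxiOutPredictor:
    def __init__(self, alpha=0.1, prior=15.0):
        self.alpha = alpha                       # constant step size -> adaptivity
        self.value = defaultdict(lambda: prior)  # expected taxi-out per state

    def _state(self, n_departures_on_surface):
        return min(n_departures_on_surface // 5, 8)  # coarse congestion bin

    def predict(self, n_departures_on_surface):
        return self.value[self._state(n_departures_on_surface)]

    def update(self, n_departures_on_surface, observed_taxi_out_min):
        s = self._state(n_departures_on_surface)
        err = observed_taxi_out_min - self.value[s]
        self.value[s] += self.alpha * err        # TD-style incremental update

predictor = TaxiOutPredictor()
for n_dep, taxi_out in [(12, 18.0), (14, 21.0), (30, 34.0), (13, 19.5)]:
    print(round(predictor.predict(n_dep), 1), "min predicted")
    predictor.update(n_dep, taxi_out)
```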
This paper uses approximate linear programming (ALP) to compute average-cost bounds for queueing network control problems. Like most approximate dynamic programming (ADP) methods, ALP approximates the differential cost by a linear form. New types of approximating functions are identified that offer greater accuracy than previous ALP studies or other performance-bound methods. The structure of the infinite constraint set is exploited to reduce it to a more manageable set. When needed, constraint sampling and truncation methods are also developed. Numerical experiments show that LPs using quadratic approximating functions can be solved easily on examples with up to 17 buffers. Using additional functions reduced the error to 1-5% at the cost of larger LPs; these ALPs were solved for systems with 6-11 buffers, depending on the functions used. The method computes bounds much faster than value iteration and also gives some insight into policies. The ALPs do not scale to very large problems, but they offer more accurate bounds than other methods and the simplicity of just solving an LP. (C) 2015 Elsevier Ltd. All rights reserved.
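The following sketch shows the ALP construction on a toy single-buffer controlled queue (far smaller than the paper's networks), assuming a quadratic approximating function h(x) = w1*x + w2*x^2: maximizing eta subject to eta + h(x) <= c(x,a) + E[h(next state) | x, a] for all state-action pairs yields a lower bound on the optimal average cost. All queue parameters are illustrative.

```python
# ALP average-cost lower bound for a uniformized single queue with two
# service rates. Variables are (eta, w1, w2); constraints are enumerated
# over a truncated state space, so no sampling is needed at this scale.

import numpy as np
from scipy.optimize import linprog

lam, N = 0.3, 60                                     # arrival rate, truncation
actions = {"slow": (0.35, 0.0), "fast": (0.6, 2.0)}  # (service rate, service cost)

def phi(x):                                          # quadratic basis
    return np.array([x, x * x], dtype=float)

A_ub, b_ub = [], []
for x in range(N + 1):
    for mu, service_cost in actions.values():
        up, down = min(x + 1, N), max(x - 1, 0)      # arrivals blocked at N
        exp_phi = lam * phi(up) + mu * phi(down) + (1 - lam - mu) * phi(x)
        # eta - (E[phi] - phi(x)) . w <= c(x, a), with c(x, a) = x + service cost
        A_ub.append(np.concatenate(([1.0], -(exp_phi - phi(x)))))
        b_ub.append(x + service_cost)

res = linprog(c=[-1.0, 0.0, 0.0],                    # maximize eta
              A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * 3)
print("average-cost lower bound:", -res.fun)
```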
ISBN (print): 9781509021550
This paper is concerned with optimal control problems of discrete-time nonlinear systems solved via a novel Q-learning algorithm. In the newly developed algorithm, the iterative Q function is updated in each iteration over the whole state and control spaces, instead of at a single state-control pair. A new convergence criterion for the corresponding Q-learning algorithm is presented, in which the traditional constraints on the learning rates of Q-learning algorithms are relaxed. Finally, simulation results are provided to demonstrate the performance of the developed algorithm.
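The distinguishing feature, updating the Q function over the whole state and control spaces in each iteration, can be sketched as a synchronous Q-value-iteration sweep over discretized grids; the toy nonlinear dynamics, grids, and cost below are illustrative, not the paper's examples.

```python
# Whole-space Q update: every sweep recomputes Q at all (state, control)
# grid points at once, rather than at a single visited pair.

import numpy as np

xs = np.linspace(-2.0, 2.0, 81)     # discretized state space
us = np.linspace(-1.0, 1.0, 21)     # discretized control space
gamma = 0.95

def f(x, u):                        # toy discrete-time nonlinear dynamics
    return 0.9 * np.sin(x) + u

# Precompute stage costs and nearest-grid next states for every (x, u) pair.
C = np.array([[x ** 2 + u ** 2 for u in us] for x in xs])
nxt = np.array([[np.abs(xs - f(x, u)).argmin() for u in us] for x in xs])

Q = np.zeros((xs.size, us.size))    # iterative Q function
for _ in range(500):
    V = Q.min(axis=1)               # V(x) = min_u Q(x, u)
    Q_new = C + gamma * V[nxt]      # one synchronous sweep over all (x, u)
    if np.max(np.abs(Q_new - Q)) < 1e-8:
        Q = Q_new
        break
    Q = Q_new

i0 = np.abs(xs).argmin()
print("greedy control at x = 0:", us[Q[i0].argmin()])
```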
ISBN (print): 9781467399418
In practice, optimal control problems of stochastic switching are notoriously challenging from a computational viewpoint, since typical real-world applications are high dimensional. In this paper, we suggest an algorithmic solution based on convexity assumptions that are frequently fulfilled in applications. Furthermore, we show how the quality of the numerical solution can be assessed. An efficient implementation of our algorithms is discussed.
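As a rough illustration of the kind of problem addressed, the sketch below runs plain backward induction for a two-regime switching problem (run or idle a facility) on a binomial price lattice; it omits the paper's convexity-exploiting algorithm and quality bounds, and all parameters are invented.

```python
# Backward induction for optimal switching between two regimes on a
# recombining binomial lattice: at each node the controller may keep or
# switch the regime, paying a switching cost.

import numpy as np

T, u, d, q = 50, 1.05, 0.95, 0.5        # steps, up/down factors, up probability
p0, running_cost, switch_cost = 10.0, 9.0, 1.5
regimes = (0, 1)                         # 0 = idle, 1 = running

def payoff(price, regime):
    return regime * (price - running_cost)   # profit only while running

# value[r][i]: continuation value in regime r at node i of the current layer
value = [np.zeros(T + 1) for _ in regimes]
for t in range(T - 1, -1, -1):
    prices = p0 * u ** np.arange(t + 1) * d ** (t - np.arange(t + 1))
    new = []
    for r in regimes:
        # expected next-layer value for each target regime rp
        cont = {rp: q * value[rp][1:t + 2] + (1 - q) * value[rp][0:t + 1]
                for rp in regimes}
        best = np.maximum(
            payoff(prices, 0) - switch_cost * (r != 0) + cont[0],
            payoff(prices, 1) - switch_cost * (r != 1) + cont[1])
        new.append(best)
    value = new
print("value at t = 0, idle regime:", value[0][0])
```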
ISBN (print): 9783319406633; 9783319406626
This paper is concerned with a discrete-time two-player zero-sum game for nonlinear systems, which is solved by a new iterative adaptive dynamic programming (ADP) method. In the present iterative ADP algorithm, two iteration procedures, an upper and a lower iteration, are implemented to obtain the upper and lower performance index functions, respectively. It is shown that, when initialized by an arbitrary positive semi-definite function, the iterative value functions converge to the optimal performance index function whenever the optimal performance index function of the two-player zero-sum game exists. Finally, simulation results are given to illustrate the performance of the developed method.
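The upper/lower iteration idea can be sketched on a finite toy Markov game (not the nonlinear systems treated in the paper): the upper iteration applies a min-max update, the lower iteration a max-min update, both from a nonnegative initial function, and a zero gap between their limits indicates that the game value exists. Dynamics and costs below are randomly generated for illustration.

```python
# Upper and lower value iterations for a finite zero-sum Markov game.

import numpy as np

nS, nU, nW, gamma = 5, 3, 3, 0.9
rng = np.random.default_rng(0)
cost = rng.uniform(0.0, 1.0, (nS, nU, nW))  # stage cost (u minimizes, w maximizes)
nxt = rng.integers(0, nS, (nS, nU, nW))     # deterministic transitions s' = nxt[s,u,w]

def iterate(V, order):
    Q = cost + gamma * V[nxt]               # Q[s, u, w]
    if order == "upper":
        return Q.max(axis=2).min(axis=1)    # upper: min_u max_w
    return Q.min(axis=1).max(axis=1)        # lower: max_w min_u

V_up = np.ones(nS)                          # arbitrary nonnegative initialization
V_lo = np.ones(nS)
for _ in range(500):
    V_up = iterate(V_up, "upper")
    V_lo = iterate(V_lo, "lower")
print("upper value:", V_up.round(3))
print("lower value:", V_lo.round(3))
print("gap:", float(np.max(V_up - V_lo)))   # zero gap => the game value exists
```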
An adaptive optimal control algorithm for systems with uncertain dynamics is formulated under a reinforcement learning framework. An embedded exploratory component is included explicitly in the objective function of an output-feedback receding-horizon Model Predictive Control problem. The optimization is formulated as a Quadratically Constrained Quadratic Program and solved to epsilon-global optimality. The iterative interaction between the action specified by the optimal solution and the approximation of the cost functions balances the exploitation of current knowledge against the need for exploration. The proposed method is shown to converge to the optimal policy for a controllable discrete-time linear plant with unknown output parameters. (C) 2016, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
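A heavily simplified sketch of the adaptive receding-horizon idea follows, assuming certainty-equivalence control in place of the paper's QCQP with an embedded exploration term: the unknown output parameters are estimated online by recursive least squares, a finite-horizon LQ problem is re-solved at each step, and a small dither signal crudely imitates exploration. All system matrices are invented.

```python
# Certainty-equivalence adaptive receding-horizon control with RLS
# estimation of an unknown output row vector c_true.

import numpy as np

A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
c_true = np.array([1.0, 0.5])                  # unknown output parameters

def lq_gain(A, B, Q, R, horizon=20):
    """Finite-horizon Riccati recursion; returns the first-step gain."""
    P = Q
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

rng = np.random.default_rng(1)
c_hat, P_rls = np.zeros(2), 100.0 * np.eye(2)  # RLS estimate and covariance
x = np.array([2.0, -1.0])
for t in range(100):
    # Certainty-equivalence output cost: Q = c_hat c_hat^T (+ regularizer)
    Q = np.outer(c_hat, c_hat) + 1e-3 * np.eye(2)
    K = lq_gain(A, B, Q, R=np.eye(1))
    u = (-K @ x) + 0.05 * rng.standard_normal(1)   # dither = crude exploration
    y = c_true @ x + 0.01 * rng.standard_normal()  # noisy measured output
    # RLS update of the output parameters from the pair (x, y)
    g = P_rls @ x / (1.0 + x @ P_rls @ x)
    c_hat = c_hat + g * (y - c_hat @ x)
    P_rls = P_rls - np.outer(g, x @ P_rls)
    x = A @ x + B @ u
print("estimated output params:", c_hat.round(3), "true:", c_true)
```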
ISBN (print): 9781467391276
This paper presents an approach for recasting Markov Decision Process (MDP) problems as heuristics-based planning problems. The basic idea is a temporal decomposition of the state space based on a subset of the state space referred to as the termination sample space. Specifically, the recasting of MDP problems is done in three steps. The first step is to define a state-space adaptation criterion based on the termination sample space. The second step is to define an action-selection heuristic from each state. The third and final step is to define a recursion, or backtracking, methodology to avoid dead ends and infinite loops. All three steps are described and discussed. A case study involving fault detection and alarm generation for the reaction wheels of a satellite mission is presented, and the proposed approach is compared with existing approaches for recasting MDP problems using this case study. The computational reduction achieved by the proposed approach is evident from the results.
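The three recasting steps can be sketched on a toy deterministic grid problem: a termination set defining when search stops, a heuristic ranking actions from each state, and backtracking with a visited set to escape dead ends and avoid infinite loops. The grid, goal, and heuristic below are illustrative, not the satellite case study.

```python
# Heuristic planning with backtracking on a small grid.

GOAL = (4, 4)                                   # step (i): termination sample space
WALLS = {(1, 1), (2, 1), (3, 1), (3, 2), (3, 3)}
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def heuristic(state):                           # step (ii): action ranking
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])

def solve(start):
    visited, stack = {start}, [(start, [start])]
    while stack:                                # step (iii): backtracking search
        state, path = stack.pop()
        if state == GOAL:                       # termination test
            return path
        succs = [(state[0] + dx, state[1] + dy) for dx, dy in ACTIONS]
        succs = [s for s in succs
                 if s not in visited and s not in WALLS
                 and 0 <= s[0] <= 4 and 0 <= s[1] <= 4]
        for s in sorted(succs, key=heuristic, reverse=True):
            visited.add(s)                      # loop avoidance
            stack.append((s, path + [s]))       # best heuristic is popped first
    return None                                 # exhausted: no path exists

print(solve((0, 0)))
```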
In order to accept future high-yield booking requests, airlines protect seats from low-yield passengers. More seats may be reserved when passengers faced with closed fare classes can upsell to open higher fare classes. We address the airline revenue management problem with capacity nesting and customer upsell, and formulate it as a stochastic optimization model that determines a set of static protection levels for each itinerary. We apply an approximate dynamic programming framework to approximate the objective function by piecewise linear functions, whose slopes (marginal revenues) are iteratively updated and returned by an efficient heuristic that simultaneously handles both nesting and upsell. The resulting allocation policy is tested on a real airline network and benchmarked against the randomized linear programming bid-price policy under various demand settings. Simulation results suggest that the proposed allocation policy significantly outperforms the benchmark when incremental demand or the upsell probability is high. Structural analyses are also provided for special demand-dependence cases.
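A minimal sketch of the piecewise-linear ADP idea follows, on a single-leg, two-fare problem far simpler than the paper's network with nesting and upsell: the value of remaining capacity is approximated by per-seat marginal-revenue slopes, and each iteration simulates a booking horizon under the current slopes and smooths them toward a noisy observed marginal revenue. All demand numbers are invented.

```python
# Iterative slope updates for a piecewise-linear approximation of the
# value of remaining capacity; v[s] ~ marginal revenue of the s-th seat.

import numpy as np

C, fares = 50, (100.0, 250.0)                  # capacity, (low, high) fares
rng = np.random.default_rng(2)
v = np.zeros(C + 1)

def simulate_revenue(cap, slopes):
    """One booking horizon: low-fare demand arrives before high-fare."""
    low, high = rng.poisson(40), rng.poisson(25)
    s, revenue = cap, 0.0
    for _ in range(low):                       # accept a low fare only if it
        if s > 0 and fares[0] >= slopes[s]:    # beats the seat's marginal value
            s, revenue = s - 1, revenue + fares[0]
    return revenue + min(high, s) * fares[1]

for it in range(1, 501):                       # iterative slope updates
    alpha = 1.0 / it
    for s in range(1, C + 1):
        # Noisy one-sample estimate of the marginal revenue of seat s,
        # smoothed into the current slope.
        delta = simulate_revenue(s, v) - simulate_revenue(s - 1, v)
        v[s] = (1 - alpha) * v[s] + alpha * delta

protection = int(np.sum(v > fares[0]))         # seats reserved for high fare
print("high-fare protection level ~", protection)
```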