Search results - Inner Mongolia University Library

Dynamic surgery management under uncertainty

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2023, Vol. 309, Issue 2, pp. 832-844

Authors: Gokalp, E.; Gulpinar, N.; Doan, V. X. Affiliations: Univ Bath, Sch Management, Bath BA2 7AY, England; Univ Warwick, Warwick Business Sch, Coventry CV4 7AL, England

Real-time surgery management involves a complex and dynamic decision-making process. The duration of surgeries in many cases cannot be known until the surgery has actually been completed. Furthermore, disruptions such as equipment failure or the arrival of a non-elective surgery can occur simultaneously. Thus, the assignment of surgeries needs to be updated, as and when disruptions occur, to minimize their effects. In this paper, we present a stochastic dynamic programming approach to the surgery allocation problem with multiple operating rooms under uncertainty. Given an elective list for the day, the dynamic optimization model minimizes the number of surgeries not carried out by the end of the shift and the total waiting times of patients during the day weighted according to their urgency level. Due to the curse of dimensionality, we apply an approximate dynamic programming algorithm to solve the stochastic dynamic surgery management model. Computational experiments are designed to demonstrate the performance of the proposed algorithm and its applicability to practical settings. The results show that the approximate dynamic programming algorithm provides a good approximation to the optimum policy and leads to some managerial insights. (c) 2023 Elsevier B.V. All rights reserved.

Keywords: Reactive scheduling; Uncertainty modelling; Surgery management; Approximate dynamic programming

GOPS: A general optimal control problem solver for autonomous driving and industrial control applications

Communications in Transportation Research, 2023, Vol. 3, Issue 1, pp. 92-106

Authors: Wenxuan Wang; Yuhang Zhang; Jiaxin Gao; Yuxuan Jiang; Yujie Yang; Zhilong Zheng; Wenjun Zou; Jie Li; Congsheng Zhang; Wenhan Cao; Genjin Xie; Jingliang Duan; Shengbo Eben Li. Affiliations: School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China

Solving optimal control problems serves as the basic demand of industrial control ***. *** methods like model predictive control often suffer from heavy online computational ***. Reinforcement learning has shown promise in computer and board games but has yet to be widely adopted in industrial applications due to a lack of accessible, high-accuracy ***. *** Reinforcement learning (RL) solvers are often developed for academic research and require a significant amount of theoretical knowledge and programming ***. ***, many of them only support Python-based environments and limit to model-free ***. To address this gap, this paper develops General Optimal control Problems Solver (GOPS), an easy-to-use RL solver package that aims to build real-time and high-performance controllers in industrial ***. GOPS is built with a highly modular structure that retains a flexible framework for secondary ***. *** the diversity of industrial control tasks, GOPS also includes a conversion tool that allows for the use of Matlab/Simulink to support environment construction, controller design, and performance ***. To handle large-scale problems, GOPS can automatically create various serial and parallel trainers by flexibly combining embedded buffers and ***. *** offers a variety of common approximate functions for policy and value functions, including polynomial, multilayer perceptron, convolutional neural network, ***. ***, constrained and robust algorithms for special industrial control systems with state constraints and model uncertainties are also integrated into ***. *** examples, including linear quadratic control, inverted double pendulum, vehicle tracking, humanoid robot, obstacle avoidance, and active suspension control, are tested to verify the performances of GOPS.

Keywords: Industrial control; Reinforcement learning; Approximate dynamic programming; Optimal control; Neural network; Benchmark

Capacity planning in a decentralized autologous cell therapy manufacturing network for low-cost resilience

FLEXIBLE SERVICES AND MANUFACTURING JOURNAL, 2023, Vol. 35, Issue 2, pp. 295-319

Authors: Li, Junxuan; White, Chelsea C. Affiliations: Microsoft Corp, Redmond, WA 98052, USA; Georgia Inst Technol, Atlanta, GA 30332, USA

The goals for increased patient access and fast fulfillment have motivated considerable interest in autologous cell therapy manufacturing networks having multiple and geographically distributed manufacturing facilities. However, the cost of safety manufacturing capacity to mitigate supplier disruption risk, a significant risk in the emerging cell manufacturing industry, would be lower if manufacturing is centralized. In this paper, we analyze a decentralized network that has as its objective to minimize the cost of network resilience for mitigating supplier disruption by making use of the fact that bioreactors for autologous therapy manufacturing are small enough to be relocatable. We model this problem as a Markov decision process and develop efficient algorithms that are based on real-time demand data to minimize safety manufacturing capacity and determine how relocatable capacity should be distributed while satisfying resilience constraints. In case studies, based in part on data collected from a Chimeric antigen receptor T cell therapy manufacturing facility at the University of Pennsylvania, we compare decentralized network models with different heuristic algorithms. Results indicate that transshipment in a decentralized network can result in a significant reduction of required safety capacity, reducing the cost of network resilience.

Keywords: Autologous cell therapy; Dynamic resilience; Decentralized manufacturing network; Capacity planning; Approximate dynamic programming

The Exponomial Choice Model for Assortment Optimization: An Alternative to the MNL Model?

MANAGEMENT SCIENCE, 2023, Vol. 69, Issue 5, pp. 2814-2832

Authors: Aouad, Ali; Feldman, Jacob; Segev, Danny. Affiliations: London Business Sch, London NW1 4SA, England; Washington Univ, Olin Business Sch, St Louis, MO 63130, USA; Tel Aviv Univ, Dept Stat & Operat Res, IL-69978 Tel Aviv, Israel

In this paper, we consider the yet-uncharted assortment optimization problem under the exponomial choice model, in which the objective is to determine the revenue-maximizing set of products that should be offered to customers. Our main algorithmic contribution comes in the form of a fully polynomial-time approximation scheme, showing that the optimal expected revenue can be efficiently approached within any degree of accuracy. We synthesize several ideas related to approximate dynamic programming, intended to construct a compact discretization of the continuous state space by keeping track of "key statistics" in rounded form and by operating with a suitable bit precision complexity. We complement this result by a number of NP-hardness reductions to natural extensions of this problem. Moreover, we conduct empirical and computational evaluations of the exponomial choice model and our solution method. Focusing on choice models with a simple parametric structure, we provide new empirical evidence that the exponomial choice model can achieve higher predictive accuracy than the multinomial logit (MNL) choice model on several real-world data sets. We uncover that this predictive performance correlates with certain characteristics of the choice instance, namely, the entropy and magnitude of choice probabilities. Finally, we leverage fully ranked preference data to simulate the expected revenue of optimal assortments prescribed using the fitted exponomial and MNL models. On semisynthetic data, the exponomial-based approach can lift revenues by 3%-4% on average against the corresponding MNL benchmark.

Keywords: assortment optimization; FPTAS; approximate dynamic programming; case study

Approximation Benefits of Policy Gradient Methods with Aggregated States

MANAGEMENT SCIENCE, 2023, Vol. 69, Issue 11, pp. 6898-6911

Authors: Russo, Daniel. Affiliations: Columbia Univ, Grad Sch Business, New York, NY 10027, USA

Folklore suggests that policy gradient can be more robust to misspecification than its relative, approximate policy iteration. This paper studies the case of state-aggregated representations, in which the state space is partitioned and either the policy or value function approximation is held constant over partitions. This paper shows a policy gradient method converges to a policy whose regret per period is bounded by epsilon, the largest difference between two elements of the state-action value function belonging to a common partition. With the same representation, both approximate policy iteration and approximate value iteration can produce policies whose per-period regret scales as epsilon/(1-gamma), where gamma is a discount factor. Faced with inherent approximation error, methods that locally optimize the true decision objective can be far more robust.

Keywords: reinforcement learning; approximate dynamic programming; policy gradient methods; state aggregation

Optimal Tracking Control for Ship Course Using Approximate Dynamic Programming Method

32nd Chinese Control Conference

Authors: XIE Qingqing; LUO Bin; TAN Fuxiao. Affiliations: School of Computer Science and Technology, Anhui University; Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Anhui University; School of Computer and Information, Fuyang Teachers College

ISBN: (print) 9781479900305

Dynamic programming (DP) is not a useful tool for solving many control problems because of its complexity in computation. In this paper, we propose approximate dynamic programming (ADP) optimal control strategy for ship course trajectory tracking control ***. *** system transformation, we convert the optimal tracking problem into designing an infinite-horizon optimal regulator for the tracking error ***. Action-dependent Heuristic dynamic programming (ADHDP) technique, as one form of ADP, is presented to obtain the infinite-horizon optimal tracking ***. *** the ship course optimal tracking control simulation results, we can see that the ADHDP controller makes the performance index and the control sequence for the error dynamics converge to the optimal ***. *** BP neural networks are used as parametric structures to implement ADHDP ***. *** two neural networks aim at approximating the cost function and the control law, respectively.

Keywords: Optimal Tracking Control; Approximate Dynamic Programming; ADHDP; Ship Course

Reinforcement learning for healthcare operations management: methodological framework, recent developments, and future research directions

HEALTH CARE MANAGEMENT SCIENCE, 2025, pp. 1-36

Authors: Wu, Qihao; Han, Jiangxue; Yan, Yimo; Kuo, Yong-Hong; Shen, Zuo-Jun Max. Affiliations: Univ Hong Kong, Dept Data & Syst Engn, Hong Kong, Peoples R China; Univ Hong Kong, Fac Engn, Hong Kong, Peoples R China; Univ Hong Kong, Business Sch, Hong Kong, Peoples R China; Univ Calif Berkeley, Dept Ind Engn & Operat Res, Berkeley, CA, USA

With the advancement in computing power and data science techniques, reinforcement learning (RL) has emerged as a powerful tool for decision-making problems in complex systems. In recent years, the research on RL for healthcare operations has grown rapidly. Especially during the COVID-19 pandemic, RL has played a critical role in optimizing decisions with greater degrees of uncertainty. RL for healthcare applications has been an exciting topic across multiple disciplines, including operations research, operations management, healthcare systems engineering, and data science. This review paper first provides a tutorial on the overall framework of RL, including its key components, training models, and approximators. Then, we present the recent advances of RL in the domain of healthcare operations management (HOM) and analyze the current trends. Our paper concludes by presenting existing challenges and future directions for RL in HOM.

Keywords: Reinforcement learning; Healthcare operations; Healthcare services delivery; Markov decision process; Approximate dynamic programming; Neural networks

Reinforcement learning versus data-driven dynamic programming: a comparison for finite horizon dynamic pricing markets

JOURNAL OF REVENUE AND PRICING MANAGEMENT, 2025, pp. 1-17

Authors: Lange, Fabian; Dreessen, Leonard; Schlosser, Rainer. Affiliations: Univ Potsdam, Hasso Plattner Inst, Potsdam, Germany

Revenue management (RM) plays a vital role to optimize sales processes in real-life applications under incomplete information. The prediction of consumer demand and the anticipation of price reactions of competitors became key factors in RM to be able to apply classical dynamic programming (DP) methods for expected long-term reward maximization. Modern model-free deep Reinforcement Learning (RL) approaches are able to derive optimized policies without explicit estimations of underlying model dynamics. However, RL algorithms typically require either vast amounts of training data or a suitable synthetic model to be trained on. As existing studies focus on one group of algorithms only, the relation between established DP approaches and new RL techniques is opaque. To address this issue, in this paper, we use a dynamic pricing framework for an airline ticket market to compare state-of-the-art RL algorithms and data-driven versions of classic DP methods regarding (i) performance and (ii) required data to each other. For the DP techniques, we use estimations of market dynamics to be able to compare their performance and data consumption against RL methods. The numerical results of our experiments, which include monopoly as well as duopoly markets, allow to study how the different approaches' performances relate to each other in exemplary settings. In both setups, we find that with few data (about 10 episodes) fitted DP methods were highly competitive; with medium amounts of data (about 100 episodes) DP methods got outperformed by RL, where PPO provided the best results. Given large amounts of training data (about 1000 episodes), the best RL algorithms, i.e., TD3, DDPG, PPO, and SAC, performed similarly achieving about 90% and more of the optimal solution.

Keywords: Dynamic pricing; Decision support; Method comparison; Approximate dynamic programming; Reinforcement learning

Microgrid Energy Management based on Approximate Dynamic Programming

4th IEEE/PES Innovative Smart Grid Technologies Europe (ISGT EUROPE)

Authors: Strelec, Martin; Berka, Jan. Affiliations: Honeywell Spol Sro, Honeywell Prague Lab, Prague, Czech Republic

ISBN: (print) 9781479929849

Microgrid energy management stands for challenging optimization problem where continuous (economic dispatch) and discrete optimization (unit commitment) tasks are solved. Often Microgrid optimization leads to complex problem where optimization methods usually meet curse of dimensionality. We adopt approximate dynamic programming (ADP) as the promising optimization technique which can overcome curse of dimensionality. In this paper, energy management system based on ADP is introduced and its behavior is demonstrated on small scale Microgrid which is connected to distribution network and includes wind turbine, chiller plant, thermal storage and cooling load. The paper describes policy search approach to ADP and selected approximation architectures in the context of energy optimization. The ADP results are compared with the results of the solution based on dynamic programming approach.

Keywords: Microgrid energy management; approximate dynamic programming

Approximate Dynamic Programming Based Controller Design Using an Improved Learning Algorithm with Application to Tracking Control of Aircraft

2013 Chinese Intelligent Automation Conference

Authors: Xiong Luo; Yuchao Zhou; Zengqi Sun. Affiliations: School of Computer and Communication Engineering, University of Science and Technology Beijing (USTB); Beijing Key Laboratory of Knowledge Engineering for Materials Science; Key Laboratory of Knowledge Engineering for Materials Science; Department of Computer Science and Technology, Tsinghua University

The strategy using approximate/adaptive dynamic programming (ADP) has been widely used to design a learning controller for complex systems of higher dimension in recent ***. *** paper aims at handling an important problem in the design of ADP learning controllers, which is the improvement of learning algorithm for its convergence ***. *** analyze ADP controller implementation framework according to the requirement of tracking control task, with emphasis on providing an improved weight-updating gradient descent approach in optimizing connection weights in network structures. A comparison of the proposed method and classic ADP design for tracking and controlling pitch angle of aircraft is ***. *** verifies the feasibility in the design of the proposed ADP based controller.

Keywords: Approximate dynamic programming; Controller; Learning algorithm; Aircraft control
