Search Results - Inner Mongolia University Library

Stochastic dynamic programming solution to transmission scheduling: Multi sensor-multi process with wireless noisy channel

COMPUTERS & ELECTRICAL ENGINEERING, 2023, Vol. 106, No. 1

Authors: Forootani, Ali; Forootani, Sara; Iervolino, Raffaele; Zarch, Majid Ghaniee. Affiliations: Maynooth Univ, Hamilton Inst, Co Kildare W23 F2K8, Ireland; Univ Sannio, Dept Engn, I-82100 Benevento, Italy; Univ Naples Federico II, Dept Elect Engn & Informat Technol, I-80125 Naples, Italy; Bu Ali Sina Univ, Dept Elect Engn, Hamadan, Iran.

We investigate sensor scheduling for remote estimation when multiple smart sensors monitor multiple stochastic dynamical systems. The sensors transmit their measurements to a remote estimator through a noisy wireless communication channel. The remote estimator can receive multiple packets sent simultaneously by local sensors. Sensors transmit their measurements if their Signal-to-Interference-plus-Noise Ratio (SINR) is above a threshold. We compute the optimal policy for sensor scheduling by minimizing the expected error covariance subject to a constraint on total signal transmissions from all sensors. We model this problem as a Markov Decision Process (MDP) with discounted cost per stage in the finite time horizon framework, and then employ stochastic dynamic programming as the optimization method. A novel algorithm based on sampling and machine learning techniques is proposed as the approximation. At each phase of the DP algorithm, samples are collected using a uniform probability distribution. The data are used to train Neural Network (NN) and Random Forest (RF) models for cost function and policy approximation. The results of the proposed framework are supported by simulation examples comparing RF and NN as approximate DP (ADP) methods. This idea builds a bridge between recent advances in data science, machine learning, and ADP.

Keywords: Kalman filter; Sensor scheduling; Markov Decision Process; Approximate dynamic programming; Random Forest; Neural Network

Mechanism Design for Stochastic Dynamic Parking Resource Allocation

PRODUCTION AND OPERATIONS MANAGEMENT, 2021, Vol. 30, No. 10, pp. 3615-3634

Authors: Yang, Jie; He, Fang; Lin, Xi; Shen, Max Zuo-Jun. Affiliations: Tsinghua Univ, Dept Ind Engn, Beijing 100084, Peoples R China; Tsinghua Univ, Dept Civil Engn, Beijing 100084, Peoples R China; Univ Calif Berkeley, Coll Engn, Berkeley, CA 94704, USA; Univ Hong Kong, Fac Engn, Hong Kong 999077, Peoples R China; Univ Hong Kong, Fac Business & Econ, Hong Kong 999077, Peoples R China.

In this paper, we study a parking management problem where an operator manages a publicly owned parking service system with unknown parking demand. Assuming that the operator has perfect information, we first formulate the operator's problem as a stochastic dynamic programming problem, and to overcome the curse of dimensionality, we resort to approximate dynamic programming for solving it. However, in practice, some information that is essential for centralized management is usually privately known, which provides incentives for strategic behaviors of drivers and could lead to suboptimal system performance. We design a two-step mechanism and prove that, in step 1, drivers' choices of whether or not to enter the managed system following the approximate optimal solution satisfy Bayesian-Nash equilibrium (BNE), and in step 2, that truthful reporting is a dominant strategy for all drivers under any circumstance. We investigate the properties of the resulting equilibria, and further modify the mechanism to ensure that the desired approximate system optimum solution is the only resulting BNE. Numerical examples show that the mechanism design not only enhances the average system performance but also increases the system robustness.

Keywords: stochastic dynamic parking resource allocation; mechanism design; approximate dynamic programming; strategic behaviors; equilibrium analysis

Quasi-Random Sampling for Approximate Dynamic Programming

International Joint Conference on Neural Networks

Authors: Cristiano Cervellera; Mauro Gaggero; Danilo Macciò; Roberto Marcialis. Affiliation: Institute of Intelligent Systems for Automation, National Research Council, Via De Marini 6, 16149 Genova, Italy.

ISBN: (print) 9781467361279

This paper analyzes quasi-random sampling techniques for approximate dynamic programming. Specifically, low-discrepancy sequences and lattice point sets are investigated and compared as efficient schemes for uniform sampling of the state space in high-dimensional settings. A convergence analysis of the approximate solution is provided, based on geometric properties of the two discretization methods. It is also shown that such schemes are able to take advantage of regularities of the value functions, possibly through suitable transformations of the state vector. Simulation results concerning optimal management of a water reservoir system and inventory control are presented to show the effectiveness of the considered techniques with respect to purely random sampling.

Keywords: approximate dynamic programming; uniform sampling; state space; dynamic programming; Quasi-Random; Inventory control

Approximated multi-agent fitted Q iteration

SYSTEMS & CONTROL LETTERS, 2023, Vol. 177, No. 1

Authors: Lesage-Landry, Antoine; Callaway, Duncan S. Affiliations: Polytech Montreal, Dept Elect Engn, Mila & GERAD, 2500 Polytech Rd, Montreal, PQ H3T 1J4, Canada; Univ Calif Berkeley, Energy & Resources Grp, 337 Giannini Hall, Berkeley, CA 94720, USA.

We formulate an efficient approximation for multi-agent batch reinforcement learning, the approximated multi-agent fitted Q iteration (AMAFQI). We present a detailed derivation of our approach. We propose an iterative policy search and show that it yields a greedy policy with respect to multiple approximations of the centralized, learned Q-function. In each iteration and policy evaluation, AMAFQI requires a number of computations that scales linearly with the number of agents, whereas the analogous number of computations increases exponentially for the fitted Q iteration (FQI), a commonly used approach in batch reinforcement learning. This property of AMAFQI is fundamental for the design of a tractable multi-agent approach. We evaluate the performance of AMAFQI and compare it to FQI in numerical simulations. The simulations illustrate the significant computation time reduction when using AMAFQI instead of FQI in multi-agent problems and corroborate the similar performance of both approaches. © 2023 Elsevier B.V. All rights reserved.

Keywords: approximate dynamic programming; Batch reinforcement learning; Markov decision process; Multi-agent reinforcement learning

Overbooking with Bounded Loss

MATHEMATICS OF OPERATIONS RESEARCH, 2023, Vol. 48, No. 3, pp. 1344-1363

Authors: Freund, Daniel; Zhao, Jiayu (Kamessi). Affiliations: MIT, Sloan Sch Management, Cambridge, MA 02142, USA; MIT, Operat Res Ctr, Cambridge, MA 02142, USA.

We study a classic problem in revenue management: quantity-based, single resource revenue management with no-shows. In this problem, a firm observes a sequence of T customers requesting a service. Each arrival is drawn independently from a known distribution of k different types, and the firm needs to decide irrevocably whether to accept or reject requests in an online fashion. The firm has a capacity of resources B and wants to maximize its profit. Each accepted service request yields a type-dependent revenue and has a type-dependent probability of requiring a resource once all arrivals have occurred (or be a no-show). If the number of accepted arrivals that require a resource at the end of the horizon is greater than B, the firm needs to pay a fixed compensation for each service request that it cannot fulfill. With a clairvoyant that knows all arrivals ahead of time, as a benchmark, we provide an algorithm with a uniform additive loss bound, that is, its expected loss is independent of T. This improves upon prior works achieving Ω(√T) guarantees.

Keywords: online stochastic decision making; approximate dynamic programming; revenue management; overbooking; capacity management

Opportunities for reinforcement learning in stochastic dynamic vehicle routing

COMPUTERS & OPERATIONS RESEARCH, 2023, Vol. 150

Authors: Hildebrandt, Florentin D.; Thomas, Barrett W.; Ulmer, Marlin W. Affiliations: Otto von Guericke Univ, Dept Management Sci, Magdeburg, Germany; Univ Iowa, Dept Business Analyt, Iowa City, IA, USA.

There has been a paradigm shift in urban logistics services in recent years; demand for real-time, instant mobility and delivery services is growing. This poses new challenges to logistics service providers, as the underlying stochastic dynamic vehicle routing problems (SDVRPs) require anticipatory real-time routing actions. The complexity of finding efficient routing actions is multiplied by the challenge of evaluating such actions with respect to their effectiveness given future dynamism and uncertainty. Reinforcement learning (RL) is a promising tool for evaluating actions, but it is not designed for searching the complex and combinatorial action space. Thus, past work on RL for SDVRPs has either restricted the action space, that is, solved only subproblems by RL and everything else by established heuristics, or focused on problems that reduce to resource allocation problems. For solving real-world SDVRPs, new strategies are required that address the combined challenge of a combinatorial, constrained action space and future uncertainty, but as our findings suggest, such strategies are essentially non-existent. Our survey paper shows that past work either relied on action-space restriction or avoided routing actions entirely, and it highlights opportunities for more holistic solutions.

Keywords: Stochastic dynamic vehicle routing; Reinforcement learning; approximate dynamic programming; Mixed integer programming; Combinatorial optimization; Survey

Economic Dispatch of an Integrated Microgrid Based on the Dynamic Process of CCGT Plant

33rd Chinese Control and Decision Conference (CCDC)

Authors: Lin, Zhiyi; Song, Chunyue; Zhao, Jun; Yang, Chao; Yin, Huan. Affiliation: Zhejiang Univ, Coll Control Sci & Engn, Hangzhou 310027, Peoples R China.

ISBN: (print) 9781665440899

Intra-day economic dispatch of an integrated microgrid is a fundamental requirement for integrating distributed generators. The dynamic energy flows in cogeneration units present challenges to the energy management of the microgrid. In this paper, a novel approximate dynamic programming (ADP) approach based on value function approximation (VFA) is proposed to solve this problem; it is distinguished by its consideration of the dynamic process constraints of the combined-cycle gas turbine (CCGT) plant. First, we mathematically formulate the multi-period decision problem as a finite-horizon Markov decision process. To deal with the thermodynamic process, an augmented state vector of the CCGT is introduced. Second, the proposed VFA-ADP algorithm is employed to derive near-optimal real-time operation strategies. In addition, to guarantee the monotonicity of the piecewise linear function, we apply the SPAR algorithm in the update process. To validate the effectiveness of the proposed method, we conduct experiments with comparisons to some traditional optimization methods. The results indicate that our proposed ADP method achieves better performance on the economic dispatch of the microgrid.

Keywords: Microgrid; Dynamic Process; Combined-Cycle Gas Turbine; approximate dynamic programming

Enhancing Safety and Multifaceted Preferences to Optimise Cycling Routes for Cyclist-Centric Urban Mobility

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, Vol. 14, No. 12, pp. 982-987

Authors: Alatiyyah, Mohammed. Affiliation: Prince Sattam bin Abdulaziz Univ, Coll Sci & Humanities Aflaj, Dept Comp Sci, Al Kharj, Saudi Arabia.

To optimise bicycle routes across multiple parameters, including safety, efficiency and subtle rider preferences, this work explores the Bike Routing Problem (BRP) using a Simulated Annealing approach. In this structure, a wide range of constraints and preferences are combined and carefully calibrated to create routes that meet the varied and changing needs of cyclists. Extensive testing on a dataset representing a range of rider preferences demonstrates the effectiveness of this approach, resulting in significant improvements in route selection. This research is a significant resource for urban planners and politicians. Its data-driven solutions and strategic recommendations will help them strengthen bicycle infrastructure, even beyond its immediate applicability in resolving the BRP.

Keywords: Bike routing; dynamic vehicle routing; inventory routing; approximate dynamic programming

Dynamic multistage scheduling for patient-centered care plans

HEALTH CARE MANAGEMENT SCIENCE, 2021, Vol. 24, No. 4, pp. 827-844

Authors: Diamant, Adam. Affiliation: York Univ, Schulich Sch Business, 111 Ian Macdonald Blvd, Toronto, ON M3J 1P3, Canada.

We investigate the scheduling practices of multistage outpatient health programs that offer care plans customized to the needs of their patients. We formulate the scheduling problem as a Markov decision process (MDP) where patients can reschedule their appointment, may fail to show up, and may become ineligible. The MDP has an exponentially large state space, and thus we introduce a linear approximation to the value function. We then formulate an approximate dynamic program (ADP) and implement a dual variable aggregation procedure. This reduces the size of the ADP while still producing dual cost estimates that can be used to identify favorable scheduling actions. We use our scheduling model to study the effectiveness of customized-care plans for a heterogeneous patient population and find that system performance is better than that of clinics that do not offer such plans. We also demonstrate that our scheduling approach improves clinic profitability, increases throughput, and decreases practitioner idleness as compared to a policy that mimics human schedulers and a policy derived from a deep neural network. Finally, we show that our approach is fairly robust to errors introduced when practitioners inadvertently assign patients to the wrong care plan.

Keywords: Healthcare; Appointment scheduling; Multiple treatment stages; Customized care plans; approximate dynamic programming; Dual variable aggregation; Operations research

Abstraction-based branch and bound approach to Q-learning for hybrid optimal control

3rd Annual Conference on Learning for Dynamics and Control (L4DC)

Authors: Legat, Benoit; Jungers, Raphael M.; Bouchat, Jean. Affiliations: UCLouvain, ICTEAM, 4 Ave G Lemaitre, B-1348 Louvain La Neuve, Belgium; UCLouvain, ELI, 2 Croix Sud, B-1348 Louvain La Neuve, Belgium.

In this paper, we design a theoretical framework that allows model predictive control to be applied to hybrid systems. To this end, we develop a theory of approximate dynamic programming by leveraging the concept of alternating simulation. We show how to combine these notions in a branch and bound algorithm that can further refine the Q-functions using Lagrangian duality. We illustrate the approach on a numerical example.

Keywords: Hybrid systems; reinforcement learning; approximate dynamic programming; branch and bound
