Search Results - Inner Mongolia University Library

Strategic Capacity Decision-Making in a Stochastic Manufacturing Environment Using Real-Time Approximate Dynamic Programming

NAVAL RESEARCH LOGISTICS, 2010, Vol. 57, No. 3, pp. 211-224

Authors: Pratikakis, Nikolaos E.; Realff, Matthew J.; Lee, Jay H. (Georgia Inst Technol, Sch Chem & Biomol Engn, Atlanta, GA 30332, USA)

In this study, we illustrate a real-time approximate dynamic programming (RTADP) method for solving multistage capacity decision problems in a stochastic manufacturing environment, by using an exemplary three-stage manufacturing system with recycle. The system is a moderate size queuing network, which experiences stochastic variations in demand and product yield. The dynamic capacity decision problem is formulated as a Markov decision process (MDP). The proposed RTADP method starts with a set of heuristics and learns a superior quality solution by interacting with the stochastic system via simulation. The curse-of-dimensionality associated with DP methods is alleviated by the adoption of several notions including "evolving set of relevant states," for which the value function table is built and updated, "adaptive action set" for keeping track of attractive action candidates, and "nonparametric k nearest neighbor averager" for value function approximation. The performance of the learned solution is evaluated against (1) an "ideal" solution derived using a mixed integer programming (MIP) formulation, which assumes full knowledge of future realized values of the stochastic variables, (2) a myopic heuristic solution, and (3) a sample path based rolling horizon MIP solution. The policy learned through the RTADP method turned out to be superior to the policies of (2) and (3). (C) 2010 Wiley Periodicals, Inc. Naval Research Logistics 57: 211-224, 2010

Keywords: queueing networks; approximate dynamic programming; real-time dynamic programming; capacity planning
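
The "nonparametric k nearest neighbor averager" named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the state encoding, the distance metric (Euclidean), the table contents, and the names `knn_value` and `value_table` are all assumptions made for illustration. The idea shown is only that the value of an unvisited state is estimated as the mean of the stored values of its k closest visited states.

```python
import math

def knn_value(query, table, k=3):
    """Estimate the value of `query` as the mean value of its k nearest stored states."""
    if not table:
        raise ValueError("value table is empty")
    # Rank stored (state, value) pairs by Euclidean distance to the query state.
    ranked = sorted(table, key=lambda sv: math.dist(sv[0], query))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)

# Hypothetical table: states are (queue_length, capacity) pairs with made-up values.
value_table = [
    ((0.0, 1.0), 10.0),
    ((1.0, 1.0), 12.0),
    ((2.0, 1.0), 16.0),
    ((9.0, 3.0), 50.0),
]

# The three nearest neighbours of (1, 1) are the first three entries.
estimate = knn_value((1.0, 1.0), value_table, k=3)
```

A table-based averager like this needs no parametric model of the value function, which is presumably why it pairs naturally with a value table that grows as new states are visited.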

Using approximate dynamic programming for multi-ESM scheduling to track ground moving targets

Journal of Systems Engineering and Electronics, 2018, Vol. 29, No. 1, pp. 74-85

Authors: WAN Kaifang; GAO Xiaoguang; LI Bo; LI Fei (Key Laboratory of Aerospace Information Perception and Photoelectric Control, Ministry of Education; School of Electronics and Information, Northwestern Polytechnical University)

This paper researches the adaptive scheduling problem of multiple electronic support measures (multi-ESM) in a ground moving radar target tracking application. It is a sequential decision-making problem in an uncertain environment. For adaptive selection of appropriate ESMs, we generalize an approximate dynamic programming (ADP) framework to the dynamic case. We define the environment model and agent model, respectively. To handle the partially observable challenge, we apply the unscented Kalman filter (UKF) algorithm for belief state estimation. To reduce the computational burden, a simulation-based rollout approach with a redesigned base policy is proposed to approximate the long-term cumulative reward. Meanwhile, Monte Carlo sampling is combined into the rollout to estimate the expectation of the rewards. The experiments indicate that our method outperforms other strategies due to its better performance in larger-scale problems.

Keywords: sensor scheduling; target tracking; approximate dynamic programming; non-myopic rollout; belief state
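
The rollout idea in the abstract above (try each candidate action, then follow a base policy in simulation, and average the sampled returns) can be sketched generically as follows. This is a toy illustration, not the paper's method: there is no ESM model, no UKF belief state, and the environment `toy_step`, the base policy `toy_base_policy`, the horizon, and the discount factor are all hypothetical.

```python
import random

def rollout_action(state, actions, step, base_policy, horizon=5,
                   n_samples=200, gamma=0.95, rng=None):
    """Pick the action with the best Monte Carlo estimate of discounted return."""
    if rng is None:
        rng = random.Random(0)
    best_action, best_value = None, float("-inf")
    for action in actions:
        total = 0.0
        for _ in range(n_samples):
            # Apply the candidate action first ...
            s, reward = step(state, action, rng)
            ret, discount = reward, gamma
            # ... then follow the base policy for the rest of the horizon.
            for _ in range(horizon - 1):
                s, reward = step(s, base_policy(s), rng)
                ret += discount * reward
                discount *= gamma
            total += ret
        value = total / n_samples  # sample mean approximates the expectation
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value

def toy_step(s, a, rng):
    """Hypothetical dynamics: integer position with additive noise; reward is -|position|."""
    s_next = s + a + rng.choice((-1, 0, 1))
    return s_next, -abs(s_next)

def toy_base_policy(s):
    """Hypothetical base policy: always stay put."""
    return 0

action, value = rollout_action(3, (-1, 0, 1), toy_step, toy_base_policy,
                               rng=random.Random(42))
```

Starting from position 3, moving toward zero should score best in this toy setting, since every reward penalises distance from the origin.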

An Integrated Decomposition and Approximate Dynamic Programming Approach for On-Demand Ride Pooling

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, Vol. 21, No. 9, pp. 3811-3820

Authors: Yu, Xian; Shen, Siqian (Univ Michigan, Dept Ind & Operat Engn, Ann Arbor, MI 48109, USA)

Through smartphone apps, drivers and passengers can dynamically enter and leave ride-hailing platforms. As a result, ride-pooling is challenging due to complex system dynamics and different objectives of multiple stakeholders. In this paper, we study ride-pooling with no more than two passenger groups who can share rides in the same vehicle. We dynamically match available drivers to randomly arriving passengers and also decide pick-up and drop-off routes. The goal is to minimize a weighted sum of passengers' waiting time and trip delay time. A spatial-and-temporal decomposition heuristic is applied and each subproblem is solved using approximate dynamic programming (ADP), for which we show properties of the approximated value function at each stage. Our model is benchmarked with the one that optimizes vehicle dispatch without ride-pooling and the one that matches current drivers and passengers without demand forecasting. Using test instances generated based on the New York City taxi data during one peak hour, we conduct computational studies and sensitivity analysis to show (i) empirical convergence of ADP, (ii) benefit of ride-pooling, and (iii) value of future supply-demand information.

Keywords: Vehicles; Delays; Computational modeling; Optimization; Vehicle dynamics; Stochastic processes; dynamic programming; Mobility on demand (MoD); supply-demand uncertainty; ride-pooling; spatial-temporal decomposition; approximate dynamic programming

A new approximate dynamic programming algorithm based on an actor-critic framework for optimal control of alkali-surfactant-polymer flooding

ENGINEERING OPTIMIZATION, 2019, Vol. 51, No. 12, pp. 2147-2168

Authors: Li, Shurong; Han, Lu; Ge, Yulei; Shi, Yuhuan (Beijing Univ Posts & Telecommun, Coll Automat, Beijing, Peoples R China; China Univ Petr East China, Coll Informat & Control Engn, Qingdao, Shandong, Peoples R China; Qingdao Topscomm Commun Co, Qingdao, Shandong, Peoples R China; CNPC East China Design Inst Co, Qingdao, Shandong, Peoples R China)

An approximate dynamic programming algorithm based on an actor-critic framework is proposed in this article. This algorithm takes the actor-critic framework as the basic framework, in which the actor and the critic are used to approximate the optimal value function and the control strategy, respectively. At first, the linear basis function approximator is used to approximate the value function. Then the method of basis function construction based on system characteristics is introduced. Furthermore, since the injection concentration of ASP flooding has a fixed interval, the action weighting method is adopted to restrict and approximate the optimal control action. The value function parameter and the two strategy parameters are updated by the gradient descent method. Meanwhile the eligibility trace is introduced to accelerate convergence. Finally, ASP flooding with four injection wells and nine production wells is used to test the effect of the proposed method.

Keywords: approximate dynamic programming; actor-critic framework; ASP flooding; action weighting; eligibility trace

Opportunistic Fair Scheduling in Wireless Networks: An Approximate Dynamic Programming Approach

MOBILE NETWORKS & APPLICATIONS, 2010, Vol. 15, No. 5, pp. 710-728

Authors: Zhang, Zhi; Moola, Sudhir; Chong, Edwin K. P. (Colorado State Univ, Dept Elect & Comp Engn, Ft Collins, CO 80523, USA)

We consider the problem of temporal fair scheduling of queued data transmissions in wireless heterogeneous networks. We deal with both the throughput maximization problem and the delay minimization problem. Taking fairness constraints and the data arrival queues into consideration, we formulate the transmission scheduling problem as a Markov decision process (MDP) with fairness constraints. We study two categories of fairness constraints, namely temporal fairness and utilitarian fairness. We consider two criteria: infinite horizon expected total discounted reward and expected average reward. Applying the dynamic programming approach, we derive and prove explicit optimality equations for the above constrained MDPs, and give corresponding optimal fair scheduling policies based on those equations. A practical stochastic-approximation-type algorithm is applied to calculate the control parameters online in the policies. Furthermore, we develop a novel approximation method, temporal fair rollout, to achieve a tractable computation. Numerical results show that the proposed scheme achieves significant performance improvement for both throughput maximization and delay minimization problems compared with other existing schemes.

Keywords: approximate dynamic programming; fairness; Markov decision process; resource allocation; scheduling; stochastic approximation; wireless networks

Volume-weighted Bellman error method for adaptive meshing in approximate dynamic programming

REVISTA IBEROAMERICANA DE AUTOMATICA E INFORMATICA INDUSTRIAL, 2022, Vol. 19, No. 1, pp. 37-47

Authors: Armesto, Leopoldo; Sala, Antonio (Univ Politecn Valencia, Inst Diseno & Fabricac, Cno Vera S-N, Valencia 46022, Spain; Univ Politecn Valencia, Inst Univ Automat & Informat Ind, Cno Vera S-N, Valencia 46022, Spain)

Optimal control and reinforcement learning have an associated "value function" which must be suitably approximated. Value function approximation problems usually have different precision requirements in different regions of the state space. A uniform gridding wastes resources in regions in which the value function is smooth and, on the other hand, lacks resolution in zones with abrupt changes. The present work proposes an adaptive meshing methodology to adapt to these changing requirements without excessively increasing the number of parameters of the approximator. The proposal is based on simplicial meshes and the Bellman error, with criteria to add and remove points from the mesh: modifications to proposals in earlier literature, including the volume of the affected simplices, are proposed, along with methods to manipulate the mesh triangulation.

Keywords: Intelligent control; approximate dynamic programming; optimal control; neural learning

An approximate dynamic programming approach for collaborative caching

ENGINEERING OPTIMIZATION, 2021, Vol. 53, No. 6, pp. 1005-1023

Authors: Yang, Xinan; Thomos, Nikolaos (Univ Essex, Dept Math Sci, Colchester, Essex, England; Univ Essex, Sch Comp Sci & Elect Engn, Colchester, Essex, England)

In this article, online collaborative content caching in wireless networks is studied from a network economics point of view. The cache optimization problem is first modelled as a finite horizon Markov decision process that incorporates an auto-regressive model to forecast the evolution of the content demands. The complexity of the problem grows exponentially with the system parameters, and even though a good approximation to the cost-to-go can be found, the single-stage decision problem is still NP-hard. To deal with cache optimization in industrial-size networks, a novel methodology called rolling horizon is proposed that solves the dimensionality of the problem by freezing the cache decisions for a short number of periods to construct a value function approximation. Then, to address the NP-hardness of the single-stage decision problem, two simplifications/reformulations are examined: (a) to limit the number of content replicas in the network and (b) to limit the allowed content replacements. The results show that the proposed approach can reduce the communication cost by over 84% compared to that of running least recently used updates on offline schemes in collaborative caching. The results also shed light on the trade-off between the efficiency of the caching policy and the time needed to run the online cache optimization algorithm.

Keywords: Collaborative caching; online offline caching; popularity dynamics; finite horizon MDP; approximate dynamic programming

A cost-shaping linear program for average-cost approximate dynamic programming with performance guarantees

MATHEMATICS OF OPERATIONS RESEARCH, 2006, Vol. 31, No. 3, pp. 597-620

Authors: de Farias, Daniela Pucci; Van Roy, Benjamin (MIT, Dept Mech Engn, Cambridge, MA 02139, USA; Stanford Univ, Dept Management Sci & Engn, Stanford, CA 94305, USA; Stanford Univ, Dept Elect Engn, Stanford, CA 94305, USA)

We introduce a new algorithm based on linear programming for optimization of average-cost Markov decision processes (MDPs). The algorithm approximates the differential cost function of a perturbed MDP via a linear combination of basis functions. We establish a bound on the performance of the resulting policy that scales gracefully with the number of states without imposing the strong Lyapunov condition required by its counterpart in de Farias and Van Roy [de Farias, D. P., B. Van Roy. 2003. The linear programming approach to approximate dynamic programming. Oper. Res. 51(6) 850-865]. We investigate implications of this result in the context of a queueing control problem.

Keywords: approximate dynamic programming; linear programming; average cost

Low-discrepancy sampling for approximate dynamic programming with local approximators

COMPUTERS & OPERATIONS RESEARCH, 2014, Vol. 43, No. 1, pp. 108-115

Authors: Cervellera, C.; Gaggero, M.; Maccio, D. (CNR, Inst Intelligent Syst Automat, I-16149 Genoa, Italy)

Approximate dynamic programming (ADP) relies, in the continuous-state case, on both a flexible class of models for the approximation of the value functions and a smart sampling of the state space for the numerical solution of the recursive Bellman equations. In this paper, low-discrepancy sequences, commonly employed for number-theoretic methods, are investigated as a sampling scheme in the ADP context when local models, such as the Nadaraya-Watson (NW) ones, are employed for the approximation of the value function. The analysis is carried out both from a theoretical and a practical point of view. In particular, it is shown that the combined use of low-discrepancy sequences and NW models enables the convergence of the ADP procedure. Then, the regular structure of the low-discrepancy sampling is exploited to derive a method for automatic selection of the bandwidth of NW models, which yields a significant saving in the computational effort with respect to the standard cross validation approach. Simulation results concerning an inventory management problem are presented to show the effectiveness of the proposed techniques. (C) 2013 Elsevier Ltd. All rights reserved.

Keywords: approximate dynamic programming; Low-discrepancy sampling; Local approximation; Nadaraya-Watson models; Inventory forecasting
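
As a rough illustration of the two ingredients the abstract above combines, the sketch below samples a one-dimensional state space with a van der Corput sequence (a standard low-discrepancy sequence) and fits a Nadaraya-Watson estimator with a Gaussian kernel to those samples. The target function `x ** 2`, the sample count, and the fixed bandwidth are assumptions chosen for illustration; the paper's automatic bandwidth selection and its ADP recursion are not reproduced here.

```python
import math

def van_der_corput(n, base=2):
    """Return the n-th element (n >= 1) of the van der Corput sequence in the given base."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q

def nadaraya_watson(x, xs, ys, bandwidth):
    """Kernel-weighted average of ys, using a Gaussian kernel centred at x."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Low-discrepancy sample of the unit interval (hypothetical state space).
xs = [van_der_corput(i) for i in range(1, 65)]
# Stand-in for sampled value-function targets at those states.
ys = [xi ** 2 for xi in xs]

estimate = nadaraya_watson(0.5, xs, ys, bandwidth=0.05)
```

Because the sample points fill the interval evenly, a single fixed bandwidth covers a similar number of neighbours everywhere, which is the kind of regularity the abstract says can be exploited when choosing the bandwidth.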

New integer optimization models and an approximate dynamic programming algorithm for the lot-sizing and scheduling problem with sequence-dependent setups

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2022, Vol. 302, No. 1, pp. 230-243

Authors: Lee, Younsoo; Lee, Kyungsik (Seoul Natl Univ, Dept Ind Engn, 1 Gwanak ro, Seoul 08826, South Korea)

In this paper, we propose new integer optimization models for the lot-sizing and scheduling problem with sequence-dependent setups, based on the general lot-sizing and scheduling problem. To incorporate setup crossover and carryover, we first propose a standard model that straightforwardly adapts a formulation technique from the literature. Then, as the main contribution, we propose a novel optimization model that incorporates the notion of time flow. We derive a family of valid inequalities with which to compare the tightness of the models' linear programming relaxations. In addition, we provide an approximate dynamic programming algorithm that estimates the value of a state using its lower and upper bounds. Then, we conduct computational experiments to demonstrate the competitiveness of the proposed models and the solution algorithm. The test results show that the newly proposed time-flow model has considerable advantages compared with the standard model in terms of tightness and solvability. The proposed algorithm also shows computational benefits over the standard mixed integer programming solver.

Keywords: Production; Lot-sizing and scheduling problem; Integer optimization model; Sequence-dependent setup; approximate dynamic programming algorithm
