This research introduces the use of approximate dynamic programming to overcome a variety of limitations of distinct infrastructure management problem formulations. The form, as well as the parameters, of a model specifying the long-term costs associated with alternate infrastructure maintenance policies are learned via simulation. The introduced methodology makes it possible to manage large heterogeneous networks of facilities related by budgetary restrictions and resource constraints as well as by dependencies in maintenance costs or deterioration. In addition, the methodology is particularly well suited to consideration of multiple types of infrastructure condition data at the same time, including continuous-valued data and relevant historical data. Introduced techniques will prove valuable when high-quality deterioration and cost estimation models are available but are ill suited for use in a Markov decision problem framework. Computational studies show that the introduced approach is able to find an optimal solution to a relatively simple infrastructure management problem, and is able to find increasingly good solutions to a more complex problem.
The main focus of this article is to present a proposal to solve, via UDU^T factorisation, the convergence and numerical stability problems caused by ill-conditioning of the covariance matrix in the recursive least squares (RLS) approach for online approximation of the algebraic Riccati equation (ARE) solution associated with the discrete linear quadratic regulator (DLQR) problem, formulated in the context of actor-critic reinforcement learning and approximate dynamic programming. The parameterisations of the Bellman equation, utility function and dynamic system, together with the algebra of the Kronecker product, form a framework for the solution of the DLQR problem. The condition number and the positivity parameter of the covariance matrix are associated with statistical metrics for evaluating the approximation performance of the ARE solution via RLS-based estimators. The performance of RLS approximators is also evaluated in terms of consistency and polarisation when associated with reinforcement learning methods. The methodology covers realisations of online designs for DLQR controllers, which are evaluated on a multivariable dynamic system model.
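As a rough illustration of the factorisation named in this abstract, the sketch below decomposes a symmetric positive-definite matrix P into U·diag(d)·U^T, with U unit upper triangular. Keeping an RLS covariance in this factored form is a standard way to preserve its symmetry and positivity. The function names `udut` and `rebuild` and the example matrix are assumptions made for illustration; the sketch shows only the decomposition, not the article's RLS update or its DLQR estimator.

```python
def udut(P):
    """Factor a symmetric positive-definite matrix P as U * diag(d) * U^T.

    U is unit upper triangular; d holds the diagonal entries of D.
    Columns are processed from last to first.
    """
    n = len(P)
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        # Diagonal entry: remove the contribution of the later columns.
        d[j] = P[j][j] - sum(d[k] * U[j][k] ** 2 for k in range(j + 1, n))
        for i in range(j):
            s = sum(d[k] * U[i][k] * U[j][k] for k in range(j + 1, n))
            U[i][j] = (P[i][j] - s) / d[j]
    return U, d


def rebuild(U, d):
    """Multiply the factors back together to recover P."""
    n = len(d)
    return [
        [sum(U[i][k] * d[k] * U[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical covariance-like matrix, used only to exercise the routine.
P = [[4.0, 2.0, 0.6],
     [2.0, 3.0, 0.5],
     [0.6, 0.5, 1.0]]
U, d = udut(P)
P_back = rebuild(U, d)
```

All entries of `d` staying positive is the factored-form counterpart of the covariance matrix remaining positive definite.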
Inventory management of a procurement system is decomposed into sub-problems according to the timescale of decisions: the long-term planning for ordering raw materials and the short-term scheduling for unloading the orders. To ensure more sustainable and robust operation, different decision layers should be integrated (which is the nature of a multi-scale problem), and supply and demand uncertainty should be considered. In this study, the planning problem is formulated as a Markov decision process (MDP) to incorporate possible realizations of uncertainty into the decision-making process. The MDP planning model is integrated with a scheduling model expressed by a MILP (or closely approximated by a heuristic approach). Decision policies are obtained from solving the MDP problem through an exact value iteration, as well as an approximate approach intended to alleviate the computational challenges. On benchmark problems, we compare the resulting policies with a reference policy obtained without any rigorous integration with scheduling. (C) 2016 Elsevier Ltd. All rights reserved.
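To make the exact value iteration mentioned in this abstract concrete, here is a minimal sketch on a toy ordering problem. Everything specific in it is an assumption chosen for illustration: the three inventory levels, the demand distribution, the cost coefficients and the discount factor. It does not reproduce the paper's planning model or its MILP scheduling layer.

```python
def value_iteration(states, actions, transition, cost, gamma=0.9, tol=1e-8):
    """Exact value iteration for a finite, cost-minimising MDP."""
    V = {s: 0.0 for s in states}
    while True:
        # Q[s][a]: expected immediate cost plus discounted expected cost-to-go.
        Q = {
            s: {
                a: cost(s, a) + gamma * sum(p * V[s2] for s2, p in transition(s, a))
                for a in actions(s)
            }
            for s in states
        }
        new_V = {s: min(Q[s].values()) for s in states}
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            policy = {s: min(Q[s], key=Q[s].get) for s in states}
            return V, policy


# Hypothetical toy data: inventory 0..2, demand of 0 or 1 unit per period.
CAPACITY = 2
DEMAND = [(0, 0.5), (1, 0.5)]          # (demand, probability)
ORDER_COST, HOLDING_COST, SHORTAGE_COST = 1.0, 0.5, 4.0

states = list(range(CAPACITY + 1))


def actions(s):
    # Order quantity cannot exceed the remaining storage capacity.
    return range(CAPACITY - s + 1)


def transition(s, a):
    # Next inventory after demand is served; unmet demand is lost.
    return [(max(s + a - d, 0), p) for d, p in DEMAND]


def cost(s, a):
    expected = sum(
        p * (HOLDING_COST * max(s + a - d, 0) + SHORTAGE_COST * max(d - s - a, 0))
        for d, p in DEMAND
    )
    return ORDER_COST * a + expected


V, policy = value_iteration(states, actions, transition, cost)
```

Each sweep enumerates every state and action, so the work grows with the size of the state space. That growth is the computational burden an approximate approach is meant to relieve.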
We investigate a class of scheduling problems where dynamically and stochastically arriving appointment requests are either rejected or booked for future slots. A customer may cancel an appointment. A customer who does not cancel may fail to show up. The planner may overbook appointments to mitigate the detrimental effects of cancellations and no-shows. A customer needs multiple renewable resources. The system receives a reward for providing service and incurs costs for rejecting requests, appointment delays, and overtime. Customers are heterogeneous in all problem parameters. We provide a Markov decision process (MDP) formulation of these problems. Exact solution of this MDP is intractable. We show that this MDP has a weakly coupled structure that enables us to apply an approximate dynamic programming method rooted in Lagrangian relaxation, affine value function approximation, and constraint generation. We compare this method with a myopic scheduling heuristic on eighteen hundred problem instances. Our experiments show that there is a statistically significant difference in the performance of the two methods in 77% of these instances. Of these statistically significant instances, the Lagrangian method outperforms the myopic method in 97% of the instances. (C) 2015 Elsevier Ltd. All rights reserved.
Based on the mathematical model of the permanent magnet synchronous generator (PMSG), a maximum wind power tracking control strategy without wind speed detection is analyzed, and a controller based on a cloud RBF neural network and approximate dynamic programming is designed to track the maximum wind power point. The optimal power-speed curve and vector control principles are used: the approximate dynamic programming controller adjusts the stator voltage to control the electromagnetic torque, so that the wind turbine operates at the optimal speed corresponding to the best power point. The cloud RBF neural network is adopted as the function approximation structure of approximate dynamic programming, and it benefits from the fuzziness and randomness of the cloud model. Simulation results show that the method can solve the optimal control problem of a complex nonlinear system such as wind generation and track the maximum wind power point accurately. (C) 2015 Elsevier Ltd. All rights reserved.
This paper proposes a novel sensor scheduling scheme based on adaptive dynamic programming, which makes the sensor energy consumption and tracking error optimal over the system operational horizon for wireless sensor networks with solar energy harvesting. A neural network is used to model the solar energy harvesting. Kalman filter estimation is employed to predict the target location. A performance index function is established based on the energy consumption and tracking error. A critic network is developed to approximate the performance index function. The presented method is proven to be convergent. A numerical example shows the effectiveness of the proposed approach.
In this paper, a novel value iteration adaptive dynamic programming (ADP) algorithm, called the "generalized value iteration ADP" algorithm, is developed to solve infinite horizon optimal tracking control problems for a class of discrete-time nonlinear systems. The developed generalized value iteration ADP algorithm can be initialized with an arbitrary positive semi-definite function, which overcomes a disadvantage of traditional value iteration algorithms. A convergence property is established to guarantee that the iterative performance index function will converge to the optimum. Neural networks are used to approximate the iterative performance index function and compute the iterative control policy, respectively, to implement the iterative ADP algorithm. Finally, a simulation example is given to illustrate the performance of the developed algorithm.
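The idea of starting value iteration from an arbitrary positive semi-definite function can be illustrated on a much simpler problem than the one in this abstract. The sketch below uses a scalar linear system x_{k+1} = a·x_k + b·u_k with stage cost q·x² + r·u², for which each iterate has the form V_i(x) = p_i·x² and the value iteration update reduces to a scalar Riccati recursion. The system, the cost weights and the function name `riccati_value_iteration` are assumptions for illustration. The example is a regulation problem rather than tracking, and it uses no neural networks.

```python
def riccati_value_iteration(p0, a=1.2, b=1.0, q=1.0, r=1.0, iterations=200):
    """Value iteration for V_i(x) = p_i * x**2 on a scalar LQ problem.

    Minimising q*x^2 + r*u^2 + p_i*(a*x + b*u)^2 over u in closed form gives
    p_{i+1} = q + a^2 * p_i * r / (r + b^2 * p_i).
    """
    p = p0
    for _ in range(iterations):
        p = q + (a * a * p * r) / (r + b * b * p)
    return p


def riccati_residual(p, a=1.2, b=1.0, q=1.0, r=1.0):
    """How far p is from satisfying the scalar algebraic Riccati equation."""
    return p - (q + (a * a * p * r) / (r + b * b * p))


# Two different positive semi-definite initialisations: p0 = 0 and p0 = 50.
p_from_zero = riccati_value_iteration(0.0)
p_from_large = riccati_value_iteration(50.0)
```

In this toy case both starting points reach the same fixed point. This is the behaviour the abstract's convergence property guarantees for its more general nonlinear setting.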
We present and examine a novel method for obtaining solutions to specific discrete-time optimal control problems. Our approach is based on linear state dynamics and convexity assumptions commonly satisfied in practical applications. We show that the important class of optimal switching problems under partial observation is covered by our methodology, and we exploit specific model features to achieve a simple algorithmic form of the numerical solution.
This paper provides a new idea for approximating the inventory cost function to be used in a truncated dynamic program for solving the capacitated lot-sizing problem. The proposed method combines dynamic programming with regression, data fitting, and approximation techniques to estimate the inventory cost function at each stage of the dynamic program. The effectiveness of the proposed method is analyzed on various types of capacitated lot-sizing problem instances with different cost and capacity characteristics. Computational results show that approximation approaches could significantly decrease the computational time required by the dynamic program and the integer program for solving different types of capacitated lot-sizing problem instances. Furthermore, in most cases, the proposed approximate dynamic programming approaches can accurately capture the optimal solution of the problem with consistent computational performance over different instances.
It is sometimes challenging to plan winter maintenance operations in advance because snow storms are stochastic with respect to, e.g., start time, duration, impact area, and severity. In addition, maintenance trucks may not be readily available at all times due to stochastic service disruptions. A stochastic dynamic fleet management model is developed to assign available trucks to cover uncertain snow plowing demand. The objective is to simultaneously minimize the cost for truck deadheading and repositioning, as well as to maximize the benefits (i.e., level of service) of plowing. The problem is formulated as a dynamic programming model and solved using an approximate dynamic programming algorithm. Piecewise linear functional approximations are used to estimate the value function of system states (i.e., snow plow truck locations over time). We apply our model and solution approach to a snow plow operation scenario for Lake County, Illinois. Numerical results show that the proposed algorithm can solve the problem effectively and outperforms a rolling-horizon heuristic solution.
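One common way piecewise linear value function approximations are used in fleet problems is sketched below, under strong simplifying assumptions. Each region is given a concave piecewise linear value function, stored as a non-increasing list of marginal values per additional truck, and trucks are assigned greedily to the largest remaining marginal value. For separable concave functions this greedy rule maximises the summed approximate value. The region names, slope values and the function `allocate_trucks` are hypothetical. The sketch ignores deadheading costs, time periods and the slope-updating step that an approximate dynamic programming algorithm would perform.

```python
import heapq


def allocate_trucks(slopes, n_trucks):
    """Greedily assign trucks using concave piecewise linear value functions.

    slopes[region][k] is the marginal value of the (k+1)-th truck in that
    region and must be non-increasing in k (concavity).
    """
    allocation = {region: 0 for region in slopes}
    # Max-heap (via negated values) of the next marginal value per region.
    heap = [(-s[0], region) for region, s in slopes.items() if s]
    heapq.heapify(heap)
    total_value = 0.0
    for _ in range(n_trucks):
        if not heap:
            break
        neg_value, region = heapq.heappop(heap)
        if -neg_value <= 0:
            break  # No remaining assignment adds value.
        allocation[region] += 1
        total_value += -neg_value
        k = allocation[region]
        if k < len(slopes[region]):
            heapq.heappush(heap, (-slopes[region][k], region))
    return allocation, total_value


# Hypothetical marginal values of plowing capacity in three regions.
slopes = {
    "north": [9.0, 4.0, 1.0],
    "south": [7.0, 6.0, 2.0],
    "west": [5.0, 0.5],
}
allocation, total_value = allocate_trucks(slopes, n_trucks=4)
```

With these made-up slopes, four trucks go to the marginal values 9, 7, 6 and 5, so the fifth-best value (4, a second truck in the north) is left unserved.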