This article investigates the optimal control problem (OCP) for a class of discrete-time nonlinear systems with state constraints. First, to overcome the challenge caused by the constraints, the original constrained OCP is transformed into an unconstrained OCP by utilizing the system transformation technique. Second, a new cost function is designed to alleviate the effect of system transformation on the optimality of the original system. Further, a novel off-policy deterministic approximate dynamic programming (ADP) scheme is developed to obtain a near-optimal solution for the transformed OCP. Compared to existing off-policy deterministic ADP schemes, the developed scheme relaxes the requirement on the learning data and saves computing resources from the perspective of training neural networks. Third, considering approximation errors, we analyze the convergence and stability of the developed ADP scheme. Finally, the developed ADP with the designed cost function is tested in two numerical cases, and simulation results confirm its effectiveness.
Logistics-related costs generally constitute a major part of a product's total cost. For a company that delivers goods to its customers using its own fleet, fleet ownership and operational costs together with inventory costs compose the total logistics costs. In this study, we suggest an approximate dynamic programming algorithm with a look-ahead strategy that uses the fix-and-optimize method as the embedded heuristic for solving the integrated fleet composition and replenishment planning problem. The total annual distribution cost factors considered in the problem are vehicle ownership costs, approximate routing costs, and inventory-related costs. In this problem, we aim to minimize the total logistics cost by optimizing the fleet composition, replenishment patterns, and customers assigned to each vehicle in the fleet. We randomly generated a set of reasonably large instances and showed the efficacy of the suggested solution method.
Enhancing control precision, mitigating external disturbances, and ensuring real-time responsiveness are the cornerstones of autonomous vehicle tracking, and each is closely tied to operational safety. To address these issues, this paper presents a triple iterative control method inspired by approximate dynamic programming (ADP) and tailored for real-time disturbance avoidance. The control framework iterates the value function, control policy, and disturbance policy simultaneously in order to optimize tracking control under external disturbances, which is cast as a zero-sum differential game and solved with deep neural networks. A rigorous mathematical proof underpins the triple iteration, together with guarantees of residual error convergence, which supports the method's safety guarantees and algorithmic robustness. To validate its effectiveness, both numerical simulations and experiments on a real micro-vehicle platform were conducted. Results demonstrate the feasibility of the new method, showing its energy-saving capability and a fourfold acceleration compared to conventional model predictive control (MPC) approaches when confronted with lateral disturbances. Notably, the single-step calculation time of this method on the Raspberry Pi is only 1.44 ms, confirming its practical viability.
We consider a long-term engine maintenance planning problem for an aircraft fleet. The objective is to guarantee sufficient on-wing engines to reach service levels while effectively organizing shop visits for engines. However, complexity arises from intricate maintenance policies and uncertainty in engine deterioration. To address this problem, we propose a graph-based approach representing high-dimensional engine statuses and transitions. We then formulate the problem as a multi-stage stochastic integer program with endogenous uncertainty. We develop an approximate dynamic programming algorithm enhanced by dynamic graph generation and policy-sifting techniques so as to reduce the computational overhead in large problems. We demonstrate the efficacy of our method, compared with other popular methods, in terms of running time and solution quality. In the case study, we present an implementation in a real-world decision system in China Southern Airlines, in which the proposed method works seamlessly with other supporting modules and significantly improves the efficiency of engine maintenance management.
This paper focuses on optimizing the routing and charging schedules of an autonomous electric taxi (AET) system integrated with mobile charging services. In this system, a fleet of AETs provides on-demand ride services for customers, while mobile charging vehicles (MCVs) are deployed as a flexible complement to fixed charging stations, offering fast charging options for AETs. A dynamic programming model is developed to optimize the joint operations of AETs and MCVs, considering stochasticity in customer demand, AET energy consumption, and charging station resources. The objective is to maximize the operator's overall profit over the entire planning horizon, including revenues from serving customer requests, travel costs, charging costs, and penalties associated with both fleets. To address the stochastic and dynamic nature of the problem, an approximate dynamic programming (ADP) approach, incorporating customized pruning strategies to reduce the state and decision space, is proposed. This approach balances immediate operational gains with future potential profits. A series of numerical experiments has been conducted to evaluate the effectiveness of the proposed model and algorithm. Results show that the ADP-based policy significantly improves system performance compared to classical myopic benchmarks.
We study the patient assignment scheduling (PAS) problem in a random environment that arises in the management of patient flow in hospital systems, due to the stochastic nature of the arrivals as well as the length of stay (LoS) distribution. At the start of each time period, emergency patients in the waiting area of a hospital system need to be admitted to relevant wards. Decisions may involve allocation to less suitable wards, or transfers of the existing inpatients to accommodate higher priority cases when wards are at full capacity. However, the LoS for patients in non-primary wards may increase, potentially leading to long-term congestion. To assist with decision-making in this PAS problem, we construct a discrete-time Markov decision process over an infinite horizon, with multiple patient types and multiple wards. Since the instances of realistic size of this problem are not easy to solve, we develop numerical methods based on approximate dynamic programming. We demonstrate the application potential of our methodology under practical considerations with numerical examples, using parameters obtained from data at a tertiary referral hospital in Australia. We gain valuable insights, such as the number of patients in non-primary wards, the number of transferred patients, and the number of patients redirected to other facilities, under different policies that enhance the system's performance. This approach allows for more realistic assumptions and can also help determine the appropriate size of wards for different patient types within the hospital system.
We address a comprehensive ride-hailing system, taking into account many of the decisions required to operate it in reality. The ride-hailing system consists of a centrally managed fleet of autonomous electric vehicles, a transformative new technology with significant cost savings. This problem involves a dispatch problem for assigning riders to cars, a surge pricing problem for deciding on the price per trip, and a planning problem for deciding on the fleet size. We use approximate dynamic programming to develop high-quality operational dispatch strategies to determine which car is best for a particular trip, when a car should be recharged, when it should be repositioned to a different zone that offers a higher density of trips, and when it should be parked. These decisions have to be made in the presence of a highly dynamic call-in process, and assignments have to take into consideration the spatial and temporal patterns in trip demand, which are captured using value functions. We prove that the value functions are monotone in the battery and time dimensions and use hierarchical aggregation to get better estimates of the value functions with a small number of observations. Then, surge pricing is discussed using an adaptive learning approach to decide on the price for each trip. Finally, we discuss the fleet size problem. (C) 2020 Elsevier B.V. All rights reserved.
The flexibility of deployment strategies combined with the low cost of individual sensor nodes allows wireless sensor networks (WSNs) to be integrated into a variety of applications. Network operations degrade over time as sensors consume a finite power supply and begin to fail. In this work, we address the selective maintenance of a WSN through a condition-based deployment policy (CBDP) in which sensors are deployed over a series of missions. The main contribution is a Markov decision process (MDP) model to maintain a reliable WSN with respect to region coverage. Due to the resulting high-dimensional state and outcome space, we explore approximate dynamic programming (ADP) methodology in the search for high-quality CBDPs. Our model is one of the first related to the selective maintenance of a large-scale WSN through the repeated deployment of new sensor nodes with a reliability objective, and one of the first ADP applications for the maintenance of a complex WSN. Additionally, our methodology incorporates a destruction spectrum reliability estimate, which has received significant attention with respect to network reliability but whose value in a maintenance setting has not been widely explored. We conclude with a discussion of CBDPs in a range of test instances and compare their performance to alternative deployment strategies.
This study proposes an approximate dynamic programming (ADP) scheme which solves approximately the continuous-time (CT) infinite horizon, linear quadratic (LQ) optimal control problems (OCPs) online for CT linear time-invariant (LTI) systems whose model is not exactly given a priori. In order to relax the assumption of the perfectly known input-coupling matrix, a cheap OCP consisting of a dynamic controller and a modified quadratic performance index is formulated from the conventional LQ OCP. Then, the CT ADP technique based on policy iteration is embedded in the controller as an adaptive element for iteratively solving this cheap OCP in online fashion. By solving the cheap OCP, the near-optimal solution of the original LQ OCP can be obtained, which is proven in this study. The proposed scheme guarantees the stability and convergence to a near-optimal solution, and does not require the knowledge regarding system dynamics during the iterations. Finally, the simulation results are provided to verify the applicability and effectiveness of the proposed control scheme.
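As a rough illustration of the policy-iteration idea this abstract builds on, the sketch below runs the classical model-based iteration (policy evaluation followed by policy improvement) on a scalar LQ problem. It is not the scheme proposed in the paper, which avoids using the system model and works on a modified "cheap" OCP. Here the model parameters `a` and `b` are assumed known, and the function names `kleinman_scalar` and `riccati_scalar` are hypothetical.

```python
import math


def kleinman_scalar(a, b, q, r, k0, iterations=10):
    """Model-based policy iteration for dx/dt = a*x + b*u with
    cost = integral of (q*x^2 + r*u^2) dt and feedback u = -k*x.

    Assumes a and b are known (unlike the model-free scheme in the abstract)
    and that the initial gain k0 is stabilizing, i.e. a - b*k0 < 0.
    """
    k = k0
    if a - b * k >= 0:
        raise ValueError("initial gain must be stabilizing")
    p = None
    for _ in range(iterations):
        a_cl = a - b * k  # closed-loop dynamics under the current policy
        # Policy evaluation: scalar Lyapunov equation 2*a_cl*p + q + r*k^2 = 0
        p = (q + r * k * k) / (-2.0 * a_cl)
        # Policy improvement: k = b*p/r
        k = b * p / r
    return p, k


def riccati_scalar(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)^2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k = kleinman_scalar(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
    print(p, riccati_scalar(1.0, 1.0, 1.0, 1.0))
```

With a stabilizing initial gain, the iterates approach the Riccati solution, which is the sense in which policy iteration "iteratively solves" an LQ problem.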
We study a deterministic maritime inventory routing problem with a long planning horizon. For instances with many ports and many vessels, mixed-integer linear programming (MIP) solvers often require hours to produce good solutions even when the planning horizon is 90 or 120 periods. Building on the recent successes of approximate dynamic programming (ADP) for road-based applications within the transportation community, we develop an ADP procedure to generate good solutions to these problems within minutes. Our algorithm operates by solving many small subproblems (one for each time period) and by collecting information about how to produce better solutions. Our main contribution to the ADP community is an algorithm that solves MIP subproblems and uses separable piecewise linear continuous, but not necessarily concave or convex, value function approximations and requires no off-line training. Our algorithm is one of the first of its kind for maritime transportation problems and represents a significant departure from the traditional methods used. In particular, whereas virtually all existing methods are "MIP-centric," i.e., they rely heavily on a solver to tackle a nontrivial MIP to generate a good or improving solution in a couple of minutes, our framework puts the effort on finding suitable value function approximations and places much less responsibility on the solver. Computational results illustrate that with a relatively simple framework, our ADP approach is able to generate good solutions to instances with many ports and vessels much faster than a commercial solver emphasizing feasibility and a popular local search procedure.
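To make the notion of a separable piecewise linear value function approximation concrete, here is a minimal sketch of how such a function might be stored and evaluated. The data layout, the port names, the clamping outside the breakpoint range, and the function names `pwl_eval` and `separable_value` are all assumptions for illustration; the abstract does not specify the authors' representation.

```python
from bisect import bisect_right


def pwl_eval(breakpoints, values, x):
    """Evaluate a continuous piecewise linear function by linear interpolation.

    breakpoints must be strictly increasing; values[i] is the function value
    at breakpoints[i]. Outside the range the function is clamped (an assumption).
    No concavity or convexity is required.
    """
    if x <= breakpoints[0]:
        return values[0]
    if x >= breakpoints[-1]:
        return values[-1]
    i = bisect_right(breakpoints, x) - 1  # segment containing x
    x0, x1 = breakpoints[i], breakpoints[i + 1]
    v0, v1 = values[i], values[i + 1]
    return v0 + (v1 - v0) * (x - x0) / (x1 - x0)


def separable_value(approx, state):
    """Sum one piecewise linear term per state dimension (here, per port)."""
    return sum(pwl_eval(bp, vals, state[name]) for name, (bp, vals) in approx.items())


if __name__ == "__main__":
    # Hypothetical approximation: inventory level at each port -> estimated future value.
    approx = {
        "port_A": ([0, 10, 20], [0.0, 5.0, 3.0]),  # neither concave nor convex is required
        "port_B": ([0, 4], [0.0, 8.0]),
    }
    print(separable_value(approx, {"port_A": 15, "port_B": 1}))
```

Separability keeps each term one-dimensional, so the approximation can be updated and evaluated dimension by dimension.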