This paper proposes a dynamic vehicle routing problem (DVRP) model with nonstationary stochastic travel times under traffic congestion. Depending on the traffic conditions, the travel time between two nodes, particularly in a city, may not be proportional to distance and changes both dynamically and stochastically over time. For this environment, we formulate the problem as a Markov decision process and adopt a rollout-based solution approach, using approximate dynamic programming to avoid the curse of dimensionality. We also investigate how to estimate the probability distribution of travel times on arcs, which, reflecting reality, are assumed to consist of multiple road segments. Experiments are conducted using a real-world problem faced by a Singapore logistics/delivery company and authentic road traffic information.
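As a rough illustration of what a rollout-based decision rule looks like, the sketch below picks the next customer by simulating a simple base heuristic (nearest mean travel time) under sampled, time-dependent travel times. Everything in it is an assumption made for illustration: the `MEAN_TIME` matrix, the `congestion` multiplier, the uniform noise, and the nearest-neighbour base policy are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical mean travel times (minutes): depot is node 0, customers are 1..3.
MEAN_TIME = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

def congestion(clock):
    """Assumed time-dependent multiplier: travel is slower in a peak window."""
    return 1.8 if 20 <= clock < 60 else 1.0

def sample_time(i, j, clock, rng):
    """Draw one nonstationary stochastic travel time for arc (i, j)."""
    return MEAN_TIME[i][j] * congestion(clock) * rng.uniform(0.8, 1.4)

def base_policy(current, unvisited):
    """Base heuristic: go to the customer with the smallest mean travel time."""
    return min(sorted(unvisited), key=lambda j: MEAN_TIME[current][j])

def simulate(current, unvisited, clock, rng):
    """Follow the base policy to the end of the tour; return the finish time."""
    unvisited = set(unvisited)
    while unvisited:
        nxt = base_policy(current, unvisited)
        clock += sample_time(current, nxt, clock, rng)
        unvisited.remove(nxt)
        current = nxt
    return clock + sample_time(current, 0, clock, rng)  # drive back to depot

def rollout_decision(current, unvisited, clock, rng, n_samples=200):
    """Choose the next customer with the smallest Monte Carlo finish-time estimate."""
    def estimate(j):
        total = 0.0
        for _ in range(n_samples):
            arrival = clock + sample_time(current, j, clock, rng)
            total += simulate(j, unvisited - {j}, arrival, rng)
        return total / n_samples
    return min(sorted(unvisited), key=estimate)

def run_rollout(seed=0):
    """Execute one tour, re-deciding after each realized travel time."""
    rng = random.Random(seed)
    current, clock, unvisited, route = 0, 0.0, {1, 2, 3}, [0]
    while unvisited:
        nxt = rollout_decision(current, unvisited, clock, rng)
        clock += sample_time(current, nxt, clock, rng)  # realized travel time
        unvisited.remove(nxt)
        route.append(nxt)
        current = nxt
    clock += sample_time(current, 0, clock, rng)
    route.append(0)
    return route, clock
```

Calling `run_rollout()` returns the visited sequence and the total elapsed time. The key design point is that the decision is re-made at every node using the current clock, so the policy can react to congestion as it is observed rather than committing to a fixed route.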
In this paper, we focus on the problems of load scheduling and power trading in systems with high penetration of renewable energy resources (RERs). We adopt approximate dynamic programming to schedule the operation of different types of appliances including must-run and controllable appliances. We assume that users can sell their excess power generation to other users or to the utility company. Since it is more profitable for users to trade energy with other users locally, users with excess generation compete with each other to sell their respective extra power to their neighbors. A game theoretic approach is adopted to model the interaction between users with excess generation. In our system model, each user aims to obtain a larger share of the market and to maximize its revenue by appropriately selecting its offered price and generation. In addition to yielding a higher revenue, consuming the excess generation locally reduces the reverse power flow, which impacts the stability of the system. Simulation results show that our proposed algorithm reduces the energy expenses of the users. The proposed algorithm also facilitates the utilization of RERs by encouraging users to consume excess generation locally rather than injecting it back into the power grid.
An online adaptive optimal control is proposed for continuous-time nonlinear systems with completely unknown dynamics, which is achieved by developing a novel identifier-critic-based approximate dynamic programming algorithm with a dual neural network (NN) approximation structure. First, an adaptive NN identifier is designed to obviate the requirement of complete knowledge of system dynamics, and a critic NN is employed to approximate the optimal value function. Then, the optimal control law is computed based on the information from the identifier NN and the critic NN, so that the actor NN is not needed. In particular, a novel adaptive law design method with the parameter estimation error is proposed to update the weights of both the identifier NN and the critic NN online and simultaneously; these weights converge to small neighbourhoods around their ideal values. The closed-loop system stability and the convergence to a small vicinity around the optimal solution are both proved by means of the Lyapunov theory. The proposed adaptation algorithm is also improved to achieve finite-time convergence of the NN weights. Finally, simulation results are provided to exemplify the efficacy of the proposed methods.
Despite a growing number of studies in stochastic dynamic network optimization, the field remains less well defined and unified than other areas of network optimization. Due to the need for approximation methods like approximate dynamic programming, one of the most significant problems yet to be solved is the lack of adequate benchmarks. The values of the perfect information policy and static policy are not sensitive to information propagation, while the myopic policy does not distinguish network effects in the value of flexibility. We propose a scalable reference policy value defined from theoretically consistent real option values based on sampled sequences, and estimate it using extreme value distributions. The reference policy is evaluated on an existing network instance with known sequences (Sioux Falls network from Chow and Regan 2011a): the Weibull distribution demonstrates good fit and sampling consistency with more than 200 samples. The reference policy is further applied in computational experiments with two other types of adaptive network design: a facility location and timing problem on the Simchi-Levi and Berman (1988) network, and Hyytia et al.'s (2012) dynamic dial-a-ride problem. The former experiment represents an application of a new problem class and use of the reference policy as an upper bound for evaluating sampled policies, which can reach a 3% gap with 350 samples. The latter experiment demonstrates that sensitivity to parameters may be greater than expected, particularly when benchmarked against the proposed reference policy.
In this paper, the robust decentralized stabilization of continuous-time uncertain nonlinear systems with multiple control stations is developed using a neural network based online optimal control approach. The novelty lies in extending the well-known adaptive dynamic programming method to the nonlinear feedback control problem in uncertain, large-scale environments. By introducing an appropriate bounded function and defining a modified cost function, it can be observed that the decentralized optimal controller of the nominal system achieves robust decentralized stabilization of the original uncertain system. Then, a critic neural network is constructed for solving the modified Hamilton-Jacobi-Bellman equation corresponding to the nominal system in an online fashion. The weights of the critic network are tuned based on the standard steepest descent algorithm with an additional term provided to guarantee the boundedness of system states. The stability analysis of the closed-loop system is carried out via the Lyapunov approach. Finally, two simulation examples are given to verify the effectiveness of the presented control approach.
Optimal control problems of stochastic switching type appear frequently when making decisions under uncertainty and are notoriously challenging from a computational viewpoint. Although numerous approaches have been suggested in the literature to tackle them, typical real-world applications are inherently high dimensional and usually drive common algorithms to their computational limits. Furthermore, even when numerical approximations of the optimal strategy are obtained, practitioners must apply time-consuming and unreliable Monte Carlo simulations to assess their quality. In this paper, we show how one can overcome both difficulties for a specific class of discrete-time stochastic control problems. A simple and efficient algorithm which yields approximate numerical solutions is presented and methods to perform diagnostics are provided.
An adaptive optimal control algorithm for systems with uncertain dynamics is formulated under a Reinforcement Learning framework. An embedded exploratory component is included explicitly in the objective function of a...
In this paper we discuss heuristics for network air cargo revenue management. We start from a dynamic programming formulation of air cargo network revenue management. Since the curse of dimensionality makes this problem intractable, we suggest several methods, based on linear programming, approximate dynamic programming, and decomposition, to obtain both upper bounds and heuristics. We prove relationships between the upper bounds. Furthermore, we analyze the performance of both the bounds and the heuristics in a numerical study. In this numerical study, we find that a dynamic programming decomposition yields the tightest bounds. The heuristic based on the decomposition dominates other approaches by giving higher expected net revenues both when applied to the single-leg and to the network cargo problem.
This letter focuses on a scenario in which a team of sensing robots surveys an area along predefined routes and transmits the monitored information to a remote base station through a mobile relay. In this scenario, automatically adjusting the position of the mobile relay to maintain wireless link quality while the sensing robots are moving is a challenging problem. In this letter, we consider the problem of minimizing the total energy consumption. We propose using dynamic programming (DP) and single-step optimization. By comparing the pros and cons of both methods, we propose a novel approximate optimal communication-motion planning (AOCMP) method based on approximate dynamic programming (ADP). Simulation results demonstrate that AOCMP may sharply decrease the computation time compared with DP while performing better than single-step optimization, which shows that AOCMP achieves a beneficial energy-complexity tradeoff for solving high-dimensional problems.
A renewable power producer who trades on a day-ahead market sells electricity under supply and price uncertainty. Investments in energy storage mitigate the associated financial risks and allow for decoupling the timing of supply and delivery. This paper introduces a model of the optimal bidding strategy for a hybrid system of renewable power generation and energy storage. We formulate the problem as a continuous-state Markov decision process and present a solution based on approximate dynamic programming. We propose an algorithm that combines approximate policy iteration with Least Squares Policy Evaluation (LSPE) which is used to estimate the weights of a polynomial value function approximation. We find that the approximate policies produce significantly better results for the continuous state space than an optimal discrete policy obtained by linear programming. A numerical analysis of the response surface of rewards on model parameters reveals that supply uncertainty, imbalance costs and a negative correlation of market price and supplies are the main drivers for investments in energy storage. Supply and price autocorrelation, on the other hand, have a negative effect on the value of storage.
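To make the policy-evaluation step more concrete, here is a minimal sketch of a least-squares policy evaluation (LSPE) iteration with a degree-one polynomial basis. It is not the paper's storage model: the toy deterministic transition in `step`, the discount factor `ALPHA`, and the sampled state range are all assumptions chosen so the exact answer is known (the true value function is V(x) = x / 0.55).

```python
import random

ALPHA = 0.9  # discount factor (assumed for this toy example)

def features(x):
    """Degree-one polynomial basis [1, x]."""
    return (1.0, x)

def step(x):
    """Toy fixed-policy transition (hypothetical): return (reward, next state)."""
    return x, 0.5 * x

def fit_two_params(xs, targets):
    """Ordinary least squares for targets ~ a + b*x via the 2x2 normal equations."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(targets)
    sxy = sum(x * y for x, y in zip(xs, targets))
    det = n * sxx - sx * sx
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

def lspe(n_states=50, n_iters=100, seed=0):
    """Iterate: regress features onto reward + discounted current value estimate."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n_states)]
    a, b = 0.0, 0.0  # weights of the value approximation V(x) ~ a + b*x
    for _ in range(n_iters):
        targets = []
        for x in xs:
            reward, x_next = step(x)
            phi_next = features(x_next)
            targets.append(reward + ALPHA * (a * phi_next[0] + b * phi_next[1]))
        a, b = fit_two_params(xs, targets)
    return a, b
```

In this toy setting each iteration maps the slope `b` to `1 + 0.45 * b`, so `lspe()` converges to an intercept near 0 and a slope near 1 / 0.55. In an approximate policy iteration loop, this evaluation step would alternate with a policy improvement step that acts greedily with respect to the fitted value function.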