The network revenue management (RM) problem arises in airline, hotel, media, and other industries where the products sold use multiple resources. It can be formulated as a stochastic dynamic program, but the dynamic program is computationally intractable because of an exponentially large state space, and a number of heuristics have been proposed to approximate its value function. In this paper we show that the piecewise-linear approximation to the network RM dynamic program is tractable; specifically, we show that the separation problem of the approximation can be solved as a relatively compact linear program. Moreover, the resulting compact formulation of the approximate dynamic program turns out to be exactly equivalent to the Lagrangian relaxation of the dynamic program, an earlier heuristic method proposed for the same problem. We numerically compare solving the problem by generating separating cuts against solving our compact linear program. We discuss extensions to versions of the network RM problem with overbooking, as well as the difficulties of extending the approach to the choice model of network RM.
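For orientation, the dynamic program referred to in this abstract is commonly written as below. This is a sketch in assumed notation, not taken from the abstract itself: $x$ is the vector of remaining capacities, $f_j$ the fare of product $j$, $A^j$ the vector of resources that product $j$ consumes, $p_{j,t}$ the probability of a request for product $j$ in period $t$ (at most one request per period), and $u_j$ the accept/reject decision. The second line shows one separable form that a piecewise-linear approximation can take, with one function $v_{i,t}$ per resource $i$.

```latex
\begin{align*}
V_t(x) &= \sum_{j} p_{j,t} \max_{\substack{u_j \in \{0,1\} \\ u_j A^j \le x}}
          \Bigl\{ u_j \bigl[ f_j + V_{t+1}(x - A^j) \bigr] + (1 - u_j)\, V_{t+1}(x) \Bigr\}
          + \Bigl( 1 - \sum_{j} p_{j,t} \Bigr) V_{t+1}(x),
          \qquad V_{T+1}(x) = 0, \\
V_t(x) &\approx \sum_{i} v_{i,t}(x_i).
\end{align*}
```

The state $x$ ranges over all capacity vectors, which is the source of the exponentially large state space; the separable form replaces it with one function of a single resource level per resource and period.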
We discuss the computational complexity and feasibility properties of scenario sampling techniques for uncertain optimization programs. We propose an alternative way of dealing with a special class of stagewise coupled programs and compare it with existing methods in the literature in terms of feasibility and computational complexity. We identify trade-offs between different methods depending on the problem structure and the desired probability of constraint satisfaction. To illustrate our results, we consider an example from the area of approximate dynamic programming. (C) 2016 Elsevier B.V. All rights reserved.
This paper addresses the non-preemptive single machine scheduling problem to minimize total tardiness. We are interested in the online version of this problem, where orders arrive at the system at random times. Jobs have to be scheduled without knowledge of which jobs will arrive later. The processing times and the due dates become known when the order is placed. Orders are released only at the beginning of periodic intervals. A customized approximate dynamic programming method is introduced for this problem. We also present numerical experiments that assess the reliability of the new approach and show that it performs better than a myopic policy.
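To make the myopic baseline concrete, here is a minimal sketch of one possible myopic rule. The abstract does not say which rule the authors use; the earliest-due-date (EDD) dispatch below, the `Job` fields, and the function name `myopic_edd_schedule` are all assumptions chosen for illustration. Whenever the machine is free, the rule starts the released job with the earliest due date and runs it to completion without preemption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    release: float     # time at which the job becomes known and available
    processing: float  # processing time, known at release
    due: float         # due date, known at release


def myopic_edd_schedule(jobs):
    """Dispatch jobs non-preemptively with an earliest-due-date rule (assumed
    myopic policy). Returns the processing order and the total tardiness."""
    pending = list(jobs)
    clock = 0.0
    order = []
    total_tardiness = 0.0
    while pending:
        available = [j for j in pending if j.release <= clock]
        if not available:
            # Machine idles until the next job is released.
            clock = min(j.release for j in pending)
            continue
        # Myopic choice: only jobs already released are considered.
        job = min(available, key=lambda j: (j.due, j.release))
        pending.remove(job)
        clock += job.processing
        total_tardiness += max(0.0, clock - job.due)
        order.append(job.name)
    return order, total_tardiness


example_jobs = [
    Job("A", release=0.0, processing=4.0, due=8.0),
    Job("B", release=0.0, processing=3.0, due=5.0),
    Job("C", release=2.0, processing=2.0, due=6.0),
]
order, tardiness = myopic_edd_schedule(example_jobs)
print(order, tardiness)
```

The rule ignores jobs that have not yet arrived, which is the weakness an approximate dynamic programming method can address by valuing the state left behind after each decision.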
We show that a convex relaxation, introduced by Sridharan, McEneaney, Gu and James to approximate the value function of an optimal control problem arising from quantum gate synthesis, is exact. This relaxation applies to the maximization of a class of concave piecewise affine functions over the unitary group.
This paper proposes a dynamic vehicle routing problem (DVRP) model with nonstationary stochastic travel times under traffic congestion. Depending on the traffic conditions, the travel time between two nodes, particularly in a city, may not be proportional to distance and changes both dynamically and stochastically over time. For this environment, we propose a Markov decision process model of the problem and solve it with a rollout-based approach, using approximate dynamic programming to avoid the curse of dimensionality. We also investigate how to estimate the probability distribution of travel times on arcs which, reflecting reality, are considered to consist of multiple road segments. Experiments are conducted using a real-world problem faced by a Singapore logistics/delivery company and authentic road traffic information.
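The general idea of a rollout policy can be sketched as follows. This is a generic illustration, not the authors' algorithm: the nearest-neighbour base policy, the uniform multiplicative noise in `sample_time`, and all function names are assumptions, and the noise is stationary here for brevity, whereas the paper treats nonstationary travel times. At each decision, every candidate next customer is scored by simulating the base policy to the end of the route under sampled travel times, and the candidate with the lowest average simulated time is chosen.

```python
import random


def sample_time(mean, rng):
    # Hypothetical congestion model: the mean travel time scaled by random noise.
    return mean * rng.uniform(0.8, 1.6)


def greedy_completion(current, remaining, mean_time, rng):
    """Base policy: always drive to the nearest unvisited customer (by mean
    time). Returns one sampled travel time for finishing the route."""
    total = 0.0
    remaining = set(remaining)
    while remaining:
        nxt = min(remaining, key=lambda n: (mean_time[current][n], n))
        total += sample_time(mean_time[current][nxt], rng)
        remaining.remove(nxt)
        current = nxt
    return total


def rollout_next(current, remaining, mean_time, rng, n_samples=30):
    """One rollout step: estimate the cost-to-go of each candidate by
    simulating the base policy, then pick the cheapest candidate."""
    best, best_cost = None, float("inf")
    for cand in sorted(remaining):
        cost = 0.0
        for _ in range(n_samples):
            first_leg = sample_time(mean_time[current][cand], rng)
            cost += first_leg + greedy_completion(
                cand, remaining - {cand}, mean_time, rng
            )
        cost /= n_samples
        if cost < best_cost:
            best, best_cost = cand, cost
    return best


def rollout_route(depot, customers, mean_time, seed=0):
    rng = random.Random(seed)
    current, remaining, route = depot, set(customers), [depot]
    while remaining:
        nxt = rollout_next(current, remaining, mean_time, rng)
        route.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return route


# Mean travel times (minutes) between a depot "D" and three customers.
mean_time = {
    "D": {"a": 10, "b": 12, "c": 30},
    "a": {"D": 10, "b": 25, "c": 8},
    "b": {"D": 12, "a": 25, "c": 9},
    "c": {"D": 30, "a": 8, "b": 9},
}
print(rollout_route("D", ["a", "b", "c"], mean_time))
```

Because the lookahead is evaluated by simulation rather than by enumerating states, the cost of each decision grows with the number of candidates and samples instead of with the size of the state space.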
In this paper, we focus on the problems of load scheduling and power trading in systems with high penetration of renewable energy resources (RERs). We adopt approximate dynamic programming to schedule the operation of different types of appliances including must-run and controllable appliances. We assume that users can sell their excess power generation to other users or to the utility company. Since it is more profitable for users to trade energy with other users locally, users with excess generation compete with each other to sell their respective extra power to their neighbors. A game theoretic approach is adopted to model the interaction between users with excess generation. In our system model, each user aims to obtain a larger share of the market and to maximize its revenue by appropriately selecting its offered price and generation. In addition to yielding a higher revenue, consuming the excess generation locally reduces the reverse power flow, which impacts the stability of the system. Simulation results show that our proposed algorithm reduces the energy expenses of the users. The proposed algorithm also facilitates the utilization of RERs by encouraging users to consume excess generation locally rather than injecting it back into the power grid.
An online adaptive optimal control is proposed for continuous-time nonlinear systems with completely unknown dynamics, which is achieved by developing a novel identifier-critic-based approximate dynamic programming algorithm with a dual neural network (NN) approximation structure. First, an adaptive NN identifier is designed to obviate the requirement of complete knowledge of system dynamics, and a critic NN is employed to approximate the optimal value function. Then, the optimal control law is computed based on the information from the identifier NN and the critic NN, so that the actor NN is not needed. In particular, a novel adaptive law design method with the parameter estimation error is proposed to update the weights of both the identifier NN and the critic NN online and simultaneously; these weights converge to small neighbourhoods around their ideal values. The closed-loop system stability and the convergence to a small vicinity around the optimal solution are both proved by means of Lyapunov theory. The proposed adaptation algorithm is also improved to achieve finite-time convergence of the NN weights. Finally, simulation results are provided to exemplify the efficacy of the proposed methods.
Despite a growing number of studies in stochastic dynamic network optimization, the field remains less well defined and unified than other areas of network optimization. Due to the need for approximation methods like approximate dynamic programming, one of the most significant problems yet to be solved is the lack of adequate benchmarks. The values of the perfect information policy and static policy are not sensitive to information propagation, while the myopic policy does not distinguish network effects in the value of flexibility. We propose a scalable reference policy value defined from theoretically consistent real option values based on sampled sequences, and estimate it using extreme value distributions. The reference policy is evaluated on an existing network instance with known sequences (Sioux Falls network from Chow and Regan 2011a): the Weibull distribution demonstrates good fit and sampling consistency with more than 200 samples. The reference policy is further applied in computational experiments with two other types of adaptive network design: a facility location and timing problem on the Simchi-Levi and Berman (1988) network, and Hyytia et al.'s (2012) dynamic dial-a-ride problem. The former experiment represents an application of a new problem class and use of the reference policy as an upper bound for evaluating sampled policies, which can reach a 3% gap with 350 samples. The latter experiment demonstrates that sensitivity to parameters may be greater than expected, particularly when benchmarked against the proposed reference policy.
In this paper, the robust decentralized stabilization of continuous-time uncertain nonlinear systems with multiple control stations is developed using a neural network based online optimal control approach. The novelty lies in extending the well-known adaptive dynamic programming method to deal with the nonlinear feedback control problem in an uncertain, large-scale environment. By introducing an appropriate bounded function and defining a modified cost function, it can be observed that the decentralized optimal controller of the nominal system can achieve robust decentralized stabilization of the original uncertain system. Then, a critic neural network is constructed for solving the modified Hamilton-Jacobi-Bellman equation corresponding to the nominal system in an online fashion. The weights of the critic network are tuned based on the standard steepest descent algorithm, with an additional term provided to guarantee the boundedness of system states. The stability analysis of the closed-loop system is carried out via the Lyapunov approach. Finally, two simulation examples are given to verify the effectiveness of the presented control approach.
An adaptive optimal control algorithm for systems with uncertain dynamics is formulated under a Reinforcement Learning framework. An embedded exploratory component is included explicitly in the objective function of a...