ISBN: 9781665424431 (print)
This work presents different strategies for a robot to learn the optimal operation of a diverse electrical energy generation system that includes resources such as thermal, hydroelectric, wind, and solar generators and energy accumulators. The large number of variables in these systems results in a huge state space. Thus, computing an explicit representation of the cost function over that space, which is at the heart of most current optimization methods, becomes infeasible. The strategies presented here aim to solve this problem by learning an implicit representation of the cost function over the state space. Another key idea is to keep the complexity of the representation to a minimum, in order to obtain a solution that captures the most relevant characteristics of the cost-to-go of the system with the fewest possible parameters.
ISBN: 9789897585227 (print)
This work considers (deep) artificial feed-forward neural networks as parametric approximators in optimal control of discrete-time switched linear systems with controlled switching. The proposed approach is based on approximate dynamic programming and allows the fast computation of (sub-)optimal discrete and continuous control inputs, either by approximating the optimal cost-to-go functions or by approximating the optimal discrete and continuous input policies. An important property of the approach is the satisfaction of polytopic state and input constraints, which is crucial for ensuring safety, as required in many control applications. A numerical example is provided to illustrate and evaluate the approaches.
Attended home delivery requires offering narrow delivery time slots for online booking. Given a fixed fleet of delivery vehicles and uncertainty about the value of potential future customers, retailers have to decide ...
ISBN: 9781665435970 (print)
The recent integration of renewable resources in electricity markets has increased the need for producers to correct their trading position close to real time in order to avoid volatile real-time prices. The last market to close before delivery is the Continuous Intraday Market. Therefore, this market is an interesting outlet for renewable units that aim to cover their forecast errors. As a starting point for tackling this problem, we characterize an optimal policy for trading a fixed quantity in a simplified market model. We use this analytical solution as a basis for developing an approximate dynamic programming algorithm and an alternative stochastic dual dynamic programming algorithm that can trade under a more realistic set of assumptions.
ISBN: 9781728191423 (print)
The cyclic air braking strategy on long, steep downward slopes is one of the main challenges in heavy haul train control in China. To address this challenge, this paper proposes a method for optimizing the cyclic air braking strategy based on the approximate dynamic programming (ADP) algorithm, which can achieve low maintenance costs and high running efficiency while ensuring safe operation. The optimization problem is described with the characteristics of heavy haul railways in China taken into account. The cyclic air braking strategy on long, steep downward slopes is then formalized as a Markov decision process (MDP), and the critical elements of the ADP methodology are introduced according to the constraints and optimization objectives. Further, a value-iteration-based ADP approach is proposed to solve the optimization problem of the cyclic air braking strategy. Simulation experiments are carried out with real-world data from the Shuohuang Line to illustrate the effectiveness of the proposed approach.
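As a rough illustration of the value-iteration idea mentioned in this abstract, the following minimal sketch runs tabular value iteration on a toy two-state MDP. Everything in it is an assumption made for illustration: the states ("safe speed", "overspeed"), the actions ("coast", "brake"), the transition probabilities, the costs, and the discount factor are invented, and the paper's ADP method is not reproduced here.

```python
def value_iteration(P, C, gamma=0.9, tol=1e-10, max_iter=100_000):
    """Tabular value iteration for a cost-minimizing MDP.

    P[a][s][s2] is the probability of moving from state s to s2 under action a.
    C[s][a] is the stage cost of taking action a in state s.
    Returns the converged value list V and a greedy policy.
    """
    n_states, n_actions = len(C), len(C[0])

    def q_value(V, s, a):
        # Stage cost plus discounted expected cost-to-go.
        return C[s][a] + gamma * sum(P[a][s][s2] * V[s2] for s2 in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [min(q_value(V, s, a) for a in range(n_actions)) for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V, V_new))
        V = V_new
        if delta < tol:
            break

    policy = [min(range(n_actions), key=lambda a: q_value(V, s, a)) for s in range(n_states)]
    return V, policy


# Hypothetical toy problem: state 0 = safe speed, state 1 = overspeed;
# action 0 = coast, action 1 = brake.
P = [
    [[0.5, 0.5], [0.0, 1.0]],  # coast: may drift into overspeed, stays there
    [[1.0, 0.0], [1.0, 0.0]],  # brake: always returns to safe speed
]
C = [
    [0.0, 1.0],   # safe speed: coasting is free, braking costs wear
    [10.0, 1.0],  # overspeed: coasting is heavily penalized
]

V, policy = value_iteration(P, C)
```

In this toy instance the greedy policy coasts at safe speed and brakes when overspeeding. A practical ADP method would replace the table `V` with a parametric approximation, since the state space of a real train model is far too large to enumerate.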
ISBN: 9781728190358 (print)
The Hamilton-Jacobi-Bellman (HJB) equation is the sufficient and necessary condition for the continuous-time optimal control problem (OCP). Unlike the infinite-horizon HJB equation, the finite-horizon HJB equation contains a time-dependent value function, whose partial derivative with respect to time is an intractable unknown term. We find that this partial derivative exactly equals the terminal-time utility function, by analyzing the initial-time equivalence between the fixed-time-horizon OCP and the fixed-terminal-time OCP. We also provide a second proof, which uses the definition of the partial derivative. This finding allows traditional approximate dynamic programming (ADP) algorithms to be reused to approximate the optimal policy with a parameterized function such as a neural network, thus solving the continuous-time finite-horizon OCP. The correctness of our finding is evaluated by analyzing a linear quadratic problem.
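For orientation, a standard textbook form of the finite-horizon HJB equation is sketched below. All symbols are assumptions, since the abstract does not define any notation: $V^*(x,t)$ is the optimal value function, $l(x,u)$ the utility (running cost), $\dot{x} = f(x,u)$ the system dynamics, $T$ the terminal time, and $\phi$ a terminal cost. The left-hand side is the time derivative that the abstract describes as the intractable unknown term; the sketch does not reproduce the paper's result about it.

```latex
-\frac{\partial V^*(x,t)}{\partial t}
  = \min_{u} \left\{ l(x,u)
  + \left( \frac{\partial V^*(x,t)}{\partial x} \right)^{\!\top} f(x,u) \right\},
\qquad V^*(x,T) = \phi(x)
```

In the infinite-horizon, time-invariant case the value function does not depend on $t$, so the left-hand side vanishes, which is the difference the abstract points to.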
Internet Service Providers (ISPs) have the ability to route their traffic over different network providers. This study investigates the optimal routing strategy under multihoming in the case where network providers charge ISPs according to top-percentile pricing (i.e., based on the θ-th highest volume of traffic shipped). We call this problem the Top-percentile Traffic Routing Problem (TpTRP). The TpTRP is a multistage stochastic optimization problem: the routing decision for every time period must be made before the amount of traffic to be sent is known. The stochastic nature of the problem is the critical difficulty of this study. Solution approaches based on stochastic integer programming or stochastic dynamic programming (SDP) suffer from the curse of dimensionality, which restricts their applicability. To overcome this, we suggest using approximate dynamic programming, which exploits the structure of the problem to construct continuous approximations of the value functions in SDP. Thus, the curse of dimensionality is largely avoided.
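To make the pricing rule concrete, the following minimal sketch computes the charge for a given routing under top-percentile pricing as described in the abstract: each provider bills on the θ-th highest per-period volume routed through it. The function names, the per-unit prices, and the traffic figures are hypothetical and chosen only for illustration; the routing optimization itself is not shown.

```python
def top_percentile_volume(volumes, theta):
    """Return the theta-th highest volume (theta = 1 is the peak period)."""
    if not 1 <= theta <= len(volumes):
        raise ValueError("theta must be between 1 and the number of periods")
    return sorted(volumes, reverse=True)[theta - 1]


def total_charge(routed, unit_prices, theta):
    """Sum the top-percentile charges over all providers.

    routed[p] lists the per-period volumes sent through provider p.
    unit_prices[p] is an assumed price per unit of billed volume.
    """
    return sum(
        price * top_percentile_volume(volumes, theta)
        for volumes, price in zip(routed, unit_prices)
    )


# Hypothetical traffic over five periods, split between two providers.
routed = [
    [5, 1, 9, 3, 7],  # provider 0
    [2, 8, 0, 6, 4],  # provider 1
]
unit_prices = [1.0, 2.0]

charge = total_charge(routed, unit_prices, theta=2)
```

Because only the θ-th highest volume is billed, the θ - 1 busiest periods on each provider do not raise that provider's charge beyond the billed level, and that is the structure a routing policy can exploit.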
ISBN: 9784907764739 (print)
This paper proposes an energy management strategy (EMS) for connected power-split hybrid electric vehicles (HEVs). Specifically, a long short-term memory (LSTM) network is used to predict the velocity trajectory of the preceding vehicle over the next few seconds, using the information provided by vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. The consumption of fuel and electricity is then chosen as the cost function of the optimization, and a model predictive control (MPC) approach is adopted to optimize it. To address the problem of dimension explosion, the cost-to-go function is optimized through approximate dynamic programming (ADP). Finally, the effectiveness of the EMS is verified on the MATLAB/Simulink platform.
Accounting for externalities generated by fire spread is necessary for managing fire risk on landscapes with multiple owners. In this paper, we determine the optimal management of a synthetic landscape parameterized to represent the ecological conditions of Douglas-fir (Pseudotsuga menziesii) plantations in southwest Oregon. The problem is formulated as a dynamic game, where each agent maximizes their own objective without considering the welfare of the other agents. We demonstrate a method for incorporating spatial information and externalities into a dynamic optimization process. A machine-learning technique, approximate dynamic programming, is applied to determine the optimal timing and location of fuel treatments and timber harvests for each agent. The value functions we estimate explicitly account for the spatial interactions that generate fire risk. They provide a way to model the expected benefits, costs, and externalities associated with management actions that have uncertain consequences in multiple locations. The method we demonstrate is applied to analyze the effect of landscape fragmentation on landowner welfare and ecological outcomes. Study Implications: This research builds on several important ideas for forest management. Fire risk for any particular stand on a landscape is a function of vegetation conditions across the entire landscape. Landowners who wish to achieve a management objective that is affected by fire risk need to account for the risk generated by broader landscape conditions. This work expands on a tractable model to account for the spatial interactions generated by fire spread that affect the optimal timing and spatial location of timber harvest and fuel treatments. In this paper, we demonstrate that optimal behavior changes when there are multiple landowners. On a sufficiently fragmented landscape, one landowner's actions can create additional risk for their neighbors. This work suggests that policy interventions to incentivize risk
Following the occurrence of an extreme natural or man-made event, community recovery management should aim to provide optimal restoration policies for a community over a planning horizon. Calculating such optimal restoration policies in the presence of uncertainty poses significant challenges for community leaders. Stochastic scheduling for several interdependent infrastructure systems is a difficult control problem with huge decision spaces. The Markov decision process (MDP)-based optimization approach proposed in this study incorporates different sources of uncertainty to compute the restoration policies. The computation of optimal scheduling presented herein employs the rollout algorithm, which provides an effective computational tool for optimization problems dealing with real-world large-scale networks and communities. The proposed methodology is applied to a realistic community recovery problem, where different decision-making objectives are considered. The approach accommodates current restoration strategies employed in recovery management; computational results indicate that the restoration policies identified herein significantly outperform the current recovery strategies. Finally, the applicability of the method to different risk attitudes of policymakers, including risk-neutral and risk-averse attitudes in community recovery management, is examined.
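The rollout idea named in this abstract can be sketched on a small deterministic repair-scheduling problem: at each step, every candidate repair is evaluated by completing the schedule with a fixed base policy, and the candidate with the lowest resulting cost is chosen. The problem setup is an assumption made for illustration (one repair crew, invented repair durations and importance weights, a weighted completion-time cost, and an index-order base policy); the study itself treats uncertainty and interdependent systems, which this sketch omits.

```python
def schedule_cost(order, durations, weights, start_time=0.0):
    """Weighted sum of completion times for repairs done in the given order."""
    t, cost = start_time, 0.0
    for i in order:
        t += durations[i]
        cost += weights[i] * t
    return cost


def base_policy(remaining):
    """Assumed base heuristic: repair the remaining components in index order."""
    return sorted(remaining)


def rollout_schedule(durations, weights):
    """Build a repair order by one-step lookahead over the base policy."""
    remaining = set(range(len(durations)))
    order, t = [], 0.0
    while remaining:
        best_i, best_cost = None, float("inf")
        for i in sorted(remaining):
            # Try component i next, then let the base policy finish the job.
            tail = [i] + base_policy(remaining - {i})
            cost = schedule_cost(tail, durations, weights, start_time=t)
            if cost < best_cost:
                best_i, best_cost = i, cost
        order.append(best_i)
        t += durations[best_i]
        remaining.remove(best_i)
    return order


# Hypothetical data: three damaged components.
durations = [4.0, 1.0, 2.0]  # repair times
weights = [1.0, 5.0, 2.0]    # importance of restoring each component

rollout_order = rollout_schedule(durations, weights)
rollout_cost = schedule_cost(rollout_order, durations, weights)
base_cost = schedule_cost(base_policy(range(len(durations))), durations, weights)
```

On this instance the rollout order has a lower cost than the index-order base policy. In a stochastic setting, the single deterministic completion would typically be replaced by an average over simulated outcomes.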