We assess the potential of the approximate dynamic programming (ADP) approach for process control, especially as a method to complement the model predictive control (MPC) approach. In the artificial intelligence (AI) and operations research (OR) research communities, ADP has recently seen significant activity as an effective method for solving Markov decision processes (MDPs), which represent a class of multi-stage decision problems under uncertainty. Process control problems are similar to MDPs, the key difference being their continuous state and action spaces as opposed to discrete ones. In addition, unlike in other popular ADP application areas such as robotics or games, in process control applications the first and foremost concern should be the safety and economics of the ongoing operation rather than efficient learning. We explore different options within ADP design, such as the pre-decision vs. post-decision state value function, parametric vs. nonparametric value function approximators, batch-mode vs. continuous-mode learning, and exploration vs. robustness. We argue that ADP possesses great potential, especially for obtaining effective control policies for stochastic constrained nonlinear or linear systems and for continually improving them towards optimality. (C) 2010 Elsevier Ltd. All rights reserved.
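The pre-decision vs. post-decision distinction discussed in this abstract can be illustrated with a small sketch (not from the paper; the storage problem, cost parameters, and table-lookup approximator below are invented for illustration). The post-decision state is the state immediately after the action but before the random demand arrives; approximating the value of that state lets the greedy decision avoid computing an expectation over the uncertainty.

```python
import random

random.seed(0)

# Toy 1-D stochastic storage problem (illustrative only).
# Pre-decision state s; action a; post-decision state s_x = s + a;
# next pre-decision state s' = s_x - demand, with demand random.
# A lookup table approximates the post-decision value V(s_x),
# updated by stochastic approximation as in generic ADP.

STATES = range(0, 11)          # storage levels 0..10
ACTIONS = range(0, 4)          # order quantities 0..3
HOLD_COST, SHORT_COST = 1.0, 4.0
GAMMA = 0.9

V = {s: 0.0 for s in STATES}   # post-decision value estimates

def step_cost(s_x, demand):
    left = s_x - demand
    return HOLD_COST * max(left, 0) + SHORT_COST * max(-left, 0)

def greedy(s):
    # key point: minimizing over V(post-decision state) needs no
    # expectation over demand inside the min
    return min(ACTIONS, key=lambda a: V[min(s + a, 10)])

for it in range(5000):
    s = random.choice(list(STATES))
    a = greedy(s)
    s_x = min(s + a, 10)
    d = random.randint(0, 3)
    cost = step_cost(s_x, d)
    s_next = max(s_x - d, 0)
    a_next = greedy(s_next)
    target = cost + GAMMA * V[min(s_next + a_next, 10)]
    alpha = 1.0 / (1 + it // 100)        # decaying step size
    V[s_x] += alpha * (target - V[s_x])
```

A pre-decision formulation would instead approximate V(s) and require the expectation over demand inside every greedy minimization, which is the step the post-decision design removes.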
Three novel approximate dynamic programming algorithms, based on temporal, spatial, and spatiotemporal decomposition, are proposed for the economic dispatch problem (EDP) in a distribution energy system with complex topology and many non-dispatchable renewable energy sources and energy storage systems (ESS). The computational efficiency of the proposed algorithms is compared, and convergence to the optimal solution is demonstrated in numerical experiments on the two-day hourly EDP for the IEEE 33bw test network, which has 200+ consumers, 150+ energy storages, and 1000+ consuming devices. Copyright (C) 2022 The Authors.
ISBN (Print): 9781479964154
Intermittent electricity generation from renewable sources is characterized by a wide range of fluctuations in the frequency spectrum. The medium-frequency component (0.01-1 Hz) cannot be filtered out by system inertia and automatic generation control (AGC), and thus degrades frequency quality. In this paper, an approximate dynamic programming (ADP) based supplementary frequency controller for thermal generators is developed to attenuate renewable generation fluctuations in the medium-frequency range. A policy-iteration-based training algorithm is employed for online, model-free learning. Our simulation results demonstrate that the proposed supplementary frequency controller can effectively adapt to changes in the system and provide improved frequency control. Further sensitivity analysis validates that the supplementary frequency controller significantly attenuates the dependence of the frequency deviation on the medium-frequency component of renewable generation fluctuation.
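The two alternating steps of policy iteration, which this abstract's training algorithm builds on, can be sketched on a small synthetic MDP (illustrative only: the paper's controller learns online and model-free, whereas this toy assumes known transition matrices so that each step is explicit).

```python
import numpy as np

np.random.seed(1)

# Small random MDP: P[a, s, s'] are transition probabilities,
# R[a, s] are rewards; gamma is the discount factor.
n_s, n_a, gamma = 4, 2, 0.95
P = np.random.dirichlet(np.ones(n_s), size=(n_a, n_s))
R = np.random.rand(n_a, n_s)

pi = np.zeros(n_s, dtype=int)          # initial policy
for _ in range(50):
    # 1) policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly
    P_pi = P[pi, np.arange(n_s)]
    r_pi = R[pi, np.arange(n_s)]
    v = np.linalg.solve(np.eye(n_s) - gamma * P_pi, r_pi)
    # 2) policy improvement: act greedily w.r.t. Q(s, a)
    Q = R + gamma * P @ v              # shape (n_a, n_s)
    new_pi = Q.argmax(axis=0)
    if np.array_equal(new_pi, pi):     # fixed point reached
        break
    pi = new_pi
```

In the online, model-free setting of the paper, step 1 would be replaced by estimating the value (or Q) function from measured trajectories rather than solving the linear system.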
The United States Air Force (USAF) makes officer accession and promotion decisions annually. Optimal manpower planning of the commissioned officer corps is vital to ensuring a well-balanced manpower system. A manpower system that is neither over-manned nor under-manned is desirable, as it is most cost-effective. The Air Force Officer Manpower Planning Problem (AFO-MPP) is introduced, which models officer accessions, promotions, and the uncertainty in retention rates. The objective of the AFO-MPP is to identify the policy for accession and promotion decisions that minimizes the expected total discounted cost of maintaining the required number of officers in the system over an infinite time horizon. The AFO-MPP is formulated as an infinite-horizon Markov decision problem, and a policy is found using approximate dynamic programming. A least-squares temporal differencing (LSTD) algorithm is employed to determine the best approximate policies. Six computational experiments are conducted with varying retention rates and officer manning starting conditions. The policies determined by the LSTD algorithm are compared to the benchmark policy, which is the policy currently practiced by the USAF. Results indicate that when the manpower system starts in a state with on-target numbers of officers per rank, the ADP policy outperforms the benchmark policy. When the starting state is unbalanced, with more officers in junior ranking positions, the benchmark policy outperforms the ADP policy. When the starting state is unbalanced, with more officers in senior ranking positions, there is no statistically significant difference between the ADP and benchmark policies. In this starting state, the ADP policy has smaller variance, indicating it is more dependable than the benchmark policy.
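The LSTD idea used here can be sketched in a few lines (a generic illustration, not the AFO-MPP model: the random-walk chain and polynomial features below are invented). LSTD fits a linear value function V(s) ≈ φ(s)·w by accumulating the normal equations A w = b from sampled transitions and solving them once, rather than by stochastic gradient steps.

```python
import numpy as np

np.random.seed(0)

gamma = 0.9
n_states, k = 5, 3

def phi(s):
    # illustrative polynomial features of the normalized state
    x = s / (n_states - 1)
    return np.array([1.0, x, x * x])

# Simulate a bounded random walk; reward 1 for reaching the right end.
A = np.zeros((k, k))
b = np.zeros(k)
s = 2
for _ in range(10000):
    s_next = min(max(s + np.random.choice([-1, 1]), 0), n_states - 1)
    r = 1.0 if s_next == n_states - 1 else 0.0
    f, f_next = phi(s), phi(s_next)
    # LSTD(0) accumulators: A += phi (phi - gamma phi')^T, b += phi r
    A += np.outer(f, f - gamma * f_next)
    b += f * r
    s = s_next

w = np.linalg.solve(A, b)   # LSTD weight vector
```

The one-shot linear solve is what makes LSTD sample-efficient compared with temporal-difference updates that touch the weights once per transition.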
Multistage stochastic programs pose some of the more challenging optimization problems. Because such models can become rather intractable in general, it is important to design algorithms that provide approximations which, in the long run, yield solutions arbitrarily close to an optimum. In this paper, we propose such a sequential sampling method, applicable to multistage stochastic linear programs, which we refer to as the multistage stochastic decomposition (MSD) algorithm. This algorithm represents a dynamic extension of a regularized version of stochastic decomposition (SD). While the method allows general correlation structures, specialized streamlined versions are also possible for the special cases of stagewise independent and autoregressive processes commonly incorporated in stochastic programming. As with its two-stage counterpart, the MSD algorithm is shown to provide an asymptotically optimal solution with probability one. As a by-product of this study, we also show that SD algorithms draw upon features of both approximate dynamic programming and stochastic programming.
Internet Service Providers (ISPs) have the ability to route their traffic over different network providers. This study investigates the optimal routing strategy under multihoming in the case where network providers charge ISPs according to top-percentile pricing (i.e., based on the theta-th highest volume of traffic shipped). We call this problem the Top-percentile Traffic Routing Problem (TpTRP). The TpTRP is a multistage stochastic optimization problem: the routing decision for each time period must be made before the amount of traffic to be sent is known. This stochastic nature of the problem is the critical difficulty of this study. Solution approaches based on stochastic integer programming or stochastic dynamic programming (SDP) suffer from the curse of dimensionality, which restricts their applicability. To overcome this, we suggest using approximate dynamic programming, which exploits the structure of the problem to construct continuous approximations of the value functions in SDP. Thus, the curse of dimensionality is largely avoided.
Growing penetration of renewable distributed generation, a major concern nowadays, has played a critical role in distribution system operation. This paper develops a state-based sequential network reconfiguration strategy using a Markov decision process (MDP) model, with the objective of minimizing renewable distributed generation curtailment and load shedding under operational constraints. The available power outputs of distributed generators and the system topology at each decision time are represented as Markov states, which transition to other Markov states at the next decision time under the uncertainties of renewable distributed generation. For each Markov state at each decision time, a recursive optimization model with a current cost and a future cost is developed to make state-based actions, including system reconfiguration, load shedding, and distributed generation curtailment. To address the curse of dimensionality caused by the enormous numbers of states and actions in the proposed model, an approximate dynamic programming (ADP) approach, including post-decision states and a forward dynamic algorithm, is used to solve the proposed MDP-based model. The IEEE 33-bus and IEEE 123-bus systems are used to validate the proposed model.
This paper focuses on the economical real-time operation of a microgrid (MG). A novel dynamic energy management system is developed to incorporate efficient management of the energy storage system into MG real-time dispatch while considering power flow constraints and uncertainties in load, renewable generation, and real-time electricity price. The developed dynamic energy management mechanism does not require long-term forecasting and optimization or knowledge of the uncertainty distributions, yet can still optimize the long-term operational costs of MGs. First, the real-time scheduling problem is modeled as a finite-horizon Markov decision process over a day. Then, approximate dynamic programming and deep recurrent neural network learning are employed to derive a near-optimal real-time scheduling policy. Finally, using real power grid data from the California Independent System Operator, a detailed simulation study is carried out to validate the effectiveness of the proposed method.
We propose an approximate algorithm to dynamically assign a multi-skilled workforce to the stations of a job shop, with demand uncertainty and variability in the availability of the resources, to maximize productivity. The proposed model is inspired by automotive glass manufacturing, where maximizing the surface area of manufactured safety glass during a given time frame is the key performance measure. We first develop the model of a traditional job shop with a set of stations, each with a particular number of machines with distinct production performance levels according to their utilization stage. Each product type needs to be processed on a subset of these stations according to a predefined sequence. Customers place their orders independently over time, specifying the units required of each product type. The inter-arrival times of orders (demand) and the processing times are assumed to be stochastic. We also suppose that the technicians have varied skill sets, according to which they can only work at a certain subgroup of stations, and variable availability depending on sick leave, vacations, etc. Hence, in order to maximize the predefined productivity index, the optimal assignment of technicians to the stations based on their skill sets and availability during each shift becomes a complex decision-making process. Given the stochastic and dynamic nature of this problem, we model the setting as a Markov decision process (MDP). Given its size, we propose to solve it using approximate dynamic programming (ADP). We address the exponential growth of the action space by using a hill-climbing algorithm for action selection. To show the performance and effectiveness of the proposed algorithm, we use real company data and compare the results of the algorithm with the current policy in use, as well as other proposed policies. Applying our proposed method resulted in an average improvement of 15% in productivity compared to the best-performing benchmark policy. (c) 2022 Elsevier B.V.
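The hill-climbing action search mentioned in this abstract can be sketched as follows (the station names, skill sets, and scoring function are invented for illustration, not the paper's). Instead of enumerating all |stations|^T joint assignments of T technicians, the search starts from a feasible assignment and repeatedly applies the best single-technician move until no move improves the score.

```python
import random

random.seed(42)

stations = ["cut", "temper", "laminate"]
skills = {                       # eligible stations per technician
    "t1": ["cut", "temper"],
    "t2": ["temper", "laminate"],
    "t3": ["cut", "laminate"],
}
need = {"cut": 1, "temper": 1, "laminate": 1}   # staffing targets

def score(assign):
    # toy stand-in for the productivity index: penalize deviation
    # from the target staffing at each station
    count = {st: 0 for st in stations}
    for st in assign.values():
        count[st] += 1
    return -sum(abs(count[st] - need[st]) for st in stations)

# start from a random feasible assignment, then hill-climb over
# single-technician reassignments
assign = {t: random.choice(opts) for t, opts in skills.items()}
improved = True
while improved:
    improved = False
    for t, opts in skills.items():
        for st in opts:
            cand = dict(assign, **{t: st})
            if score(cand) > score(assign):
                assign, improved = cand, True
```

Each iteration evaluates only sum of |eligible stations| candidate moves, so the per-decision cost grows linearly in the number of technicians rather than exponentially, at the price of possibly stopping at a local optimum.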