The global tuberculosis (TB) control plan has historically emphasized passive case finding (PCF) as the most practical approach for identifying TB suspects in high-burden settings. The success of this approach in controlling TB depends on infectious individuals recognizing their symptoms and voluntarily seeking diagnosis rapidly enough to reduce onward transmission. It now appears, at least in some settings, that more intensified case-finding (ICF) approaches may be needed to control TB transmission; these more aggressive approaches for detecting as-yet undiagnosed cases obviously require additional resources to implement. Given that TB control programs are resource constrained and that the incremental yield of ICF is expected to wane over time as the pool of undiagnosed cases is depleted, a tool that can help policymakers identify when to implement or suspend an ICF intervention would be valuable. In this article, we propose dynamic case-finding policies that allow policymakers to use existing observations about the epidemic and resource availability to determine when to switch between PCF and ICF so as to use resources efficiently and optimize population health. Using mathematical models of TB/HIV coepidemics, we show that dynamic policies strictly dominate static policies that prespecify the frequency and duration of rounds of ICF. We also find that the use of a diagnostic tool with better sensitivity for detecting smear-negative cases (e.g., Xpert MTB/RIF) further improves the incremental benefit of these dynamic case-finding policies.
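The switching idea can be sketched with a deliberately simple compartmental toy (every parameter, threshold, and compartment below is illustrative, not the paper's calibrated TB/HIV model): ICF turns on when the observed undiagnosed pool exceeds an upper threshold and off when it falls below a lower one.

```python
# Toy compartmental sketch of a dynamic case-finding policy; every number
# here is illustrative, not from the paper's calibrated model.
def simulate(threshold_on=200.0, threshold_off=50.0, weeks=520):
    susceptible, undiagnosed, diagnosed = 99000.0, 1000.0, 0.0
    beta = 5e-7                       # transmission coefficient (hypothetical)
    pcf_rate, icf_rate = 0.02, 0.20   # weekly diagnosis rates under PCF / ICF
    icf_on, icf_weeks = False, 0
    for _ in range(weeks):
        # hysteresis switch driven by the observed undiagnosed pool
        if undiagnosed > threshold_on:
            icf_on = True
        elif undiagnosed < threshold_off:
            icf_on = False
        icf_weeks += icf_on
        rate = icf_rate if icf_on else pcf_rate
        new_infections = beta * susceptible * undiagnosed
        newly_diagnosed = rate * undiagnosed
        susceptible -= new_infections
        undiagnosed += new_infections - newly_diagnosed
        diagnosed += newly_diagnosed
    return undiagnosed, icf_weeks

final_undiagnosed, weeks_of_icf = simulate()
```

The hysteresis band (two thresholds rather than one cut-off) avoids rapid on/off chattering, so ICF rounds emerge endogenously from the epidemic state rather than from a prespecified schedule.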
This paper considers the problem of portfolio optimization in a market with partial information and discretely observed price processes. Partial information refers to the setting where assets have unobserved factors in the rate of return and the level of volatility. Standard filtering techniques are used to compute the posterior distribution of the hidden variables, but finding the optimal portfolio is difficult because the dynamic programming problem is non-Markovian. However, fast time-scale asymptotics can be exploited to obtain an approximate dynamic program (ADP) that is Markovian and therefore much easier to compute. The model under consideration has latent variables (also referred to as hidden states) with fast mean reversion to an invariant distribution that is parameterized by a Markov chain theta(t), where theta(t) represents the regime state of the market and reverts to its own invariant distribution over a much longer time scale. Data and numerical examples are also presented, and there appears to be evidence that unobserved drift results in an information premium.
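A minimal sketch of the filtering step, assuming a scalar Ornstein-Uhlenbeck drift observed through noisy discrete returns (a stand-in for the paper's richer regime-switching model; all parameter values are hypothetical):

```python
import random

# Scalar Kalman filter for an unobserved, mean-reverting drift observed
# through discretely sampled log-returns r = mu*dt + noise.
random.seed(0)
dt = 1.0 / 252
kappa, mu_bar, sig_mu = 5.0, 0.05, 0.3  # fast mean reversion of hidden drift
sigma = 0.2                             # return volatility (observation noise)

mu_true = mu_bar
mu_hat, var_hat = 0.0, 1.0              # Gaussian prior on the hidden drift
for _ in range(2000):
    # evolve the hidden drift (Euler OU step) and observe one log-return
    mu_true += kappa * (mu_bar - mu_true) * dt + sig_mu * dt ** 0.5 * random.gauss(0, 1)
    r = mu_true * dt + sigma * dt ** 0.5 * random.gauss(0, 1)
    # predict: push the posterior through the drift dynamics
    mu_hat += kappa * (mu_bar - mu_hat) * dt
    var_hat = (1.0 - kappa * dt) ** 2 * var_hat + sig_mu ** 2 * dt
    # update: since r = mu*dt + noise, the Kalman gain carries a dt scaling
    obs_var = var_hat * dt ** 2 + sigma ** 2 * dt
    gain = var_hat * dt / obs_var
    mu_hat += gain * (r - mu_hat * dt)
    var_hat *= 1.0 - gain * dt
```

The pair (mu_hat, var_hat) is the posterior the portfolio rule would act on; the non-Markovian difficulty in the paper arises because the full posterior, not the true state, drives the dynamic program.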
ISBN:
(Print) 9788993215052
Nonlinear systems under uncertainty are difficult to regulate with guaranteed stability and optimality. This study presents a switching control strategy that consists of a robust control Lyapunov function-based predictive controller and an approximate dynamic programming-based controller. The former guarantees robust stability within a level set, referred to as the region of attraction (ROA). The latter improves optimality and reduces the computational complexity of solving the Bellman equation when the system is outside the ROA. The suggested approach is illustrated on a continuous stirred tank reactor example.
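A scalar sketch of the switching logic under stated assumptions (the Lyapunov function, both controllers, and the dynamics below are invented for illustration, not the paper's CSTR example):

```python
# Inside the region of attraction (the level set V(x) <= c), use the
# Lyapunov-based controller; outside it, fall back to the ADP controller.
def V(x):
    return x * x                 # control Lyapunov function (illustrative)

def clf_controller(x):
    return -2.0 * x              # stabilizing inside the ROA

def adp_controller(x):
    return -1.5 * x - 0.1 * x ** 3   # stand-in for a learned policy

def switching_control(x, roa_level=1.0):
    return clf_controller(x) if V(x) <= roa_level else adp_controller(x)

# closed-loop Euler simulation of x_{k+1} = x_k + 0.1 * (x_k + u_k)
x = 3.0
for _ in range(100):
    x = x + 0.1 * (x + switching_control(x))
```

Starting outside the level set, the ADP-style controller steers the state into the ROA, after which the Lyapunov-based controller takes over and drives it to the origin.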
ISBN:
(Print) 9781479903801
This paper is concerned with a new iterative adaptive dynamic programming (ADP) algorithm to solve optimal control problems for infinite-horizon discrete-time nonlinear systems using a numerical controller. The convergence conditions of the iterative ADP algorithm are developed taking into account the errors introduced by the numerical controller; they show that the iterative performance index functions converge to the greatest lower bound of all performance indices within a finite error bound. Neural networks and a digital computer are used to approximate the iterative performance index function and to compute the numerical iterative control policy, respectively, facilitating the implementation of the iterative ADP algorithm. Finally, a simulation example is given to illustrate the performance of the proposed method.
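The convergence-within-an-error-bound statement can be illustrated on a tiny grid problem (a discount factor is added for contraction, and `eps` stands in for the numerical controller's bounded error; none of this is the paper's neural implementation):

```python
# Value iteration V_{i+1}(x) = min_u [ c(x,u) + gamma * V_i(f(x,u)) ] + eps.
# With a bounded per-step error eps, the iterates settle within a band
# eps / (1 - gamma) of the error-free value function.
states = [-2, -1, 0, 1, 2]
actions = [-1, 0, 1]
gamma, eps = 0.9, 0.001     # discount for contraction; bounded per-step error

def f(x, u):
    return max(-2, min(2, x + u))   # clipped dynamics

def c(x, u):
    return x * x + u * u            # stage cost

V = {x: 0.0 for x in states}
for _ in range(50):
    V = {x: min(c(x, u) + gamma * V[f(x, u)] for u in actions) + eps
         for x in states}
```

At the origin the error-free value is 0, so the computed value sits near eps / (1 - gamma) = 0.01, matching the finite-error-bound picture in the abstract.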
In this paper, the adaptive dynamic programming (ADP) approach is employed to design an optimal controller for unknown discrete-time nonlinear systems with control constraints. A neural network is constructed to identify the unknown dynamical system, with a stability proof. Then, the iterative ADP algorithm is developed to solve the optimal control problem, with a convergence analysis. Two other neural networks are introduced to approximate the cost function and its derivatives and the control law, under the framework of the globalized dual heuristic programming technique. Furthermore, two simulation examples are included to verify the theoretical results. (C) 2012 Elsevier Inc. All rights reserved.
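The identification step can be sketched in miniature, assuming a known scalar model structure fit by least squares rather than the paper's neural identifier (the system, the constraint bound, and all numbers are hypothetical):

```python
import random

# Fit unknown dynamics x_{k+1} = t1*x + t2*x**3 + u from input/output data,
# exciting the system only with controls satisfying the constraint |u| <= 1.
random.seed(3)
t1_true, t2_true = 0.9, -0.1
xs, us, ys = [], [], []
x = 0.5
for _ in range(200):
    u = max(-1.0, min(1.0, random.uniform(-1.5, 1.5)))  # enforce |u| <= 1
    y = t1_true * x + t2_true * x ** 3 + u
    xs.append(x)
    us.append(u)
    ys.append(y)
    x = max(-2.0, min(2.0, y))        # keep the trajectory bounded

# least-squares fit of [t1, t2] on features [x, x**3], target y - u
f2 = [v ** 3 for v in xs]
tgt = [yv - uv for yv, uv in zip(ys, us)]
a11 = sum(v * v for v in xs)
a12 = sum(p * s for p, s in zip(xs, f2))
a22 = sum(v * v for v in f2)
b1 = sum(p * t for p, t in zip(xs, tgt))
b2 = sum(p * t for p, t in zip(f2, tgt))
det = a11 * a22 - a12 * a12
t1_hat = (b1 * a22 - b2 * a12) / det
t2_hat = (a11 * b2 - a12 * b1) / det
```

With the identified model in hand, the iterative ADP step would then proceed as in the previous example, with the action search restricted to the constraint set.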
ISBN:
(Print) 9781424421138
In this paper, an approximate dynamic programming (ADP) based strategy for real-time energy control of parallel hybrid electric vehicles (HEVs) is presented. The aim is to develop a fuel-optimal control that relies not on a priori knowledge of future driving conditions (global optimal control) but only on the current system operation. Approximate dynamic programming is an online learning method that controls the system while simultaneously learning its characteristics in real time. A suboptimal energy control is then obtained with a proper definition of a cost function to be minimized at each time instant. The cost function includes the fuel consumption, emissions, and the deviation of the battery state of charge (SOC). Our approach guarantees an optimization of vehicle performance and an adaptation to driving conditions. Simulation results over standard driving cycles are presented to demonstrate the effectiveness of the proposed stochastic approach. The obtained ADP control algorithm was found to outperform a traditional rule-based control strategy.
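The per-instant minimization can be sketched as a grid search over the engine/battery split (the fuel map, SOC dynamics, and weights below are made up for illustration, not the paper's vehicle model):

```python
# Pick the engine/battery power split minimizing a cost combining fuel use,
# emissions, and deviation of battery state of charge (SOC).
def step_cost(p_engine, soc, soc_target=0.6):
    fuel = 0.08 * p_engine + 0.002 * p_engine ** 2   # made-up fuel-rate map
    emissions = 0.01 * p_engine
    soc_penalty = 50.0 * (soc - soc_target) ** 2
    return fuel + 0.5 * emissions + soc_penalty

def choose_split(p_demand, soc):
    """Grid search over engine power; the battery supplies the remainder."""
    best_u, best_cost = None, float("inf")
    for p_engine in range(0, int(p_demand) + 1, 5):
        p_batt = p_demand - p_engine
        next_soc = soc - 0.001 * p_batt              # crude SOC dynamics
        cost = step_cost(p_engine, next_soc)
        if cost < best_cost:
            best_u, best_cost = p_engine, cost
    return best_u

split = choose_split(p_demand=40, soc=0.6)
```

In the ADP strategy of the abstract, the learned value estimate would be added to this instantaneous cost so that the split also accounts for future driving conditions.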
ISBN:
(Print) 9781479901784
This paper investigates the properties of integral value iteration (I-VI), a reinforcement learning (RL) technique for solving continuous-time (CT) optimal control problems online without using the system drift dynamics. The target I-VI is the one applied to CT linear quadratic regulation problems. As a result, two modes of global monotone convergence of I-VI are presented: one behaves like policy iteration (PI) (the PI-mode of convergence), and the other is named the VI-mode of convergence. All of the other properties (positive definiteness, stability, and the relation between I-VI and integral PI) are presented within these two frameworks. Finally, numerical simulations are carried out to verify and further investigate these properties.
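A discrete-time scalar analogue of the VI-mode monotone convergence, assuming the standard Riccati value-iteration recursion rather than the paper's continuous-time integral formulation:

```python
# Scalar Riccati value iteration P_{k+1} = Q + a^2*P - (a*b*P)^2 / (R + b^2*P).
# Starting from P_0 = 0, the iterates increase monotonically to the
# stabilizing fixed point, even though a = 1.2 makes the open loop unstable.
a, b, Q, R = 1.2, 1.0, 1.0, 1.0
P, history = 0.0, []
for _ in range(60):
    history.append(P)
    P = Q + a * a * P - (a * b * P) ** 2 / (R + b * b * P)

monotone = all(p1 <= p2 + 1e-12 for p1, p2 in zip(history, history[1:]))
```

For these values the fixed point solves P**2 - 1.44*P - 1 = 0, and the monotone climb from zero is the discrete-time counterpart of the VI-mode convergence described in the abstract.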
ISBN:
(Print) 9781479936878
This paper contributes a unified formulation that merges previous analyses of the prediction of the performance (value function) of a certain sequence of actions (policy) when an agent operates a Markov decision process with a large state space. When the states are represented by features and the value function is linearly approximated, our analysis reveals a new relationship between two common cost functions used to obtain the optimal approximation. In addition, this analysis allows us to propose an efficient adaptive algorithm that provides an unbiased linear estimate. The performance of the proposed algorithm is illustrated by simulation, showing competitive results when compared with state-of-the-art solutions.
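A generic sketch of linear value-function approximation, using plain TD(0) on a small random walk rather than the paper's proposed estimator (the features, chain, and step size are illustrative):

```python
import random

# TD(0) policy evaluation with a linear approximation V(s) = w . phi(s) on a
# 5-state random walk (states 0..4; 0 and 4 terminal, reward 1 at state 4).
random.seed(1)

def phi(s):
    return [1.0, s / 4.0]               # two hand-picked features

w = [0.0, 0.0]
gamma, alpha = 1.0, 0.02
for _ in range(5000):
    s = 2                               # every episode starts in the middle
    while s not in (0, 4):
        s_next = s + random.choice((-1, 1))
        reward = 1.0 if s_next == 4 else 0.0
        v = sum(wi * fi for wi, fi in zip(w, phi(s)))
        v_next = 0.0 if s_next in (0, 4) else sum(
            wi * fi for wi, fi in zip(w, phi(s_next)))
        delta = reward + gamma * v_next - v     # TD error
        w = [wi + alpha * delta * fi for wi, fi in zip(w, phi(s))]
        s = s_next

v_mid = w[0] + w[1] * 0.5               # estimated value of the middle state
```

The true values s/4 lie in the span of these features, so the weight vector converges near the exact solution; the two cost functions related in the paper (projected Bellman error versus prediction error) both target this linear fit.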
ISBN:
(Print) 9788993215052
In this paper, we propose an online adaptive neural algorithm to solve continuous-time (CT) nonlinear optimal control problems. In contrast to the existing methods, which adopt an architecture with two neural networks (NNs) for actor-critic implementations, only one NN, the critic, is used to implement the algorithm, simplifying the structure of the computational model. Moreover, we provide a generalized learning rule for updating the NN weights that covers the existing critic update rules as special cases. Theoretical and numerical results are given under the required persistent excitation condition to verify and analyze the stability and performance of the proposed method.
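The single-critic structure can be illustrated on a scalar linear-quadratic problem, where the control is recovered directly from the critic gradient so no separate actor network is needed (this assumed LQR setting is not the paper's general nonlinear design):

```python
import math

# Scalar LQR sketch: critic V(x) = w * x**2, actor derived from its gradient.
a, b, q, R = 1.0, 1.0, 1.0, 1.0
# w solves the scalar HJB condition 2*a*w - (b**2) * w**2 / R + q = 0
w = R * (a + math.sqrt(a * a + b * b * q / R)) / (b * b)

def u(x):
    return -(b / (2.0 * R)) * (2.0 * w * x)   # u = -(b / (2R)) dV/dx

# Euler simulation of dx/dt = a*x + b*u(x) from x(0) = 1
x, dt = 1.0, 0.01
for _ in range(1000):
    x += dt * (a * x + b * u(x))
```

Because the optimal control is an explicit function of the critic gradient, training a single critic weight suffices; the paper's online algorithm learns w adaptively under persistent excitation instead of solving the HJB condition in closed form as done here.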
We develop a family of rollout policies based on fixed routes to obtain dynamic solutions to the vehicle routing problem with stochastic demand and duration limits (VRPSDL). In addition to a traditional one-step rollout policy, we leverage the notions of the pre- and post-decision state to distinguish two additional rollout variants. We tailor our rollout policies by developing a dynamic decomposition scheme that achieves high-quality solutions to large problem instances with reasonable computational effort. Computational experiments demonstrate that our rollout policies improve upon the performance of a rolling-horizon procedure and commonly employed fixed-route policies, with the improvement over the latter being more substantial.
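A toy one-step rollout in the spirit of the fixed-route base policy (the instance, capacity, and demand distribution below are hypothetical and far smaller than the VRPSDL instances in the paper):

```python
import random

# One vehicle with capacity 10 serves three customers with random demands;
# the base policy follows a fixed route, restocking at the depot (site 0)
# whenever a demand exceeds the remaining load. The rollout policy scores
# each candidate first visit by Monte Carlo simulation of the base policy.
random.seed(2)
locs = {0: (0, 0), 1: (0, 4), 2: (3, 0), 3: (3, 4)}  # depot + 3 customers
CAP = 10

def dist(i, j):
    (x1, y1), (x2, y2) = locs[i], locs[j]
    return abs(x1 - x2) + abs(y1 - y2)

def run_base(route):
    """Travel cost of the fixed-route base policy with random demands."""
    pos, load, cost = 0, CAP, 0.0
    for c in route:
        demand = random.randint(1, 6)
        if demand > load:                 # restock at the depot first
            cost += dist(pos, 0)
            pos, load = 0, CAP
        cost += dist(pos, c)
        pos, load = c, load - demand
    return cost + dist(pos, 0)            # finish at the depot

def rollout_first_move(customers, samples=200):
    """One-step rollout: pick the first visit with the best simulated cost."""
    best, best_cost = None, float("inf")
    for first in customers:
        rest = [c for c in customers if c != first]
        est = sum(run_base([first] + rest) for _ in range(samples)) / samples
        if est < best_cost:
            best, best_cost = first, est
    return best

move = rollout_first_move([1, 2, 3])
```

Repeating this decision after every realized demand yields a dynamic policy; the paper's pre-/post-decision-state variants differ in whether the candidate action is evaluated before or after the demand at the current customer is observed.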