In this paper we study optimal control problems in Wasserstein spaces, which are suitable to describe macroscopic dynamics of multi-particle systems. The dynamics is described by a parametrized continuity equation, in which the Eulerian velocity field is affine w.r.t. some variables. Our aim is to minimize a cost functional which includes a control norm, thus enforcing a control sparsity constraint. More precisely, we consider a nonlocal restriction on the total amount of control that can be used depending on the overall state of the evolving mass. We treat in detail two main cases: an instantaneous constraint on the control applied to the evolving mass and a cumulative constraint, which depends also on the amount of control used at previous times. For both constraints, we prove the existence of optimal trajectories for general cost functions and show that the value function is a viscosity solution of a suitable Hamilton-Jacobi-Bellman equation. Finally, we discuss an abstract dynamic programming principle, providing further applications in the Appendix. (C) 2019 Elsevier Inc. All rights reserved.
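For orientation, here is a minimal schematic of the setup described above, in notation of our own choosing rather than the paper's: a control-affine continuity equation with a nonlocal bound on the total control effort.

```latex
% Controlled continuity equation on the Wasserstein space (schematic);
% the velocity field is affine in the control u.
\partial_t \mu_t + \nabla \cdot \big( v[\mu_t, u_t]\, \mu_t \big) = 0, \qquad
v[\mu, u](x) = v_0(x, \mu) + \sum_{i=1}^{m} u_i(t, x)\, v_i(x, \mu).

% Instantaneous (nonlocal) constraint: the control effort applied at time t
% is bounded by a quantity depending on the current state of the mass.
\int |u(t, x)| \, d\mu_t(x) \le C(\mu_t) \quad \text{for a.e. } t.

% Cumulative variant: the budget also accounts for control already spent.
\int_0^t \!\! \int |u(s, x)| \, d\mu_s(x) \, ds \le C \quad \text{for all } t \in [0, T].
```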
This paper studies the dynamic programming principle using the measurable selection method for stochastic control of continuous processes. The novelty of this work is to incorporate intermediate expectation constraints on the canonical space at each time t. Motivated by some financial applications, we show that several types of dynamic trading constraints can be reformulated into expectation constraints on paths of controlled state processes. Our results can therefore be employed to recover the dynamic programming principle for these optimal investment problems under dynamic constraints, possibly path-dependent, in a non-Markovian framework.
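Schematically (in our notation, not necessarily the paper's), the constrained value function takes the following form, with the expectation constraint imposed at every intermediate time:

```latex
% Value function with intermediate expectation constraints (schematic):
V(t, x) = \sup_{\nu \in \mathcal{A}(t,x)} \mathbb{E}\big[ \xi\big(X^{t,x,\nu}\big) \big],
\qquad
\mathcal{A}(t,x) = \Big\{ \nu :
  \mathbb{E}\big[ g_s\big(X^{t,x,\nu}_{\cdot \wedge s}\big) \big] \in \Gamma_s
  \ \text{for all } s \in [t, T] \Big\}.
% A dynamic trading constraint is recovered by choosing g_s and \Gamma_s so
% that the constraint bounds a functional of the controlled path up to time s.
```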
In this paper, we investigate infinite horizon optimal control problems for parametrized partial differential equations. We are interested in feedback control via dynamic programming equations, an approach well known to suffer from the curse of dimensionality. Thus, we apply parametric model order reduction techniques to construct low-dimensional subspaces carrying suitable information on the control problem, on which the dynamic programming equations can be approximated. To guarantee a low number of basis functions, we combine recent basis generation methods and parameter partitioning techniques. Furthermore, we present a novel technique, based on statistical information, to construct non-uniform grids in the reduced domain. Finally, we discuss numerical examples to illustrate the effectiveness of the proposed methods for PDEs in two space dimensions.
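As a toy illustration of the dynamic programming step that such reduced subspaces make tractable, here is a semi-Lagrangian value iteration in one reduced coordinate. Everything here (the dynamics f, cost ell, grid, discount rate) is a hypothetical stand-in, not the paper's setup; the paper additionally uses parameter partitioning and statistically constructed non-uniform grids.

```python
import numpy as np

# Toy stand-ins for the reduced problem data: a scalar reduced coordinate y,
# dynamics f, running cost ell, discount rate lam. All hypothetical; the
# paper works with reduced-basis coordinates of a parametrized PDE.
f = lambda y, u: -y + u
ell = lambda y, u: y**2 + 0.1 * u**2

lam, dt = 1.0, 0.1                  # discount rate, semi-Lagrangian step
ys = np.linspace(-2.0, 2.0, 201)    # uniform grid (the paper builds non-uniform ones)
us = np.linspace(-1.0, 1.0, 21)     # discretized control set

V = np.zeros_like(ys)
for _ in range(500):
    # Bellman operator: for each control, step along the flow, interpolate V
    # at the foot point, add the discounted running cost, minimize over u.
    Q = np.stack([dt * ell(ys, u)
                  + np.exp(-lam * dt) * np.interp(ys + dt * f(ys, u), ys, V)
                  for u in us])
    V_new = Q.min(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-9:
        break
    V = V_new
# V now approximates the infinite-horizon value function on the reduced grid.
```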
In this work we study the stochastic recursive control problem in which the aggregator (or generator) of the backward stochastic differential equation describing the running cost is continuous, but not necessarily Lipschitz, with respect to the first unknown variable and the control, and is monotonic with respect to the first unknown variable. The dynamic programming principle and the connection between the value function and the viscosity solution of the associated Hamilton-Jacobi-Bellman equation are established in this setting via the generalized comparison theorem for backward stochastic differential equations and the stability of viscosity solutions. Finally, we take the control problem of continuous-time Epstein-Zin utility with a non-Lipschitz aggregator as an example to demonstrate the application of our study.
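Schematically, the recursive cost is the first component of a BSDE solution, and a standard example of a continuous, monotone, non-Lipschitz aggregator is the Epstein-Zin one; one common parametrization is written below (our notation, not necessarily the paper's):

```latex
% Recursive cost (schematic): for each admissible control u,
Y^{u}_t = \Phi(X^{u}_T) + \int_t^T f\big(s, X^{u}_s, Y^{u}_s, Z^{u}_s, u_s\big)\, ds
          - \int_t^T Z^{u}_s \, dW_s, \qquad
V(t, x) = \operatorname*{ess\,sup}_{u} Y^{u}_t.

% Epstein--Zin aggregator (one common parametrization; risk aversion \gamma,
% elasticity of intertemporal substitution \psi): continuous and monotone
% in v, but not Lipschitz near v = 0.
f(c, v) = \frac{\delta (1-\gamma)\, v}{1 - 1/\psi}
          \left[ \frac{c^{\,1 - 1/\psi}}{\big((1-\gamma) v\big)^{\frac{1-1/\psi}{1-\gamma}}} - 1 \right].
```

In the special case gamma = 1/psi this collapses to the additive CRRA aggregator f(c, v) = delta c^(1-gamma)/(1-gamma) - delta v, which is Lipschitz; the recursive (non-Lipschitz) case is the one treated above.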
Within the framework of viscosity solutions, we study the relationship between the maximum principle (MP) from M. Hu, S. Ji and X. Xue [SIAM J. Control Optim. 56 (2018) 4309-4335] and the dynamic programming principle (DPP) from M. Hu, S. Ji and X. Xue [SIAM J. Control Optim. 57 (2019) 3911-3938] for a fully coupled forward-backward stochastic controlled system (FBSCS) with a nonconvex control domain. For a fully coupled FBSCS, both the corresponding MP and the corresponding Hamilton-Jacobi-Bellman (HJB) equation are coupled with an algebraic equation. With the help of a new decoupling technique, we obtain the desired estimates for the fully coupled forward-backward variational equations and establish the relationship. Furthermore, for the smooth case, we establish the connection between the derivatives of the solution to the algebraic equation and some terms in the first-order and second-order adjoint equations. Finally, we study the local case under the monotonicity conditions of J. Li and Q. Wei [SIAM J. Control Optim. 52 (2014) 1622-1662] and Z. Wu [Syst. Sci. Math. Sci. 11 (1998) 249-259], and obtain the relationship between the MP from Z. Wu [Syst. Sci. Math. Sci. 11 (1998) 249-259] and the DPP from J. Li and Q. Wei [SIAM J. Control Optim. 52 (2014) 1622-1662].
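For orientation, the classical (uncoupled, diffusion) form of the MP-DPP relationship that results of this type generalize reads, schematically and in one common sign convention for a minimization problem with smooth value function V along the optimal pair (X-bar, u-bar):

```latex
% Classical smooth-case MP--DPP relation (schematic; signs follow one
% common convention, e.g. that of Yong and Zhou):
V_x\big(t, \bar{X}_t\big) = -\,p_t, \qquad
V_{xx}\big(t, \bar{X}_t\big) \le -\,P_t,
% where (p, P) are the first- and second-order adjoint processes.
```

In the fully coupled FBSCS setting of the paper, both sides additionally involve the algebraic equation mentioned above, and the analogous relations carry derivatives of its solution.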
In this paper, we study a stochastic recursive optimal control problem in which the cost functional is described by the solution of a backward stochastic differential equation driven by G-Brownian motion. Under standard assumptions, we establish the dynamic programming principle and the related fully nonlinear HJB equation in the framework of G-expectation. Finally, we show that the value function is the viscosity solution of the obtained HJB equation. (C) 2016 Elsevier B.V. All rights reserved.
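Schematically (one-dimensional state, our notation), the fully nonlinear HJB equation in the G-expectation framework replaces the Laplacian term with the sublinear function G generated by the volatility uncertainty interval; the exact placement of the generator f relative to G depends on the formulation:

```latex
% HJB under G-expectation (schematic, one-dimensional state):
\partial_t V + \sup_{u \in U} \Big\{ b(x, u)\, \partial_x V
   + G\big( \sigma(x, u)^2 \, \partial_{xx} V \big)
   + f\big(x, V, \sigma(x, u)\, \partial_x V, u\big) \Big\} = 0,
\qquad V(T, x) = \Phi(x),

% where G is the sublinear generator of the G-Brownian motion:
G(a) = \tfrac{1}{2} \big( \bar{\sigma}^2\, a^+ - \underline{\sigma}^2\, a^- \big).
```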
We analyze a stochastic optimal control problem, where the state process follows a McKean-Vlasov dynamics and the diffusion coefficient can be degenerate. We prove that its value function V admits a nonlinear Feynman-Kac representation in terms of a class of forward-backward stochastic differential equations with an autonomous forward process. We exploit this probabilistic representation to rigorously prove the dynamic programming principle (DPP) for V. The Feynman-Kac representation we obtain plays a role beyond its intermediate use in proving our main result: it should also be useful for developing probabilistic numerical schemes for V. The DPP is important for characterizing the value function as a solution of a nonlinear partial differential equation (the so-called Hamilton-Jacobi-Bellman equation), in this case on the Wasserstein space of measures. We note that the usual way of solving these equations is through the Pontryagin maximum principle, which requires some convexity assumptions. There have been earlier attempts to use the dynamic programming approach, but those works assumed a priori that the controls were of Markovian feedback type, which allows one to write the problem purely in terms of the distribution of the state process (so that the control problem becomes deterministic). In this paper, we consider open-loop controls and derive the dynamic programming principle in this most general case. To obtain the Feynman-Kac representation and the randomized dynamic programming principle, we implement the so-called randomization method, which consists of formulating a new McKean-Vlasov control problem, expressed in weak form by taking the supremum over a family of equivalent probability measures. One of the main results of the paper is the proof that this latter control problem has the same value function V as the original control problem.
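In our notation (schematic, not necessarily the paper's), the controlled McKean-Vlasov dynamics, the lifted value function, and the DPP it satisfies read:

```latex
% Controlled McKean--Vlasov dynamics with open-loop control \alpha:
dX_s = b\big(X_s, \mathcal{L}(X_s), \alpha_s\big)\, ds
     + \sigma\big(X_s, \mathcal{L}(X_s), \alpha_s\big)\, dW_s, \qquad \mathcal{L}(X_t) = \mu.

% Value function on the Wasserstein space and its DPP:
V(t, \mu) = \inf_{\alpha} \mathbb{E}\Big[ \int_t^T f\big(X_s, \mathcal{L}(X_s), \alpha_s\big)\, ds
          + g\big(X_T, \mathcal{L}(X_T)\big) \Big],
\qquad
V(t, \mu) = \inf_{\alpha} \Big\{ \mathbb{E}\Big[ \int_t^{\theta} f\, ds \Big]
          + V\big(\theta, \mathcal{L}(X_\theta)\big) \Big\}.
```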
We study the dynamic programming principle (DPP for short) on manifolds, obtain the Hamilton-Jacobi-Bellman (HJB for short) equation, and prove that the value function is the unique viscosity solution to the HJB equation. Then, we investigate the relation between the DPP and Pontryagin's maximum principle (PMP for short), from which we obtain the PMP on manifolds. (C) 2015 Elsevier Inc. All rights reserved.
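Schematically (our notation), the HJB equation keeps its Euclidean form on a manifold M, with the spatial gradient replaced by the differential of V, which lives in the cotangent space:

```latex
% HJB on a manifold M (schematic): for \dot{x} = f(x, u), x \in M, u \in U,
% running cost L and terminal cost g:
-\partial_t V(t, x) + H\big(x, d_x V(t, x)\big) = 0, \qquad V(T, x) = g(x),

% with Hamiltonian defined on the cotangent bundle T^*M:
H(x, p) = \sup_{u \in U} \big\{ -\langle p, f(x, u) \rangle - L(x, u) \big\},
\qquad p \in T_x^* M.
```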
The dynamic programming principle (DPP) is fundamental for control and optimization, including Markov decision problems (MDPs), reinforcement learning (RL), and, more recently, mean-field controls (MFCs). However, in the learning framework of MFCs, the DPP has not been rigorously established, despite its critical importance for algorithm design. In this paper, we first present a simple example of MFCs with learning where the DPP fails with a misspecified Q function, and then propose the correct form of the Q function in an appropriate space for MFCs with learning. This particular form of the Q function differs from the classical one and is called the IQ function. In the special case when the transition probability and the reward are independent of the mean-field information, it integrates the classical Q function for single-agent RL over the state-action distribution. In other words, MFCs with learning can be viewed as lifting classical RL by replacing the state-action space with its probability distribution space. This identification of the IQ function enables us to establish precisely the DPP in the learning framework of MFCs. Finally, we illustrate through numerical experiments the time consistency of this IQ function.
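A finite-space numerical sketch of the decoupled special case mentioned above (transition and reward independent of the mean field), where the IQ function is the integral of the classical Q function against the state-action distribution. All data here are randomly generated stand-ins, not the paper's examples:

```python
import numpy as np

# Finite-space sketch of the decoupled case: transition P and reward R do
# not depend on the mean field. All data are random stand-ins.
rng = np.random.default_rng(0)
nS, nA, gamma = 4, 3, 0.9
P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a] = law of next state
R = rng.random((nS, nA))                        # reward r(s, a)

# Classical single-agent Q function via value iteration.
Q = np.zeros((nS, nA))
for _ in range(2000):
    Q = R + gamma * P @ Q.max(axis=1)

# IQ integrates the classical Q against a state-action distribution nu.
nu = rng.dirichlet(np.ones(nS * nA)).reshape(nS, nA)
IQ = (nu * Q).sum()

# DPP check: IQ(nu) = <nu, R> + gamma * sup over admissible next-step
# state-action distributions nu' of IQ(nu'); with the next-state marginal
# mu' fixed, the sup is attained by acting greedily, giving <mu', max_a Q>.
mu_next = np.einsum('sa,sat->t', nu, P)         # pushed-forward state law
dpp_rhs = (nu * R).sum() + gamma * (mu_next * Q.max(axis=1)).sum()
assert np.isclose(IQ, dpp_rhs)
```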
This paper is concerned with the relationship between the maximum principle and the dynamic programming principle for stochastic recursive optimal control problems of jump diffusions. Under the assumption that the value function is smooth, relations among the adjoint processes, the generalized Hamiltonian function, and the value function are given. A linear-quadratic recursive utility portfolio optimization problem in the financial market is discussed to illustrate the application of the main result. Copyright (c) 2012 John Wiley & Sons, Ltd.
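For orientation, smooth-case relations of this type have, schematically, the following classical shape (our notation, for a minimization problem; in the recursive jump-diffusion setting the generalized Hamiltonian carries extra terms from the BSDE generator and an integral against the Lévy measure):

```latex
% Smooth-case MP--DPP relation (schematic): along the optimal pair
% (\bar{X}, \bar{u}), with \mathcal{G} the generalized Hamiltonian,
-\partial_t V\big(t, \bar{X}_t\big)
  = \mathcal{G}\big(t, \bar{X}_t, \bar{u}_t, -V_x(t, \bar{X}_t), -V_{xx}(t, \bar{X}_t)\big)
  = \min_{u \in U}
    \mathcal{G}\big(t, \bar{X}_t, u, -V_x(t, \bar{X}_t), -V_{xx}(t, \bar{X}_t)\big),
% and the adjoint processes are identified with derivatives of V along \bar{X}.
```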