We introduce a general framework for Markov decision problems under model uncertainty in a discrete-time infinite horizon setting. By providing a dynamic programming principle, we obtain a local-to-global paradigm: solving a local, that is, a one-time-step robust optimization problem yields an optimizer of the global (i.e., infinite time-step) robust stochastic optimal control problem, as well as a corresponding worst-case measure. Moreover, we apply this framework to portfolio optimization involving data of the S&P 500. We present two different types of ambiguity sets; one is fully data-driven, given by a Wasserstein ball around the empirical measure, and the second is described by a parametric set of multivariate normal distributions, where the corresponding uncertainty sets of the parameters are estimated from the data. It turns out that in scenarios where the market is volatile or bearish, the optimal portfolio strategies from the corresponding robust optimization problem outperform the ones without model uncertainty, showcasing the importance of taking model uncertainty into account.
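The local (one-time-step) robust problem over a Wasserstein ambiguity set can be sketched numerically. The following is a minimal illustration, not the paper's method: synthetic returns stand in for S&P 500 data, a CARA utility is an assumed choice, and the inner worst case over a Wasserstein-1 ball around the empirical measure is approximated by letting an adversary perturb the sample points subject to an average-displacement budget.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
returns = rng.normal(0.05, 0.2, size=50)   # synthetic one-period returns (stand-in for real data)
eps = 0.02                                  # Wasserstein-1 radius (assumed)

def utility(x, gamma=2.0):                  # CARA utility, an assumed choice
    return -np.exp(-gamma * x) / gamma

def worst_case_utility(w):
    """Inner robust problem: the adversary shifts each sample r_i by d_i,
    with mean |d_i| <= eps (a Wasserstein-1 transport budget), so as to
    minimise the empirical expected utility of the payoff w * r."""
    n = len(returns)
    obj = lambda d: np.mean(utility(w * (returns + d)))
    cons = {"type": "ineq", "fun": lambda d: eps - np.mean(np.abs(d))}
    res = minimize(obj, np.zeros(n), constraints=cons, method="SLSQP")
    return res.fun

# Outer problem: choose the fraction w invested in the risky asset that
# maximises the worst-case expected utility (grid search for transparency).
grid = np.linspace(0.0, 1.0, 21)
w_star = max(grid, key=worst_case_utility)
```

By construction the robust (worst-case) expected utility at any weight is no larger than its nominal counterpart; the gap widens with the radius eps, which is how the ambiguity set's size expresses the degree of model uncertainty.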
In this survey work, we introduce stochastic differential delay equations and their impact on stochastic optimal control problems. We observe time delay in the dynamics of a state process, which may correspond to inertia or memory in a financial system. For such systems, we demonstrate two special approaches to handling delayed control problems by applying the dynamic programming principle. Moreover, we clarify the technical challenges arising from the conflict between the path-dependent, infinite-dimensional nature of the problem and the necessity of the Markov property. Furthermore, we present two different deep learning algorithms to solve targeted delayed control tasks and illustrate the results for a complete-memory portfolio optimization problem.
We consider a deterministic optimal control problem, focusing on a finite horizon scenario. Our proposal involves employing deep neural network approximations to capture Bellman's dynamic programming principle. This also corresponds to solving first-order Hamilton-Jacobi-Bellman (HJB) equations. Our work builds upon the research conducted by Huré et al. (SIAM J Numer Anal 59(1):525-557, 2021), which primarily focused on stochastic contexts. However, our objective is to develop a completely novel approach specifically designed to address error propagation in the absence of diffusion in the dynamics of the system. Our analysis provides precise error estimates in terms of an average norm. Furthermore, we provide several academic numerical examples pertaining to front propagation models with obstacle constraints, demonstrating the effectiveness of our approach for systems with moderate dimensions (e.g., ranging from 2 to 8) and for nonsmooth value functions.
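The backward scheme of approximating the value function step by step can be illustrated on a toy linear-quadratic problem. This is a schematic sketch, not the paper's algorithm: quadratic regression features stand in for a deep neural network, and the dynamics, costs, and grids are all assumed for illustration.

```python
import numpy as np

# Toy deterministic control: dynamics x' = u, running cost x^2 + u^2,
# terminal cost x^2 (all assumed for illustration; for these choices the
# continuous-time value function happens to be x^2 at every time).
dt, N = 0.1, 10
xs = np.linspace(-2.0, 2.0, 200)   # training states
us = np.linspace(-3.0, 3.0, 61)    # control grid

def features(x):                    # quadratic features standing in for a deep network
    return np.stack([np.ones_like(x), x, x**2], axis=-1)

coefs = [None] * (N + 1)
coefs[N] = np.linalg.lstsq(features(xs), xs**2, rcond=None)[0]   # fit terminal cost

def V(n, x):                        # fitted value function at time step n
    return features(np.asarray(x, dtype=float)) @ coefs[n]

for n in range(N - 1, -1, -1):
    # Bellman targets: minimise running cost plus approximate cost-to-go
    q = (xs[:, None]**2 + us[None, :]**2) * dt + V(n + 1, xs[:, None] + us[None, :] * dt)
    coefs[n] = np.linalg.lstsq(features(xs), q.min(axis=1), rcond=None)[0]
```

Because there is no diffusion, the regression error at step n+1 feeds directly into the targets at step n; the paper's concern with error propagation is exactly the accumulation of this effect over the horizon.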
Nonzero-sum games typically have multiple Nash equilibria (or no equilibrium), and unlike the zero-sum case, they may have different values at different equilibria. Instead of focusing on the existence of individual equilibria, we study the set of values over all equilibria, which we call the set value of the game. The set value is unique by nature and always exists (possibly equal to the empty set). Similar to the standard value function in the control literature, it enjoys many nice properties, such as regularity, stability, and, more importantly, the dynamic programming principle. There are two main features required to obtain the dynamic programming principle: (i) we must use closed-loop controls (instead of open-loop controls); and (ii) we must allow for path-dependent controls, even if the problem is in a state-dependent (Markovian) setting. We consider both discrete- and continuous-time models with finite time horizon. For the latter, we also provide a duality approach through certain standard PDEs (or path-dependent PDEs), which is quite efficient for numerically computing the set value of the game.
In this paper, we consider an infinite time horizon risk-sensitive optimal stopping problem for a Feller-Markov process with an unbounded terminal cost function. We show that in the unbounded case the associated Bellman equation may have multiple solutions, and we give a probabilistic interpretation for the minimal and the maximal one. We also show how to approximate them using finite time horizon problems. The analysis, covering both the discrete and continuous time cases, is supported with illustrative examples.
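The finite-horizon approximation idea can be mimicked in a toy setting. The sketch below is an assumption-laden illustration, not the paper's construction: the Feller-Markov process is replaced by a three-state chain with bounded costs, and a risk-sensitive Bellman operator (an exponential certainty equivalent of the continuation value) is iterated starting from the terminal cost.

```python
import numpy as np

# Toy risk-sensitive stopping problem on a 3-state chain (all data assumed):
# at each step either stop and pay G(x), or pay c(x) and move according to P.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.5, 0.5]])
G = np.array([2.0, 1.0, 3.0])   # terminal (stopping) cost
c = np.array([0.1, 0.3, 0.2])   # running cost
gamma = 0.5                     # risk-sensitivity parameter

def bellman(v):
    # continuation value as an exponential certainty equivalent
    cont = c + np.log(P @ np.exp(gamma * v)) / gamma
    return np.minimum(G, cont)

# finite-horizon values: n-fold application of the operator to the terminal cost
v = G.astype(float)
for _ in range(200):
    v = bellman(v)
```

In this bounded toy example the iterates converge to a single fixed point; the multiplicity of solutions discussed in the paper is a phenomenon of the unbounded terminal-cost case.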
We consider a two-player zero-sum game in a bounded open domain $\Omega$ described as follows: at a point $x \in \Omega$, Players I and II play an $\varepsilon$-step tug-of-war game with probability $\alpha$, and with probability $\beta$ ($\alpha + \beta = 1$), a random point in the ball of radius $\varepsilon$ centered at $x$ is chosen. Once the game position reaches the boundary, Player II pays Player I the amount given by a fixed payoff function $F$. We give a detailed proof of the fact that the value functions of this game satisfy the dynamic programming principle
$$u(x) = \frac{\alpha}{2}\Big\{\sup_{y \in \overline{B}_\varepsilon(x)} u(y) + \inf_{y \in \overline{B}_\varepsilon(x)} u(y)\Big\} + \beta \fint_{B_\varepsilon(x)} u(y)\,dy$$
for $x \in \Omega$, with $u(y) = F(y)$ when $y \notin \Omega$. This principle implies the existence of quasioptimal Markovian strategies.
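On a one-dimensional grid the dynamic programming principle can be iterated directly toward a fixed point. The example below is a hedged sketch with assumed data (Omega = (0,1), payoff F(y) = y^2, alpha = beta = 1/2): the sup/inf over the closed ball and the average over the ball are both approximated by the same discrete window of grid points.

```python
import numpy as np

eps, h = 0.1, 0.01                    # game step size and grid spacing (assumed)
alpha, beta = 0.5, 0.5
k = int(round(eps / h))               # epsilon measured in grid points

# grid covering Omega = (0,1) plus an epsilon-collar where the payoff applies
xg = np.arange(-eps, 1.0 + eps + h / 2, h)
inside = (xg > 0.0) & (xg < 1.0)

def F(y):                             # payoff function (assumed)
    return y**2

u = F(xg)                             # outside Omega the value equals the payoff
for _ in range(2000):                 # iterate the DPP toward a fixed point
    new = u.copy()
    for i in np.where(inside)[0]:
        win = u[i - k : i + k + 1]    # discrete stand-in for the epsilon-ball
        new[i] = alpha / 2 * (win.max() + win.min()) + beta * win.mean()
    u = new
```

Each update is a convex combination of nearby values (the weights alpha/2, alpha/2, beta sum to one), so the iterates stay between the extremes of the boundary data, mirroring the comparison structure behind the existence of quasioptimal strategies.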
The assertions of Proposition 3.7 in our paper "The robust superreplication problem: A dynamic approach" [L. Carassus, J. Obłój, and J. Wiesel, SIAM J. Financial Math., 10 (2019), pp. 907-941] may fail to hold without an additional assumption, which we detail in this erratum.
The study of epidemics using mathematical modelling is critical to understanding their dynamics and proposing potential control measures. We propose a generalised epidemiological model corresponding to a pandemic, whose dynamics are represented as a novel hybrid system obtained by coupling a deterministic model with a stochastic model. The hybrid system dynamics are established on individualistic (macroscopic) and intraindividualistic (microscopic) scales. The established hybrid system is then taken as the basis for an optimal control problem, with the rate of vaccination and the velocity of the spatial dynamics as the control parameters affecting the system's trajectory. We define a cost functional constituted by the continuous cost corresponding to the deterministic model and discrete costs corresponding to the transitions on the microscopic scale. The objective of the control problem is to find an optimal control pair of vaccination rate and spatial velocity which minimises the cost functional. We use the dynamic programming principle (DPP) as the optimisation technique, followed by verification of the value function obtained by DPP as a viscosity solution of the appropriate Hamilton-Jacobi-Bellman equation, to analyse the existence of an optimal control pair for the hybrid system. We prove the existence of optimal controls for the multi-scale dynamics for pandemic modelling, along with an abstract method to synthesise them.
In this paper, we study a class of stochastic recursive optimal control problems for systems described by stochastic differential equations with delay (SDDEs). In our framework, not only the dynamics of the system but also the recursive utility depend on the past path segment of the state process in a general form. We give the dynamic programming principle for this kind of optimal control problem and show that the value function is the viscosity solution of the corresponding infinite-dimensional Hamilton-Jacobi-Bellman partial differential equation.
In this paper, we consider an insurance company that is active in multiple dependent lines. We assume that the risk process in each line is a Cramér-Lundberg process. We use a common shock dependency structure to account for the possibility of simultaneous claims in different lines. According to a vector of reinsurance strategies, the insurer transfers some part of its risk to a reinsurance company. Our goal is to maximize the objective function (the expected discounted surplus level integrated over time) using a dynamic programming method. The optimal objective function (the value function) is characterized as the unique solution of the corresponding Hamilton-Jacobi-Bellman equation with suitable boundary conditions. Moreover, an algorithm is proposed to numerically obtain the optimal value of the objective function together with the corresponding optimal reinsurance strategies.