We propose a comprehensive framework for policy gradient methods tailored to continuous-time reinforcement learning. It is based on the connection between stochastic control problems and their randomised counterparts, enabling applications across various classes of Markovian continuous-time control problems beyond diffusion models, including, e.g., regular control, impulse control, and optimal stopping/switching problems. By utilizing a change of measure in the control randomisation technique, we derive a new policy gradient representation for these randomised problems, featuring parametrised intensity policies. We further develop actor-critic algorithms specifically designed to address general Markovian stochastic control problems. Our framework is demonstrated through its application to optimal switching problems, with two numerical case studies in the energy sector focusing on real options.
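The core idea of control randomisation with a parametrised intensity policy can be illustrated in a toy setting. The sketch below is not the paper's algorithm: it is a minimal score-function (likelihood-ratio) gradient estimate for a two-mode switching problem in which the operating mode flips at the jump times of a point process whose intensity `lambda_theta(x) = exp(theta0 + theta1 * x)`, the dynamics, reward, and intensity form are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, T=1.0, dt=0.01):
    """One path of a toy randomised two-mode switching problem.

    The mode flips at the jumps of a point process with state-dependent
    intensity lambda_theta(x) = exp(theta[0] + theta[1] * x).  Returns the
    pathwise reward and the score, i.e. the gradient of the path
    log-likelihood with respect to theta, accumulated step by step.
    """
    x, mode = 0.0, 0
    reward = 0.0
    score = np.zeros(2)
    for _ in range(int(T / dt)):
        lam = np.exp(theta[0] + theta[1] * x)
        dlog = np.array([1.0, x])            # grad of log(lambda) w.r.t. theta
        if rng.random() < lam * dt:          # jump: switch the operating mode
            mode = 1 - mode
            score += dlog                    # d/dtheta log(lam * dt)
        else:                                # no jump on this time step
            score += -lam * dt / (1.0 - lam * dt) * dlog
        drift = 1.0 if mode == 1 else -1.0   # mode-dependent drift (toy choice)
        x += drift * dt + 0.1 * np.sqrt(dt) * rng.standard_normal()
        reward += -x * x * dt                # running reward (toy choice)
    return reward, score

def policy_gradient(theta, n_paths=500):
    """Score-function (REINFORCE-style) estimate of grad_theta E[reward]."""
    out = np.zeros(2)
    for _ in range(n_paths):
        r, s = simulate(theta)
        out += r * s
    return out / n_paths

grad = policy_gradient(np.array([0.0, 0.0]))
```

The estimator is unbiased for the discretised process; variance-reduction (e.g. a critic baseline, as in the actor-critic algorithms described above) is what makes such estimates practical.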
In this paper, we introduce a non-linear Snell envelope which at each time represents the maximal value that can be achieved by stopping a BSDE with constrained jumps. We establish the existence of the Snell envelope by employing a penalization technique; the primary challenge we encounter is demonstrating the regularity of the limit of the scheme. Additionally, we relate the Snell envelope to a finite-horizon, zero-sum stochastic differential game, where one player controls a path-dependent stochastic system by invoking impulses, while the opponent is given the opportunity to stop the game prematurely. Importantly, by developing new techniques within the realm of control randomization, we demonstrate that the value of the game exists and is precisely characterized by our non-linear Snell envelope.
ISBN (print): 9780987214355
Mining operations are affected by significant uncertainty in commodity prices, combined with geological uncertainties (both in quantity and quality of the available reserves). Technical difficulties and costs associated with ore extraction, together with a highly uncertain environment, present significant risks for the profitability of mineral projects. Optimising operating strategies in response to changing market conditions and information about the available reserves is crucial for project profitability in the face of uncertainty. A natural resource extraction problem can be viewed as a stochastic optimal control (real options) problem, with the extraction rate representing a control variable. In a finite-horizon, finite-reserve setting, an additional complexity arises from the need to consider a large number of feasible remaining reserve levels, which significantly increases the computational complexity of the algorithms. Natural resource extraction problems have attracted the attention of researchers in the fields of real options and stochastic optimal control since the 1980s. However, there is still no computational framework available that would allow realistic high-dimensional real options problems in the minerals industry to be solved. Over the last decade, the approach based on value function approximation via basis functions has attracted significant attention in financial applications, and has given rise to a class of methods known as regression Monte Carlo methods. Regression Monte Carlo is a very versatile simulation-based technique. It can deal with a rich description of the mining problem, and very elaborate models for the risk factors. In this paper, we propose to combine several crucial improvements to make the regression Monte Carlo method practical for multi-dimensional models: 1) we avoid the discretisation of the reserve level by using the control randomization technique. First, the reserve is replaced by a dummy random factor during the forward
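The regression Monte Carlo idea the abstract refers to can be illustrated with its best-known instance, the Longstaff-Schwartz algorithm for optimal stopping. The sketch below is not the paper's multi-dimensional mining algorithm: it prices a Bermudan put under geometric Brownian motion, approximating the continuation value at each exercise date by a polynomial regression over simulated paths; all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def bermudan_put_rmc(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0,
                     n_steps=50, n_paths=20000, degree=3):
    """Regression Monte Carlo (Longstaff-Schwartz) price of a Bermudan put.

    Paths of geometric Brownian motion are simulated forward; the backward
    pass regresses discounted future values on a polynomial basis in the
    spot price to estimate continuation values, and exercises where the
    immediate payoff exceeds the estimated continuation value.
    """
    dt = T / n_steps
    # forward pass: simulate GBM paths on the exercise grid
    z = rng.standard_normal((n_paths, n_steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * z, axis=1))
    payoff = np.maximum(K - S, 0.0)
    value = payoff[:, -1]                     # exercise value at maturity
    # backward pass over earlier exercise dates
    for t in range(n_steps - 2, -1, -1):
        value *= np.exp(-r * dt)              # discount one step back
        itm = payoff[:, t] > 0                # regress on in-the-money paths only
        if itm.sum() > degree + 1:
            coef = np.polyfit(S[itm, t], value[itm], degree)
            cont = np.polyval(coef, S[itm, t])
            ex = payoff[itm, t] > cont        # exercise beats continuation
            idx = np.where(itm)[0][ex]
            value[idx] = payoff[idx, t]
    return np.exp(-r * dt) * value.mean()     # discount from first date to 0

price = bermudan_put_rmc()
```

The paper's contribution addresses exactly the gap this toy example leaves open: here the only state is the spot price, whereas a mining problem adds a reserve level, which control randomization handles without discretising it.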