Search Results - Inner Mongolia University Library

Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes

SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL, 2008, vol. 84, no. 12, pp. 577-600

Authors: Bhatnagar, Shalabh; Abdulla, Mohammed Shahid (Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India; Gen Motors India Sci Lab, Bangalore, Karnataka, India)

We develop four simulation-based algorithms for finite-horizon Markov decision processes. Two of these algorithms are developed for finite state and compact action spaces while the other two are for finite state and finite action spaces. Of the former two, one algorithm uses a linear parameterization for the policy, resulting in reduced memory complexity. Convergence analysis is briefly sketched and illustrative numerical experiments with the four algorithms are shown for a problem of flow control in communication networks.

Keywords: Finite-horizon Markov decision processes; simulation-based algorithms; two-timescale stochastic approximation; function approximation; actor-critic algorithms; normalized Hadamard matrices

ON THE RATES OF CONVERGENCE OF SIMULATION-BASED OPTIMIZATION ALGORITHMS FOR OPTIMAL STOPPING PROBLEMS

ANNALS OF APPLIED PROBABILITY, 2011, vol. 21, no. 1, pp. 215-239

Author: Belomestny, Denis (Weierstrass Inst Appl Anal & Stochast, D-10117 Berlin, Germany)

In this paper, we study simulation-based optimization algorithms for solving discrete time optimal stopping problems. Using large deviation theory for the increments of empirical processes, we derive optimal convergence rates for the value function estimate and show that they cannot be improved in general. The rates derived provide a guide to the choice of the number of simulated paths needed in the optimization step, which is crucial for the good performance of any simulation-based optimization algorithm. Finally, we present a numerical example of solving an optimal stopping problem arising in finance that illustrates our theoretical findings.

Keywords: Optimal stopping; simulation-based algorithms; exponential inequalities; empirical processes; delta-entropy with bracketing

Learning algorithms for Markov decision processes with average cost

SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2001, vol. 40, no. 3, pp. 681-698

Authors: Abounadi, J; Bertsekas, D; Borkar, VS (MIT, Informat & Decis Syst Lab, Cambridge, MA 02139, USA; Tata Inst Fundamental Res, Sch Technol & Comp Sci, Mumbai 400005, India)

This paper gives the first rigorous convergence analysis of analogues of Watkins's Q-learning algorithm, applied to average cost control of finite-state Markov chains. We discuss two algorithms which may be viewed as stochastic approximation counterparts of two existing algorithms for recursively computing the value function of the average cost problem: the traditional relative value iteration (RVI) algorithm and a recent algorithm of Bertsekas based on the stochastic shortest path (SSP) formulation of the problem. Both synchronous and asynchronous implementations are considered and analyzed using the ODE method. This involves establishing asymptotic stability of associated ODE limits. The SSP algorithm also uses ideas from two-time-scale stochastic approximation.

Keywords: simulation-based algorithms; Q-learning; controlled Markov chains; average cost control; stochastic approximation; dynamic programming

Evolutionary Markov chain Monte Carlo algorithms for optimal monitoring network designs

STATISTICAL METHODOLOGY, 2012, vol. 9, no. 1-2, pp. 185-194

Authors: Ruiz-Cardenas, Ramiro; Ferreira, Marco A. R.; Schmidt, Alexandra M. (Univ Missouri, Dept Stat, Columbia, MO 65211, USA; Univ Fed Minas Gerais, Dept Estat, Belo Horizonte, MG, Brazil; Univ Fed Rio de Janeiro, Inst Matemat, BR-21941 Rio De Janeiro, Brazil)

We propose an evolutionary Markov chain Monte Carlo (eMCMC) framework for optimal design of large-scale monitoring networks. From a Bayesian decision theoretical perspective, the optimal design is the design that maximizes the expected utility. In the case of large-scale monitoring networks, the computation of the expected utility involves a very high dimensional integral with respect to future observations and unknown parameters. Based on the work by Muller and coauthors, who have developed a clever simulation-based framework for Bayesian optimal design blending MCMC with simulated annealing, we develop an algorithm that simulates a population of Markov chains, each having its own temperature. The different temperatures allow hotter chains to more easily cross valleys and colder chains to rapidly climb hills. The population evolves according to genetic operators such as mutation and crossover, allowing the chains to explore the decision space both locally and globally by exchanging information among chains. As a result, our framework explores the decision space very effectively. We illustrate the power of the methodology we propose with the optimal redesign of a network of monitoring stations for spatiotemporal ground-level ozone in the eastern USA. (C) 2011 Elsevier B.V. All rights reserved.

Keywords: Expected utility maximization; Genetic algorithms; Ozone network design; simulation-based algorithms

The actor-critic algorithm as multi-time-scale stochastic approximation

SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 1997, vol. 22, no. 4, pp. 525-543

Authors: Borkar, VS; Konda, VR (Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India)

The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues and an exa...

Keywords: actor-critic algorithm; stochastic approximation; Markov decision processes; simulation-based algorithms; policy iteration

Network Coding-Based Wireless Media Transmission Using POMDP

17th International Packet Video Workshop

Authors: Nguyen, Dong; Nguyen, Thinh (Oregon State Univ, Sch Elect Engn & Comp Sci, Corvallis, OR 97331, USA)

ISBN: (print) 9781424446513

We consider the problem of joint network coding and packet scheduling for multimedia transmission from the Access Point (AP) to multiple receivers in 802.11 networks. The state of receivers is described by a hidden Markov model and the AP acts as a decision maker which employs a partially observable Markov decision process (POMDP) to optimize the media transmission. Importantly, we introduce a simulation-based dynamic programming algorithm as a solution tool for our POMDP abstract. Our simulation-based algorithm simplifies the modeling process and reduces the computational complexity of the solution process. Our simulation results demonstrate that the proposed scheme provides higher performance than both the network coding scheme without optimization techniques and the traditional retransmission scheme.

Keywords: Wireless multimedia streaming; network coding; scheduling; simulation-based algorithms; hidden Markov model; partially observable Markov decision process

SOLVING OPTIMAL STOPPING PROBLEMS VIA EMPIRICAL DUAL OPTIMIZATION

ANNALS OF APPLIED PROBABILITY, 2013, vol. 23, no. 5, pp. 1988-2019

Author: Belomestny, Denis (Duisburg Essen Univ, D-45127 Essen, Germany)

In this paper we consider a method of solving optimal stopping problems in discrete and continuous time based on their dual representation. A novel and generic simulation-based optimization algorithm not involving nested simulations is proposed and studied. The algorithm involves the optimization of a genuinely penalized dual objective functional over a class of adapted martingales. We prove the convergence of the proposed algorithm and demonstrate its efficiency for optimal stopping problems arising in option pricing.

Keywords: Optimal stopping; simulation-based algorithms; functional optimization; empirical variance; self-normalized processes
