Author affiliation: Hong Kong University of Science and Technology (Hong Kong)
Degree: Ph.D.
Advisor: Zhang, Nevin L.
Year conferred: 2001
Subjects: Markov processes; Statistical decision; Dynamic programming; Computer algorithms
Abstract: The Partially Observable Markov Decision Process (POMDP) is a general model for sequential decision making in which the effects of actions are nondeterministic and only partial information about world states is available. Finding near-optimal solutions for POMDPs, however, is computationally difficult. Value iteration is a standard algorithm for solving POMDPs: it performs a sequence of dynamic programming (DP) updates to improve value functions. Value iteration is inefficient for two reasons. First, a DP update is expensive because it must account for all belief states in a continuous belief space. Second, value iteration needs a large number of DP updates before it converges.

This thesis investigates two ways to accelerate value iteration. The work centers on the idea of conducting DP updates, and hence value iteration, over a belief subspace, i.e., a subset of the belief space. The first use of a belief subspace is to reduce the number of DP updates value iteration needs to converge. We design a computationally cheap procedure that operates over a belief subspace consisting of a finite number of belief states, and use it as an additional step for improving value functions. Because of the extra improvements this procedure provides, value iteration performs fewer DP updates and is therefore more efficient.

The second use of a belief subspace is to reduce the complexity of DP updates themselves. We establish a framework for carrying out value iteration over a belief subspace determined by the POMDP model. Whether this subspace is smaller than the full belief space is model dependent; when it is, value iteration over the belief subspace is expected to be more efficient. Based on this framework, we study three POMDP classes with special problem characteristics and propose different value iteration algorithms for each. (1) An informative POMDP assumes that the agent always has a good idea of the world state.
The subspace determined by the model is much smaller than the belief space.
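The two ingredients the abstract refers to, belief-state tracking over a continuous belief space and a DP update restricted to a finite set of belief points, can be sketched as follows. This is an illustrative sketch only, not the thesis's algorithms: the toy two-state POMDP, its numbers, and the function names are assumptions introduced for the example.

```python
import numpy as np

# Toy two-state, two-action, two-observation POMDP; all numbers are
# illustrative assumptions, not taken from the thesis.
n_s, n_a, n_o = 2, 2, 2
T = np.array([[[0.9, 0.1], [0.1, 0.9]],      # T[a, s, s'] transition probs
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.85, 0.15], [0.15, 0.85]],  # O[a, s', o] observation probs
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],                    # R[a, s] immediate reward
              [0.0, 2.0]])
gamma = 0.95

def belief_update(b, a, o):
    """Bayes update: b'(s') is proportional to O[a,s',o] * sum_s T[a,s,s'] b(s)."""
    bp = O[a, :, o] * (T[a].T @ b)
    return bp / bp.sum()

def point_based_backup(beliefs, alphas):
    """One DP update restricted to a finite belief subspace.

    `alphas` is a list of alpha-vectors representing a piecewise-linear
    value function; one improved alpha-vector is produced per belief
    point, so the cost scales with len(beliefs) rather than with the
    whole continuous belief simplex.
    """
    new_alphas = []
    for b in beliefs:
        best_val, best_vec = -np.inf, None
        for a in range(n_a):
            g = R[a].copy()
            for o in range(n_o):
                # Back-project every alpha-vector through (a, o) and keep
                # the one that is best at this particular belief point.
                cands = [gamma * (T[a] @ (O[a, :, o] * alpha))
                         for alpha in alphas]
                g = g + max(cands, key=lambda c: b @ c)
            if b @ g > best_val:
                best_val, best_vec = b @ g, g
        new_alphas.append(best_vec)
    return new_alphas

# A finite belief subspace and one cheap improvement step from a
# pessimistic (all-zero) initial value function.
beliefs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
alphas = point_based_backup(beliefs, [np.zeros(n_s)])
```

Because each backup touches only the chosen belief points, such a step is cheap enough to interleave with full DP updates, which is the spirit of the abstract's first use of a belief subspace.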