This work is concerned with the optimal allocation of limited maintenance resources among a collection of competing multi-state systems, where the dynamics of each multi-state system are modelled by a Markov chain. Determining the optimal dynamic maintenance policy is prohibitively difficult, so we propose a heuristic dynamic maintenance policy in which maintenance resources are allocated to the systems of highest importance. The importance measure is well justified by the idea of a subsidy, but it is expensive to compute. We therefore propose two modifications of the importance measure, resulting in two modified heuristic policies. The performance of the two modified heuristics is evaluated in a systematic computational study and shown to be highly competitive.
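For illustration, a minimal sketch of the kind of importance-driven allocation described above, assuming each system is a finite Markov chain with a known cost-to-go vector V; the one-step look-ahead score, the field names, and the greedy rule are illustrative assumptions, not the paper's exact importance measure.

```python
import numpy as np

def importance_score(system, V):
    """Illustrative importance score (an assumption, not the paper's measure):
    one-step look-ahead gain of maintaining the system now versus doing
    nothing, given its current state and a cost-to-go vector V."""
    s = system["state"]
    cost_if_idle = system["P_idle"][s] @ V                        # leave the system alone
    cost_if_maint = system["c_maint"] + system["P_maint"][s] @ V  # maintain it now
    return cost_if_idle - cost_if_maint                           # larger = more worth maintaining

def allocate_maintenance(systems, value_fns, budget):
    """Greedy heuristic: spend the limited maintenance budget on the systems
    whose importance scores are highest (and positive)."""
    scores = np.array([importance_score(sys, V) for sys, V in zip(systems, value_fns)])
    ranked = np.argsort(scores)[::-1]
    return [int(i) for i in ranked[:budget] if scores[i] > 0]
```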
ISBN (print): 9781728113982
We investigate the problem of learning an efficient policy for an infinite-horizon, discounted cost, Markov decision process (MDP) with a large number of states. We compute the actions of a policy that is nearly as good as a policy chosen by a suitable oracle from a given mixture policy class characterized by the convex hull of a set of base policies. To learn the coefficients of the mixture model, we recast the problem as an approximate linear programming (ALP) formulation for MDPs, where the feature vectors correspond to the occupation measures of base policies on the state-action space. We then propose a projection-free stochastic primal-dual method with Bregman divergence to solve the resulting ALP. Furthermore, we analyze the efficiency of the proposed stochastic algorithm, namely the number of rounds required to achieve a near-optimal objective value. Numerical results show that the proposed primal-dual algorithm achieves better efficiency and lower variance across different trials compared to the penalty function method.
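The occupation-measure formulation being referred to can be sketched as follows; the notation (discount γ, costs c, initial distribution ν, base policies π_1, …, π_K) is generic and not taken from the paper.

```latex
% Exact dual LP of a discounted MDP over occupation measures \mu(s,a):
\begin{aligned}
\min_{\mu \ge 0}\ & \sum_{s,a} \mu(s,a)\, c(s,a) \\
\text{s.t.}\ & \sum_{a} \mu(s',a) \;=\; (1-\gamma)\,\nu(s')
  + \gamma \sum_{s,a} P(s' \mid s,a)\,\mu(s,a) \quad \forall s'.
\end{aligned}

% ALP with the occupation measures of base policies \pi_1,\dots,\pi_K as features:
\mu_\theta \;=\; \sum_{k=1}^{K} \theta_k\, \mu_{\pi_k},
\qquad \theta \in \Delta_K \ (\text{the probability simplex}).

% Each \mu_{\pi_k} satisfies the (linear) flow constraints, so every convex
% combination \mu_\theta is feasible, and the LP reduces to choosing the
% mixture weights \theta over \Delta_K.
```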
We formulate a discrete-time Markov decision process for a resource assignment problem for multi-skilled resources with a hierarchical skill structure, minimizing the average penalty and waiting costs for jobs with different waiting costs and uncertain service times. In contrast to most queueing models, our application leads to service times that are known before the job is actually served, but only after it has been accepted and assigned to a server. The resulting Markov decision process is intractable for problems of realistic size due to the curse of dimensionality. Using an affine approximation of the bias function, we develop a simple linear program that yields a lower bound for the minimum average costs. We suggest how the solution of the linear program can be used in a simple heuristic and illustrate its performance in numerical examples and a case study.
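A sketch of the standard way an affine bias approximation yields an LP lower bound for average-cost MDPs, in generic notation (g for the average cost, h for the bias, c for the one-step cost); the concrete state features used in the paper are not reproduced here.

```latex
% Average-cost optimality inequality: any pair (g, h) satisfying
%   g + h(s) \le c(s,a) + \sum_{s'} p(s' \mid s,a)\, h(s')   \forall (s,a)
% gives a lower bound g on the minimum long-run average cost.
% Plugging in an affine bias  h(s) \approx \theta_0 + \sum_k \theta_k s_k
% turns this into a linear program in (g, \theta):
\begin{aligned}
\max_{g,\,\theta}\ & g \\
\text{s.t.}\ & g + \theta_0 + \textstyle\sum_k \theta_k s_k
\;\le\; c(s,a) + \sum_{s'} p(s' \mid s,a)\Bigl(\theta_0 + \textstyle\sum_k \theta_k s'_k\Bigr)
\quad \forall (s,a),
\end{aligned}
% whose optimal value is the kind of lower bound on the minimum average cost
% mentioned in the abstract.
```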
The Markov Decision Process (MDP) framework is a tool for the efficient modelling and solving of sequential decision-making problems under uncertainty. However, it reaches its limits when state and action spaces are large, as can happen for spatially explicit decision problems. Factored MDPs and dedicated solution algorithms have been introduced to deal with large factored state spaces. But the case of large action spaces remains an issue. In this article, we define graph-based Markov Decision Processes (GMDPs), a particular Factored MDP framework which exploits the factorization of the state space and the action space of a decision problem. Both spaces are assumed to have the same dimension. Transition probabilities and rewards are factored according to a single graph structure, where nodes represent pairs of state/decision variables of the problem. The complexity of this representation grows only linearly with the size of the graph, whereas the complexity of exact resolution grows exponentially. We propose an approximate solution algorithm exploiting the structure of a GMDP and whose complexity only grows quadratically with the size of the graph and exponentially with the maximum number of neighbours of any node. This algorithm, referred to as MF-API, belongs to the family of approximate Policy Iteration (API) algorithms. It relies on a mean-field approximation of the value function of a policy and on a search limited to the suboptimal set of local policies. We compare it, in terms of performance, with two state-of-the-art algorithms for Factored MDPs: SPUDD and approximate linear programming (ALP). Our experiments show that SPUDD is not generally applicable to solving GMDPs, due to the size of the action space we want to tackle. On the other hand, ALP can be adapted to solve GMDPs. We show that ALP is faster than MF-API and provides solutions of similar quality for most problems. However, for some problems MF-API provides significantly better policies, and in all
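The factorisation described above can be stated compactly; the notation below is a generic sketch, with n nodes, state variable s_i and decision variable a_i at node i, and N(i) the neighbourhood of node i in the graph.

```latex
% GMDP factorisation over a graph with nodes i = 1,\dots,n, where node i
% carries a state variable s_i and a decision variable a_i, and N(i) denotes
% the neighbours of node i (including i itself):
P(s' \mid s, a) \;=\; \prod_{i=1}^{n} P_i\bigl(s'_i \,\big|\, s_{N(i)},\, a_i\bigr),
\qquad
r(s, a) \;=\; \sum_{i=1}^{n} r_i\bigl(s_{N(i)},\, a_i\bigr).

% The representation grows linearly with the number of nodes, while exact
% dynamic programming still scales with the joint state/action space,
% i.e. exponentially in n.
```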
A weakness of classical Markov decision processes (MDPs) is that they scale very poorly due to the flat state-space representation. Factored MDPs address this representational problem by exploiting problem structure to specify the transition and reward functions of an MDP in a compact manner. However, in general, solutions to factored MDPs do not retain the structure and compactness of the problem representation, forcing approximate solutions, with approximate linear programming (ALP) emerging as a promising MDP-approximation technique. To date, most ALP work has focused on the primal-LP formulation, while the dual LP, which forms the basis for solving constrained Markov problems, has received much less attention. We show that a straightforward linear approximation of the dual optimization variables is problematic, because some of the required computations cannot be carried out efficiently. Nonetheless, we develop a composite approach that symmetrically approximates the primal and dual optimization variables (effectively approximating both the objective function and the feasible region of the LP), leading to a formulation that is computationally feasible and suitable for solving constrained MDPs. We empirically show that this new ALP formulation also performs well on unconstrained problems.
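One schematic way to read the symmetric primal/dual approximation, using our own notation (Φ for primal value-function features, H ≥ 0 for dual features); the paper's exact construction may differ.

```latex
% Exact primal LP of a discounted MDP (cost minimisation, initial distribution \nu):
\max_{V}\ \nu^{\top} V
\quad \text{s.t.}\quad
V(s) \;\le\; c(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \quad \forall (s,a).

% Primal-side approximation: V \approx \Phi w restricts the objective's search
% space.  Dual-side approximation: \mu \approx H\theta with \theta \ge 0 keeps
% only the constraint combinations weighted by the nonnegative dual features,
% i.e. the exponentially many constraints above are replaced by
H^{\top}\bigl(c + \gamma P\,\Phi w - E\,\Phi w\bigr) \;\ge\; 0,
% where E copies V(s) onto every state-action pair (s,a).  The resulting LP
% has only as many variables as primal features and as many constraints as
% dual features.
```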