arXiv

Learning for multi-robot cooperation in partially observable stochastic environments with macro-actions

Authors: Liu, Miao; Sivakumar, Kavinayan; Omidshafiei, Shayegan; Amato, Christopher; How, Jonathan P.

Affiliations: IBM T. J. Watson Research Center, Yorktown Heights, NY, United States; Department of Electrical Engineering, Princeton University, Princeton, NJ, United States; Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA, United States; College of Computer and Information Science, Northeastern University, Boston, MA, United States

Published in: arXiv

Year: 2017

Subject: Markov processes

Abstract: This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or that a full simulator is available during planning. Previous methods that aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations involve a large team of heterogeneous robots with long planning horizons. This work addresses these gaps by proposing an iterative sampling-based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show the algorithm achieves better solution quality than state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate that the learned policies can effectively control a team of distributed robots cooperating in a partially observable stochastic environment. Copyright © 2017, The Authors. All rights reserved.
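
The abstract's core idea — learning a stochastic policy over macro-actions purely from trajectory data of observations, MAs, and rewards via an EM-style iteration — can be illustrated with a toy sketch. This is a hypothetical reward-weighted EM update for a single tabular policy, not the paper's iSEM algorithm; the domain, observation/macro-action names, and trajectories below are all illustrative assumptions.

```python
from collections import defaultdict

# Toy observation and macro-action spaces (illustrative assumptions).
OBSERVATIONS = ["clear", "victim-seen"]
MACRO_ACTIONS = ["search", "rescue"]

def em_policy_update(trajectories, policy, n_iters=20):
    """EM-style update of a tabular stochastic policy from
    trajectories of (observation, macro-action, reward) tuples."""
    for _ in range(n_iters):
        # E-step: weight each trajectory by its return times its
        # likelihood under the current policy, then accumulate
        # weighted counts of (observation, macro-action) pairs.
        counts = defaultdict(float)
        for traj in trajectories:
            ret = sum(r for _, _, r in traj)
            lik = 1.0
            for obs, ma, _ in traj:
                lik *= policy[obs][ma]
            weight = max(ret, 0.0) * lik
            for obs, ma, _ in traj:
                counts[(obs, ma)] += weight
        # M-step: renormalize weighted counts into a new policy.
        for obs in OBSERVATIONS:
            total = sum(counts[(obs, ma)] for ma in MACRO_ACTIONS)
            if total > 0.0:
                for ma in MACRO_ACTIONS:
                    policy[obs][ma] = counts[(obs, ma)] / total
    return policy

# Uniform initial policy over macro-actions.
policy = {obs: {ma: 1.0 / len(MACRO_ACTIONS) for ma in MACRO_ACTIONS}
          for obs in OBSERVATIONS}

# Toy trajectories: executing "rescue" after seeing a victim pays off.
trajectories = [
    [("clear", "search", 0.0), ("victim-seen", "rescue", 1.0)],
    [("victim-seen", "rescue", 1.0)],
    [("victim-seen", "search", 0.0)],
]

policy = em_policy_update(trajectories, policy)
```

After a few iterations the policy concentrates probability on "rescue" under the "victim-seen" observation. The actual paper learns decentralized controllers for multiple robots under partial observability, which this single-agent tabular sketch does not capture.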
