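Author affiliations: Tencent LightSpeed Studios, Shenzhen 518054, Peoples R China; Nankai Univ, Coll Comp Sci, Tianjin 300350, Peoples R China
Publication: IEEE ROBOTICS AND AUTOMATION LETTERS (IEEE Robot. Autom.)
Year / Volume / Issue: 2023, Vol. 8, No. 4
Pages: 2229-2236
Core indexing:
Subject classification: 0808 [Engineering: Electrical Engineering]; 08 [Engineering]; 0811 [Engineering: Control Science and Engineering]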
Topics: Navigation; Task analysis; Heuristic algorithms; Trajectory; Training; Kernel; Mathematical models; Multi-robot systems; robotics and automation; swarm robotics
Abstract: We present an end-to-end differentiable learning algorithm for multi-agent navigation policies. Compared with prior model-free learning algorithms, our method leads to a significant speedup via the gradient information. Our key innovation lies in a novel differentiability analysis of the optimization-based crowd simulation algorithm via the implicit function theorem. Inspired by continuum multi-agent modeling techniques, we further propose a kernel-based policy parameterization, allowing our learned policy to scale up to an arbitrary number of agents without re-training. We evaluate our algorithm on two tasks in obstacle-rich environments, partially labeled navigation and evacuation, for which loss functions can be defined, making the entire task learnable in an end-to-end manner. The results show that our method can achieve more than one order of magnitude speedup over model-free baselines and readily scale to unseen target configurations and agent sizes.
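Illustrative note: the abstract refers to differentiating an optimization-based simulation step via the implicit function theorem. The sketch below shows that general idea on a one-dimensional toy problem only. The energy `E`, the function names, and the parameter `theta` are all hypothetical and are not taken from the paper. If x*(theta) minimizes E(x, theta), then the stationarity condition g(x, theta) = dE/dx = 0 gives dx*/dtheta = -(dg/dx)^(-1) (dg/dtheta), so the gradient is obtained without backpropagating through the solver iterations.

```python
# Hypothetical toy energy, NOT the crowd-simulation objective from the paper:
#   E(x, theta) = 0.5 * (x - theta)**2 + 0.25 * x**4
# Its minimizer x*(theta) satisfies g(x, theta) = dE/dx = (x - theta) + x**3 = 0.


def solve_minimizer(theta, iters=50):
    """Find x*(theta) with Newton's method on g(x, theta) = 0."""
    x = 0.0
    for _ in range(iters):
        g = (x - theta) + x**3
        dg_dx = 1.0 + 3.0 * x**2  # always >= 1, so the division is safe
        x -= g / dg_dx
    return x


def implicit_gradient(theta):
    """dx*/dtheta via the implicit function theorem at the solution."""
    x = solve_minimizer(theta)
    dg_dx = 1.0 + 3.0 * x**2
    dg_dtheta = -1.0
    return -dg_dtheta / dg_dx


def finite_difference_gradient(theta, eps=1e-6):
    """Central-difference check that re-solves the problem twice."""
    hi = solve_minimizer(theta + eps)
    lo = solve_minimizer(theta - eps)
    return (hi - lo) / (2.0 * eps)


if __name__ == "__main__":
    theta = 2.0
    print("x*          :", solve_minimizer(theta))
    print("implicit    :", implicit_gradient(theta))
    print("finite diff :", finite_difference_gradient(theta))
```

For theta = 2 the minimizer is x* = 1 (since 1 - 2 + 1 = 0), so the implicit gradient is 1 / (1 + 3) = 0.25, which the finite-difference check should match closely.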