Author affiliations: The State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; The School of Computer and Information Engineering, Hubei Normal University, Huangshi 435002, China; The School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; The School of Information, Shanxi University of Finance and Economics, Taiyuan 030006, China; The CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing 100190, China; The Joint Laboratory of Intelligence Science and Technology, Institute of Systems Engineering, Macau University of Science and Technology, Taipa, China
Publication: arXiv
Year/Volume/Issue: 2024
Subject: Adversarial machine learning
Abstract: Current robot learning algorithms for acquiring novel skills often rely on demonstration datasets or environment interactions, which incur high labor costs and potential safety risks. To address these challenges, this study proposes a skill-learning framework that enables robots to acquire novel skills from natural language instructions. The proposed pipeline uses vision-language models to generate demonstration videos of novel skills; an inverse dynamics model then extracts actions from these unlabeled demonstrations. The extracted actions are subsequently mapped to environmental contexts via imitation learning, enabling robots to learn the new skills. Experimental evaluations in the MetaWorld simulation environments demonstrate the pipeline’s capability to generate high-fidelity and reliable demonstrations. Using the generated demonstrations, various skill learning algorithms achieve a task accomplishment rate three times the original on novel tasks. These results highlight a novel approach to robot learning and offer a foundation for the intuitive and intelligent acquisition of novel robotic skills. Copyright © 2024, The Authors. All rights reserved.
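The last two stages described in the abstract (labeling action-free demonstrations with an inverse dynamics model, then imitation learning on the labeled data) can be illustrated with a toy sketch. This is not the authors' implementation: the 2-D point environment, the affine least-squares models, and the names `fit_affine`, `apply_affine`, and `make_demo` are all assumptions made for illustration, and the video-generation step is replaced by hand-made state sequences.

```python
import numpy as np

def fit_affine(inputs, targets):
    """Least-squares fit of targets ~= [inputs, 1] @ weights."""
    design = np.hstack([inputs, np.ones((len(inputs), 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def apply_affine(weights, inputs):
    """Evaluate an affine model fitted by fit_affine."""
    design = np.hstack([inputs, np.ones((len(inputs), 1))])
    return design @ weights

rng = np.random.default_rng(0)
goal = np.array([1.0, -1.0])

# Step 0 (assumed toy dynamics: next_state = state + action).
# Fit the inverse dynamics model on random labeled transitions.
states = rng.normal(size=(500, 2))
actions = rng.uniform(-1.0, 1.0, size=(500, 2))
idm = fit_affine(np.hstack([states, states + actions]), actions)

# Step 1: action-free demonstrations. These hand-made state sequences
# stand in for generated demonstration videos (hypothetical substitute).
def make_demo(start, steps=20):
    frames = [start]
    for _ in range(steps):
        frames.append(frames[-1] + 0.2 * (goal - frames[-1]))
    return np.array(frames)

demos = [make_demo(rng.normal(size=2)) for _ in range(5)]

# Step 2: recover pseudo-actions from consecutive frames with the IDM.
before = np.vstack([d[:-1] for d in demos])
after = np.vstack([d[1:] for d in demos])
pseudo_actions = apply_affine(idm, np.hstack([before, after]))

# Step 3: behavior cloning, i.e. regress pseudo-actions on states.
policy = fit_affine(before, pseudo_actions)

# Roll the cloned policy out from a new start state.
state = np.zeros(2)
for _ in range(50):
    state = state + apply_affine(policy, state[None, :])[0]
final_distance = float(np.linalg.norm(state - goal))
print(f"distance to goal after rollout: {final_distance:.2e}")
```

Affine models are enough here only because the toy dynamics and the demonstrated behavior are themselves affine; a pipeline operating on real video frames would need learned image-based models in both stages.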