Conference proceedings front matter may contain various advertisements, welcome messages, committee or program information, and other miscellaneous conference information. This may in some cases also include the cover art, table of contents, copyright statements, title page or half-title pages, blank pages, venue maps, or other general information relating to the conference that was part of the original conference proceedings.
Detailed information
ISBN:
(Print) 9789819607822; 9789819607839
This paper investigates a challenging scene in robot daily grasping services, where the target object is subject to obstruction and visual ambiguity. Specifically, the fine-grained visual information of the target object, e.g., its text information, is obstructed. Besides, the scene contains multiple objects whose appearance is similar to that of the target object. To tackle this issue, this paper proposes an active target location and grasping framework based on joint language-vision-action. Firstly, we take the textual label information of the target object as a clue to guide the robot to explore objects. Then, to obtain more fine-grained text information about the objects in the scene, we enable the robot to actively pick up an object for observation, as humans do. Consequently, the robot can distinguish the target object from multiple similar objects using the detailed textual context and eliminate visual ambiguity. Finally, extensive experiments are conducted in both simulation and real-world scenes to verify the effectiveness of the proposed active target location and grasping system.
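The following is a minimal Python sketch of how such an active, language-guided disambiguation loop could be organized. All names (Candidate, HIDDEN_LABELS, read_text_after_pickup, locate_and_grasp) and the simulated stubs are illustrative assumptions, not the paper's actual implementation; a real system would command a robot arm and run OCR instead of the stand-ins used here.

# Hypothetical sketch of an active target location and grasping loop of the
# kind described in the abstract. All interfaces below are assumptions for
# illustration only, not the authors' method or API.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Candidate:
    object_id: int
    appearance_score: float   # visual similarity to the target description
    visible_text: str         # text readable from the current (possibly occluded) view

# Simulated ground-truth labels that only become visible after picking an object up.
HIDDEN_LABELS = {0: "oolong tea 500ml", 1: "green tea 500ml"}

def read_text_after_pickup(candidate: Candidate) -> str:
    """Stub: pick up the object and re-image the previously occluded label."""
    return HIDDEN_LABELS.get(candidate.object_id, "")

def locate_and_grasp(candidates: List[Candidate], target_label: str) -> Optional[int]:
    """Return the id of the object whose label text matches the target clue."""
    # 1) Keep only objects that look like the target (appearance alone is ambiguous).
    lookalikes = [c for c in candidates if c.appearance_score > 0.5]

    # 2) Actively inspect each look-alike: if its text is occluded, pick it up
    #    to expose the label, mimicking how a human would check a package.
    for c in sorted(lookalikes, key=lambda c: -c.appearance_score):
        text = c.visible_text or read_text_after_pickup(c)
        if target_label.lower() in text.lower():
            return c.object_id   # disambiguated: this is the target to grasp
    return None                  # target not found among the look-alikes

if __name__ == "__main__":
    scene = [
        Candidate(0, 0.9, ""),            # looks right, label occluded
        Candidate(1, 0.8, "green tea"),   # look-alike with a different label
        Candidate(2, 0.2, "detergent"),   # visually dissimilar, filtered out
    ]
    print(locate_and_grasp(scene, "oolong tea"))  # -> 0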