Author affiliations: Swiss Fed Inst Technol, CH-8092 Zurich, Switzerland; Google, CH-8002 Zurich, Switzerland; Stanford Univ, Stanford, CA 94305, USA
Publication: IEEE ROBOTICS AND AUTOMATION LETTERS (IEEE Robot. Autom. Lett.)
Year/Volume/Issue: 2025, Vol. 10, No. 3
Pages: 2558-2565
Core indexing:
Subject classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0811 [Engineering - Control Science and Engineering]
Funding: SNSF PostDoc.Mobility Fellowship; Innosuisse [48727.1 IP-ICT]; Swiss National Science Foundation Advanced Grant; Max Planck ETH Center for Learning Systems (CLS)
Keywords: Three-dimensional displays; Image segmentation; Search problems; Object recognition; Robots; Image reconstruction; Geometry; Solid modeling; Internet; Instance segmentation; Object detection; RGB-D perception; segmentation and categorization; semantic scene understanding
Abstract: Open-vocabulary 3D segmentation enables exploration of 3D spaces using free-form text descriptions. Existing methods for open-vocabulary 3D instance segmentation primarily focus on identifying object-level instances but struggle with finer-grained scene entities such as object parts, or regions described by generic attributes. In this work, we introduce Search3D, an approach to construct hierarchical open-vocabulary 3D scene representations, enabling 3D search at multiple levels of granularity: fine-grained object parts, entire objects, or regions described by attributes like materials. Unlike prior methods, Search3D shifts towards a more flexible open-vocabulary 3D search paradigm, moving beyond explicit object-centric queries. For systematic evaluation, we further contribute a scene-scale open-vocabulary 3D part segmentation benchmark based on MultiScan, along with a set of open-vocabulary fine-grained part annotations on ScanNet++. Search3D outperforms baselines in scene-scale open-vocabulary 3D part segmentation, while maintaining strong performance in segmenting 3D objects and materials.
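To give a concrete picture of what a hierarchical open-vocabulary scene representation queried at multiple granularities could look like, the following Python sketch is illustrative only: the SceneNode structure, the embed_text() placeholder encoder, and the search() routine are assumptions introduced for this example and do not reproduce the Search3D method described in the abstract.

```python
# Minimal, hypothetical sketch of a hierarchical open-vocabulary scene
# representation queried by text-embedding similarity. All names below
# (SceneNode, embed_text, search) are assumptions for illustration only
# and are NOT the Search3D implementation.
from dataclasses import dataclass, field
from typing import List
import numpy as np


def embed_text(text: str, dim: int = 8) -> np.ndarray:
    """Placeholder for a vision-language text encoder (e.g. a CLIP-style model).
    Here we simply map the string to a pseudo-random unit vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)


@dataclass
class SceneNode:
    """One entity in the hierarchy: a region, an object, or an object part."""
    name: str                      # e.g. "chair", "chair leg", "wooden surface"
    level: str                     # "region" | "object" | "part"
    embedding: np.ndarray          # open-vocabulary feature for this entity
    mask: np.ndarray               # indices of 3D points belonging to the entity
    children: List["SceneNode"] = field(default_factory=list)


def search(root: SceneNode, query: str, threshold: float = 0.25):
    """Return all nodes, at any granularity, whose embedding matches the query."""
    q = embed_text(query)
    hits = []
    stack = [root]
    while stack:
        node = stack.pop()
        score = float(node.embedding @ q)   # cosine similarity of unit vectors
        if score >= threshold:
            hits.append((score, node.level, node.name))
        stack.extend(node.children)
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    leg = SceneNode("chair leg", "part", embed_text("chair leg"), np.array([7, 8]))
    chair = SceneNode("chair", "object", embed_text("chair"),
                      np.array([5, 6, 7, 8]), [leg])
    room = SceneNode("living room", "region", embed_text("living room"),
                     np.arange(10), [chair])
    print(search(room, "wooden chair leg"))
```

The design point this toy example tries to convey is that parts, objects, and attribute-described regions all live in one tree and share a common embedding space, so a single free-form text query can be matched against every level of granularity at once rather than against object-level instances only.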