arXiv

TiGDistill-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning Distillation

Authors: Xu, Shaoqing; Li, Fang; Huang, Peixiang; Song, Ziying; Yang, Zhi-Xin

Affiliations: State Key Laboratory of Internet of Things for Smart City, Department of Electromechanical Engineering, University of Macau, 999078, China; School of Mechanical Engineering, Beijing Institute of Technology, China; College of Engineering, Peking University, Beijing, China; School of Computer and Information Technology, Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China

Publication: arXiv

Year/Volume/Issue: 2024


Subject: Benchmarking

Abstract: Accurate multi-view 3D object detection is essential for applications such as autonomous driving. Researchers have consistently aimed to leverage LiDAR’s precise spatial information to enhance camera-based detectors through methods such as depth supervision and bird’s-eye-view (BEV) feature distillation. However, existing approaches often face challenges due to the inherent differences between LiDAR and camera data representations. In this paper, we introduce TiGDistill-BEV, a novel approach that bridges this gap by leveraging the strengths of both sensors. Our method distills knowledge from a teacher model built on other modalities (e.g., LiDAR) into a camera-based student detector, using the Target Inner-Geometry learning scheme to enhance camera-based BEV detectors through both depth and BEV features. Specifically, we propose two key modules: an inner-depth supervision module that learns the low-level relative depth relations within objects, equipping detectors with a deeper understanding of object-level spatial structures, and an inner-feature BEV distillation module that transfers high-level semantics of different keypoints within foreground targets. To further alleviate the domain gap, we incorporate both inter-channel and inter-keypoint distillation to model feature similarity. Extensive experiments on the nuScenes benchmark demonstrate that TiGDistill-BEV significantly boosts camera-only detectors, achieving state-of-the-art performance with 62.8% NDS and surpassing previous methods by a significant margin. The code is available at: https://***/Public-BOTs/*** Copyright © 2024, The Authors. All rights reserved.
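Illustrative note: the abstract mentions inter-channel and inter-keypoint distillation that models feature similarity. The sketch below shows one plausible reading of that idea, not the authors' implementation. It assumes that K keypoint features with C channels have already been sampled from one foreground target in both the teacher and the student BEV maps, that similarity means cosine similarity, and that the two relation matrices are compared with a mean squared error. The function names `cosine_similarity_matrix` and `relation_distill_loss` are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(x: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of x (shape: n x d)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.maximum(norms, 1e-12)  # guard against all-zero rows
    return unit @ unit.T


def relation_distill_loss(student: np.ndarray, teacher: np.ndarray) -> dict:
    """Compare similarity structure rather than raw feature values.

    student, teacher: (K, C) arrays of keypoint features for one target.
    Assumption: both use the same K and C; the loss form is illustrative.
    """
    # Inter-keypoint relations: K x K similarity between keypoints.
    kp_diff = cosine_similarity_matrix(student) - cosine_similarity_matrix(teacher)
    # Inter-channel relations: C x C similarity between channels.
    ch_diff = cosine_similarity_matrix(student.T) - cosine_similarity_matrix(teacher.T)
    return {
        "inter_keypoint": float(np.mean(kp_diff ** 2)),
        "inter_channel": float(np.mean(ch_diff ** 2)),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    student_feats = rng.normal(size=(9, 16))  # 9 keypoints, 16 channels (made-up sizes)
    teacher_feats = rng.normal(size=(9, 16))
    print(relation_distill_loss(student_feats, teacher_feats))
```

Because only relative similarities are matched, a student whose features differ from the teacher's by a global scale incurs no penalty under this sketch, which is one way such a loss can tolerate differences between LiDAR and camera feature distributions.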
