
Dual-domain deformable feature fusion for multi-modal 3D object detection

Author affiliations: Chongqing Jiaotong Univ, Sch Mechatron & Vehicle Engn, Chongqing, Peoples R China; Chongqing Jiaotong Univ, Sch Aeronaut, Chongqing, Peoples R China; Chongqing Key Lab Green Aviat Energy & Power, Chongqing, Peoples R China; Chongqing Jiaotong Univ, Green Aerotech Res, Chongqing, Peoples R China

Publication: Journal of Electronic Imaging (J. Electron. Imaging)

Year, volume, issue: 2024, Vol. 33, No. 6

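Subject classification: 0808 [Engineering - Electrical Engineering]; 1002 [Medicine - Clinical Medicine]; 0809 [Engineering - Electronic Science and Technology (degrees in engineering or science may be conferred)]; 08 [Engineering]; 0702 [Science - Physics]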

Funding: National Natural Science Foundation of China; Major Project of Science and Technology Research Program of Chongqing Education Commission of China [KJZD-M202400703]; Natural Science Ranking Projects of Chongqing Jiaotong University [XJ2023000701]; Team Building Project for Graduate Tutors in Chongqing [JDDSTD2022007]; Joint Training Base Construction Project for Graduate Students in Chongqing [JDLHPYJD2022001, JDLHPYJD2023002]

Keywords: 3D object detection; multi-modal fusion; feature alignment; deformable attention; autonomous driving

Abstract: Recent advances in 3D object detection based on light detection and ranging (LiDAR)-camera fusion have improved perception for autonomous driving. However, aligning LiDAR and image data during multi-modal fusion remains a significant challenge. We propose a novel multi-modal feature alignment and fusion architecture that aligns and fuses voxel and image data. The architecture comprises four key modules. Z-axis attention aggregates voxel features along the vertical axis using self-attention. The voxel-domain deformable encoder uses deformable attention to encode voxel features and improve context understanding. Dual-domain deformable feature alignment uses deformable attention to adaptively align voxel and image features, addressing resolution mismatches. Finally, gated fusion uses a gating mechanism to dynamically fuse the aligned features. The multi-layer design further improves feature detail retention and dual-domain fusion performance. Experimental results show that our method increases average precision by 2.41% at the hard difficulty level for cars on the KITTI test set. On the KITTI validation set, mean average precision improves by 1.06% for cars, 6.88% for pedestrians, and 1.83% for cyclists. © 2024 SPIE and IS&T
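
Illustrative note: the abstract names a gated fusion step but does not give its formulation. The following is a minimal sketch of one common gating scheme, assuming the voxel and image features have already been aligned to the same length: a sigmoid gate is computed from the concatenated features and used to blend the two channel by channel. The function name `gated_fusion` and the parameters `weights` and `bias` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(voxel_feat, image_feat, weights, bias):
    """Blend two aligned feature vectors of equal length C.

    weights: C rows, each of length 2*C (hypothetical learned parameters).
    bias:    C values (hypothetical learned parameters).
    Returns a list of C fused values.
    """
    if len(voxel_feat) != len(image_feat):
        raise ValueError("features must be aligned to the same length")
    concat = list(voxel_feat) + list(image_feat)
    fused = []
    for c, (v, i) in enumerate(zip(voxel_feat, image_feat)):
        # Per-channel gate computed from both modalities.
        logit = sum(w * x for w, x in zip(weights[c], concat)) + bias[c]
        g = sigmoid(logit)
        # g near 1 favors the voxel feature, g near 0 favors the image feature.
        fused.append(g * v + (1.0 - g) * i)
    return fused


voxel = [1.0, 2.0]
image = [3.0, 0.0]
zero_w = [[0.0] * 4 for _ in range(2)]
# With zero weights and bias every gate is 0.5, so the result is the mean.
print(gated_fusion(voxel, image, zero_w, [0.0, 0.0]))  # [2.0, 1.0]
```

In a trained network the gate parameters would be learned, so the balance between modalities can vary per channel and per location; the published method may compute its gate differently.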
