Authors:
Zhou, Beibei; Xie, Jin; Jin, Zhong; Kong, Hui
Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, PCA Lab, Key Lab Intelligent Percept & Syst High Dimens Inf, Nanjing 210094, Jiangsu, Peoples R China
Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Jiangsu Key Laboratory of Image & Video Understandi, Nanjing 210094, Jiangsu, Peoples R China
Univ Macau, Dept Electromech Engn (EME), State Key Lab Internet Things Smart City (SKL IOTSC), Macau, Peoples R China
Deep neural networks have been shown to be effective for unsupervised monocular visual odometry, which predicts the camera's ego-motion from a monocular video sequence. However, most existing unsupervised monocular methods have not fully exploited the information available from both the local geometric structure and the visual appearance of the scenes, resulting in degraded performance. In this paper, a novel geometry-aware network is proposed to predict the camera's ego-motion by learning representations in both 2D and 3D space. First, to extract geometry-aware features, we design an RGB-PointCloud feature fusion module that captures both the geometric structure and the visual appearance of the scenes by fusing local geometric features from depth-map-derived point clouds with visual features from RGB images. Furthermore, the fusion module adaptively allocates different weights to the two types of features to emphasize important regions. Then, we devise a relevant feature filtering module to build consistency between the two views and preserve informative features with high relevance. This module captures the correlation between frame pairs in the feature-embedding space using attention mechanisms. Finally, the obtained features are fed into the pose estimator to recover the 6-DoF poses of the camera. Extensive experiments show that our method achieves promising results among unsupervised monocular deep learning methods on the KITTI odometry and TUM-RGBD datasets.
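The abstract mentions "depth-map-derived point clouds" without saying how they are obtained. A common way to do this, assumed here for illustration and not confirmed as the authors' procedure, is to back-project each pixel through a pinhole camera model. The function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` below are hypothetical placeholders.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into 3D points in the camera frame.

    depth: list of rows, depth[v][u] is the depth at pixel (u, v).
    fx, fy, cx, cy: assumed pinhole intrinsics (focal lengths, principal point).
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            # Pinhole model: u = fx * x / z + cx, v = fy * y / z + cy
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one invalid pixel (illustrative values only).
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

In a pipeline like the one described, the resulting points would feed the geometric branch while the RGB image feeds the visual branch; how the two are fused is specific to the paper and is not sketched here.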