Details
ISBN: (print) 9783031064302; 9783031064296
The paper presents an end-to-end approach that estimates an ordered list of 3D key-points from images. Most existing methods use either point clouds or multiple RGB/depth images to estimate 3D key-points, whereas the proposed approach requires only a single-view RGB image. It consists of three steps: extracting latent codes, computing pixel-wise features, and estimating 3D key-points. It also computes a confidence score for every key-point, which lets the network predict a different number of key-points depending on an object's shape. As a result, unlike existing approaches, the network can be trained on several object categories at once. For evaluation, we first estimate 3D key-points for two views of an object and then use them to find the relative pose between the views. The results show that the average angular distance error of our approach (6.39 degrees) is 8.01 degrees lower than that of KP-Net (14.40 degrees) [1].
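The evaluation step can be illustrated with a minimal sketch. The abstract does not say how the relative pose is recovered from the two key-point sets, so the code below assumes a standard least-squares alignment (the Kabsch algorithm) over corresponding key-points and measures the error as the geodesic angle between rotations. The function names `relative_rotation` and `angular_distance_deg`, and the optional use of confidence scores as weights, are hypothetical and not taken from the paper.

```python
import numpy as np


def relative_rotation(kp_a, kp_b, weights=None):
    """Least-squares rotation R aligning centred kp_a to centred kp_b (Kabsch).

    kp_a, kp_b: (N, 3) arrays of corresponding, identically ordered key-points.
    weights: optional (N,) non-negative weights, e.g. key-point confidences
             (an assumption; the abstract does not describe such weighting).
    """
    kp_a = np.asarray(kp_a, dtype=float)
    kp_b = np.asarray(kp_b, dtype=float)
    if weights is None:
        weights = np.ones(len(kp_a))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Remove the (weighted) centroids so only the rotation remains.
    a = kp_a - w @ kp_a
    b = kp_b - w @ kp_b

    # 3x3 weighted cross-covariance and its SVD.
    h = (a * w[:, None]).T @ b
    u, _, vt = np.linalg.svd(h)

    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


def angular_distance_deg(r_est, r_true):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    cos_theta = (np.trace(r_est.T @ r_true) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
```

Under these assumptions, `relative_rotation(kp_view1, kp_view2)` would give the estimated rotation between two views, and averaging `angular_distance_deg` against the ground-truth rotations over a test set would yield an error figure of the kind reported above. The sketch relies on the key-points being ordered, so that row i in both arrays refers to the same key-point.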