Details
ISBN: (print) 9798350386851; 9798350386844
In computer vision, monocular depth estimation is an important topic. Recently, models based on convolutional neural networks (CNNs) have shown reasonable results with end-to-end encoder-decoder architectures. In our prior work, we proposed non-local decoder-squeeze-and-excitation (NL-DSE) [1]. NL-DSE is built on an Efficient-Net-B5 encoder network, but its computational complexity is still high. In this paper, we aim to achieve lightweight depth estimation. To this end, we replace Efficient-Net-B5 with different encoder networks and compare the performance of the resulting modules. We evaluate the accuracy of each module on the NYU Depth V2 dataset and use an Nvidia AGX Xavier as our edge device to measure FLOPs and frame rate. Finally, we select Efficient-Net-B0 as the encoder network to achieve lightweight monocular depth estimation.
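The abstract does not state which accuracy metrics are reported. As a minimal sketch, the function below computes three metrics that are commonly used for depth estimation benchmarks such as NYU Depth V2: absolute relative error, root-mean-square error, and the fraction of pixels whose prediction-to-truth ratio is below 1.25. The function name `depth_metrics`, the flat-list input format, and the 10 m depth cap are assumptions made for illustration and are not taken from the paper.

```python
import math


def depth_metrics(pred, gt, min_depth=1e-3, max_depth=10.0):
    """Compute common depth-accuracy metrics over paired depth values in meters.

    `pred` and `gt` are flat sequences of per-pixel depths (hypothetical input
    format). The default depth range is an assumption, not the paper's setting.
    """
    pairs = []
    for p, g in zip(pred, gt):
        # Skip pixels whose ground truth is missing or out of range.
        if not (min_depth <= g <= max_depth):
            continue
        # Clamp predictions to the valid range so the ratios stay finite.
        p = min(max(p, min_depth), max_depth)
        pairs.append((p, g))
    if not pairs:
        raise ValueError("no valid ground-truth pixels")

    n = len(pairs)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Threshold accuracy: share of pixels with max(p/g, g/p) < 1.25.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}
```

Running the same function on the predictions of each candidate encoder would give one comparable accuracy row per encoder, which can then be set against the FLOPs and frame rate measured on the edge device.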