Dropout is designed to relieve the overfitting problem in high-level vision tasks but is rarely applied in low-level vision tasks, like image super-resolution (SR). As a classic regression problem, SR exhibits a diffe...
Interactive image restoration aims to generate restored images by adjusting a controlling coefficient which determines the restoration level. Previous works are restricted in modulating image with a single coefficient...
Learning spatial-temporal relation among multiple actors is crucial for group activity recognition. Different group activities often show the diversified interactions between actors in the video. Hence, it is often di...
Graph Convolution Network (GCN) has been successfully used for 3D human pose estimation in videos. However, it is often built on the fixed human-joint affinity, according to human skeleton. This may reduce adaptation ...
Human-Object interaction (HOI) detection aims to localize and infer relationships between human and objects in an image. It is challenging because an enormous number of possible combinations of objects and verbs types...
Recent studies often exploit Graph Convolutional Network (GCN) to model label dependencies to improve recognition accuracy for multi-label image recognition. However, constructing a graph by counting the label co-occu...
With the growth of the digital video industry, video frame interpolation has attracted continuous attention in the computer vision community and rising interest in industry. Many learning-based methods have been propose...
Self-supervised Multi-view stereo (MVS) with a pretext task of image reconstruction has achieved significant progress recently. However, previous methods are built upon intuitions, lacking comprehensive explanations a...
ISBN: 9781665428132 (print)
Self-supervised Multi-view stereo (MVS) with a pretext task of image reconstruction has achieved significant progress recently. However, previous methods are built upon intuitions, lacking comprehensive explanations about the effectiveness of the pretext task in self-supervised MVS. To this end, we propose to estimate epistemic uncertainty in self-supervised MVS, accounting for what the model ignores. Specifically, the limitations can be categorized into two types: ambiguous supervision in the foreground and invalid supervision in the background. To address these issues, we propose a novel Uncertainty reduction Multi-view Stereo (U-MVS) framework for self-supervised learning. To alleviate ambiguous supervision in the foreground, we introduce an extra correspondence prior with a flow-depth consistency loss: the dense 2D correspondence of optical flows is used to regularize the 3D stereo correspondence in MVS. To handle invalid supervision in the background, we use Monte-Carlo Dropout to acquire an uncertainty map and then filter out the unreliable supervision signals in invalid regions. Extensive experiments on the DTU and Tanks&Temples benchmarks show that our U-MVS framework achieves the best performance among unsupervised MVS methods and is competitive with its supervised counterparts.
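The idea of using Monte-Carlo Dropout to obtain an uncertainty map and filter supervision can be illustrated with a minimal sketch. This is not the U-MVS implementation: the toy per-pixel predictor `noisy_depth_model`, the helper names, the use of per-pixel variance as the uncertainty measure, and the fixed `threshold` are all assumptions made for illustration only.

```python
import random
from statistics import mean, pvariance


def noisy_depth_model(features, weights, drop_prob, rng):
    """Hypothetical toy predictor: a weighted sum per pixel, with dropout
    kept active at inference time (the core of Monte-Carlo Dropout)."""
    keep = 1.0 - drop_prob
    out = []
    for pixel in features:
        total = 0.0
        for x, w in zip(pixel, weights):
            if rng.random() < keep:        # randomly keep this unit
                total += x * w / keep      # inverted-dropout rescaling
        out.append(total)
    return out


def mc_dropout_uncertainty(features, weights, drop_prob=0.5, passes=50, seed=0):
    """Run several stochastic forward passes and return the per-pixel mean
    prediction and per-pixel variance (used here as the uncertainty map)."""
    rng = random.Random(seed)
    samples = [noisy_depth_model(features, weights, drop_prob, rng)
               for _ in range(passes)]
    per_pixel = list(zip(*samples))        # group the samples by pixel
    means = [mean(p) for p in per_pixel]
    variances = [pvariance(p) for p in per_pixel]
    return means, variances


def masked_l1_loss(pred, target, variances, threshold):
    """Average L1 error over pixels whose uncertainty is at most `threshold`.
    The thresholding rule is an assumption, not taken from the abstract."""
    kept = [abs(p - t) for p, t, v in zip(pred, target, variances)
            if v <= threshold]
    loss = sum(kept) / len(kept) if kept else 0.0
    return loss, len(kept)
```

In this sketch, pixels whose predictions vary strongly across the stochastic passes are excluded from the loss, which mirrors the abstract's description of filtering unreliable supervision signals; a real pipeline would apply the same masking to a dense depth map produced by a network with dropout layers.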
Recent studies have witnessed that self-supervised methods based on view synthesis obtain clear progress on multiview stereo (MVS). However, existing methods rely on the assumption that the corresponding points among ...