The attention mechanism plays a pivotal role in designing advanced super-resolution (SR) networks. In this work, we design an efficient SR network by improving the attention mechanism. We start from a simple pixel att...
ISBN (digital): 9781728163956
ISBN (print): 9781728163963
Multi-modality sensor fusion is an active research area in scene understanding. In this work, we explore methods for fusing an RGB image with its semantic-map for depth estimation. LiDAR, Kinect, and TOF depth sensors are unable to predict the depth-map on illuminated surfaces and surfaces with monotonous patterns. In this paper, we propose a semantic-to-depth generative adversarial network (S2D-GAN) for depth estimation from an RGB image and its semantic-map. In the first stage, the proposed S2D-GAN estimates a coarse-level depth-map using a semantic-to-coarse-depth generative adversarial network (S2CD-GAN), while the second stage estimates the fine-level depth-map using a cascaded multi-scale spatial pooling network. Experimental analysis on the NYU-Depth-V2 dataset shows that the proposed S2D-GAN outperforms existing single-image depth estimation methods and methods that use RGB with sparse samples. The proposed S2D-GAN also gives efficient results on real-world indoor and outdoor image depth estimation.
Recent advances in single image super-resolution (SISR) have achieved extraordinary performance, but the computational cost is too heavy to apply in edge devices. To alleviate this problem, many novel and effective so...
The detection and removal of precancerous polyps through colonoscopy is the primary technique for the prevention of colorectal cancer worldwide. However, the miss rate of colorectal polyp varies significantly among th...
This paper presents a systematic literature review of image datasets for document image analysis, focusing on historical documents, such as handwritten manuscripts and early prints. Finding appropriate datasets for hi...
Existing Multimodal Large Language Models (MLLMs) encounter significant challenges in modeling the temporal context within long videos. Currently, mainstream Agent-based methods use external tools (e.g., search engine...
Transformer-based methods have shown impressive performance in low-level vision tasks, such as image super-resolution. However, we find that these networks can only utilize a limited spatial range of input information...
Under-Display Camera (UDC) has been widely exploited to help smartphones realize full-screen displays. However, as the screen could inevitably affect the light propagation process, the images captured by the UDC syste...
Generative Adversarial Networks (GAN) have demonstrated the potential to recover realistic details for single image super-resolution (SISR). To further improve the visual quality of super-resolved results, PIRM2018-SR...
Tiny Actions Challenge focuses on understanding human activities in real-world surveillance. Basically, there are two main difficulties for activity recognition in this scenario. First, human activities are often reco...