Deep learning has achieved great success in computer vision, especially in image classification tasks. How to improve the generalization ability and compactness of deep neural networks has gradually attracted widespread attention from researchers. Knowledge distillation is an effective technique for model compression. It transfers general knowledge from a sophisticated teacher model to a smaller student model. Recently, some studies have refined knowledge from feature maps or adopted complex attention mechanisms to better supervise students in imitating teachers. However, these methods focus too much on improving students' accuracy and largely overlook the associated training costs, which runs against the original intention of knowledge distillation: to compress the model. To balance performance and efficiency, in this paper we introduce a straightforward and effective distillation method that uses the deepest feature maps to enhance shallow features. Specifically, our method processes only the original feature maps, without an extra assisting network. Moreover, we use cross-layer feature fusion to enhance the attention on shallow feature maps. By visualizing the features of different layers, we demonstrate the importance of the fusion operation in our method. Our experimental results on the CIFAR-100, Tiny ImageNet and miniImageNet datasets show that our approach outperforms previous methods, especially in the balance between performance and training cost. Further ablation studies verify the effectiveness of the design.
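As a point of reference for the teacher-to-student transfer described above, here is a minimal sketch of the classic logit-based distillation loss (temperature-softened KL divergence). It is not the feature-fusion method of this abstract, whose details are not given here; the function names, the temperature value, and the example logits are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper: the T^2 factor follows the common convention of
    keeping gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Made-up logits for a three-class problem (illustration only).
teacher = [5.0, 2.0, 0.5]
student = [3.0, 2.5, 1.0]

print(distillation_loss(student, teacher))  # positive: the student differs
print(distillation_loss(teacher, teacher))  # identical logits give 0.0
```

In practice this term is usually combined with the ordinary cross-entropy loss on the ground-truth labels; feature-based methods such as the one summarized above add or substitute losses computed on intermediate feature maps.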
Enhancing the quality of low-light images is a critical area of research, and the recent advancements in this field offer significant potential for enhancing the standard of low-light images and their subsequent proce...
Cancer detection through medical image segmentation and classification is possible owing to advances in image processing techniques. Segmentation and classification tasks carried out to predict and classify dis...
Person re-identification is a cross-view pedestrian tracking and retrieval technology, which is of great significance in the field of security monitoring. Due to the different conditions of the shooting scene, there w...
This paper presents an innovative architecture based on a Cycle Generative Adversarial Network (CycleGAN) for the synthesis of high-quality depth maps from monocular images. The proposed architecture leverages a diver...
ISBN: 9781510672529 (print)
The proceedings contain 33 papers. The topics discussed include: high precious automatic balanced homodyne detector for quantum information processing based on proportional-integral-derivative; an underwater image enhancement method based on SWIN transformer; image enhancement and edge detection for defect identification using infrared thermal wave radar imaging; DNU-Net for infrared small target detection; DDCU-Net: dual dynamic convolutional U-Net for infrared small-target detection; a broadband deconvolution beamforming acceleration method; telephoto camera calibration based on robust homography matrix; research on surface defect detection of aerospace electronic components based on machine vision; and multi-channel optical module based on PLCC packaging.
Nowadays, image recognition plays a pivotal role in acquiring data via sensors. However, the adaptability of traditional algorithms is hindered by the unpredictable nature of open environments, varying sensor quality,...
Health care is a vital service that is constantly in high demand since everyone needs it. Individuals have higher hopes for advancement in this profession than for receiving the status quo since they would rather be t...
In today’s era of cloud computing, modification and tampering of digital images on cloud storage have become easier due to the proliferation of digital image processing tools. Consequently, tamper detection and i...
ISBN: 9798350323658 (print)
A key contributor to recent progress in 3D detection from single images is monocular depth estimation. Existing methods focus on how to leverage depth explicitly, by generating pseudo-point clouds or providing attention cues for image features. More recent works leverage depth prediction as a pretraining task and fine-tune the depth representation while training it for 3D detection. However, the adaptation is limited in scale by manual labels. In this work, we propose further aligning the depth representation with the target domain in an unsupervised fashion. Our methods leverage commonly available LiDAR or RGB videos at training time to fine-tune the depth representation, which leads to improved 3D detectors. In particular, when using RGB videos, we show that two-stage training, which first generates depth pseudo-labels, is critical because of the inconsistency in loss distribution between the two tasks. With either type of reference data, our multi-task learning approach improves over the state of the art on both KITTI and nuScenes, while matching the test-time complexity of its single-task sub-network. Source code and pretrained models are available on https://***/TRI-ML/DD3D.