Most deep learning models for image dehazing target synthetic hazy-image datasets and therefore do not account for the features of natural hazy images. Leveraging depth attention with adaptation, we propose a novel dehazi...
ISBN (print): 9781665475921
Single image deraining is an important problem in many computer vision tasks because rain streaks can severely degrade image quality. Recently, single image deraining methods based on deep convolutional neural networks (CNNs) have been developed with encouraging performance. However, most of these algorithms are designed by stacking convolutional layers, which struggle to learn abstract feature representations effectively and can only capture limited features in a local region. In this paper, we propose a recurrent multi-connection fusion network (RMCFN) to remove rain streaks from single images. Specifically, the RMCFN employs two key components and multiple connections to fully utilize and transfer features. First, we use a multi-scale fusion memory block (MFMB) to exploit multi-scale features and capture long-range dependencies, which helps feed useful information to later stages. Moreover, to efficiently capture informative features during transmission, we fuse features from different levels and use multiple connections to exploit the information within and between stages. Finally, we develop a dual attention enhancement block (DAEB) to explore the valuable channel and spatial components and pass on only the useful features. Extensive experiments verify the superiority of our method over state-of-the-art methods in both visual quality and quantitative results.
Glaucoma is the leading cause of visual impairment after cataract, and the only way to combat it is to detect it early. It is imperative to develop a system that can work effectively without a lot of equipment, qualif...
Nowadays, learned image compression has outperformed traditional coding methods like VVC. However, learned image compression methods are often optimized for specific rates and lack support for variable rate compressio...
Inspection of aircraft skin is required as per the Corrosion Prevention and Control Program (CPCP) to ensure aircraft structural integrity. Human visual inspection is the most widely used technique in aircraft surface...
ISBN (print): 9783031640698; 9783031640704
As agriculture plays a vital role in a nation's economy, early diagnosis of plant diseases is a crucial and challenging task. Soybean and cotton are the cash crops of Maharashtra; this study therefore focuses on these crops. Automatic disease recognition and classification poses a variety of challenges with respect to available datasets, tools, and image-capturing conditions, and it has received considerable attention in the past few decades. Diseases have proven to be the root cause of major crop production losses and low-quality yield. In this scenario, automatic disease recognition and classification is a critical and primary challenge for sustainable farming. Traditional methods were manual, which made them error-prone, time-consuming, and costly. In recent years, deep learning combined with image processing has achieved tremendous success in a variety of application domains, including automatic disease detection, but earlier methods did not focus on multiple diseases present on a single leaf image. In this investigation, image processing techniques are examined for their potential application in identifying cotton and soybean leaf diseases. The study evaluates the accuracy and loss of two approaches, Inception-Visual Geometry Group Network (INC-VGGN) and FACED, after various numbers of training epochs. FACED outperforms INC-VGGN for both cotton and soybean plants in terms of training and validation accuracy. FACED falls behind INC-VGGN in training accuracy at first, but by the end of 30 epochs it has caught up and even surpassed it for soybean. FACED also gives higher precision and recall than INC-VGGN.
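The abstract compares the two models by precision and recall but does not say how these were computed. A minimal sketch of the standard per-class definitions for a multi-class classifier follows; the function name and the class labels in the usage note are hypothetical and not taken from the paper.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)} from paired label lists.

    precision = TP / (TP + FP), recall = TP / (TP + FN), computed per class.
    A class with no predictions (or no true samples) gets 0.0 for that score.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        predicted = tp[cls] + fp[cls]
        actual = tp[cls] + fn[cls]
        precision = tp[cls] / predicted if predicted else 0.0
        recall = tp[cls] / actual if actual else 0.0
        scores[cls] = (precision, recall)
    return scores
```

For example, with the made-up labels `y_true = ["healthy", "blight", "blight", "rust", "healthy"]` and `y_pred = ["healthy", "blight", "rust", "rust", "blight"]`, the function reports a precision of 1.0 and a recall of 0.5 for the `"healthy"` class.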
ISBN (digital): 9789819916450
ISBN (print): 9789819916443; 9789819916450
Image-text retrieval is a complicated and challenging cross-modal task on which much progress has been made. Most existing approaches either process images and text in a single pipeline or keep the two highly entangled, which is neither practical nor human-friendly in real-world use. Moreover, the image regions extracted by Faster-RCNN are heavily over-sampled in the image pipeline, which introduces ambiguity into the extracted visual embeddings. From this point of view, we introduce the Bottom-up Transformer Reasoning Network (BTRN). Our method is built upon transformer encoders that process the image and the text separately. We also embed the tag information generated by Faster-RCNN to strengthen the connection between the two modalities. Recall at K (Recall@K) and normalized discounted cumulative gain (NDCG) are used to evaluate our model. Through various experiments, we show that our model can reach state-of-the-art results.
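The abstract names Recall at K and NDCG without defining them. The sketch below follows one common convention and is not necessarily the authors' exact protocol: Recall@K is taken as the fraction of queries whose top-K results contain at least one ground-truth item, and NDCG uses a log2 rank discount with graded relevance. All function and variable names are illustrative assumptions.

```python
import math


def recall_at_k(rankings, ground_truth, k):
    """Fraction of queries with at least one correct item in the top-k.

    rankings:     {query_id: [item_id, ...]} ordered best-first
    ground_truth: {query_id: set of correct item_ids}
    """
    if not rankings:
        return 0.0
    hits = sum(
        1
        for query, ranked in rankings.items()
        if any(item in ground_truth.get(query, set()) for item in ranked[:k])
    )
    return hits / len(rankings)


def dcg(gains):
    """Discounted cumulative gain: rank r (1-based) is discounted by log2(r + 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked, relevance, k):
    """NDCG@k for one query.

    ranked:    [item_id, ...] ordered best-first
    relevance: {item_id: graded relevance}; missing items count as 0
    """
    gains = [relevance.get(item, 0.0) for item in ranked[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A ranking that places the items in order of decreasing relevance scores an NDCG of 1.0; pushing relevant items further down the list lowers the score, whereas Recall@K only registers whether a correct item appears within the cutoff.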
Large pretrained models applied to image data have showcased significant progress in visual representation learning. However, the direct application of pretrained image models to video data is insufficient for capturi...
This paper presents an innovative pipeline design for the quantization of neural image codecs, ensuring robust and consistent deployment across various platform inferences. Our approach addresses a significant challen...
ISBN (digital): 9781665450928
ISBN (print): 9781665450928
Multi-modal image registration is a critical step in many remote sensing and visual navigation applications. Registration techniques developed for single-modality images do not perform well on multi-modal data, while techniques developed for multi-modal registration are not suitable for real-time applications. In this study, we aim to achieve real-time registration of infrared and optical (visible-range) images. Specifically, we investigate the use of an image-to-image translation deep network to convert infrared images into optical images, which are then used for keypoint-based image registration.
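The last stage of a keypoint-based registration pipeline fits a geometric transform to matched keypoints. The sketch below assumes that the matches are already available and that a 2D similarity transform (scale, rotation, translation) is an adequate model; the abstract specifies neither, and a practical system would typically add outlier rejection such as RANSAC. Function names are illustrative.

```python
def fit_similarity(src_pts, dst_pts):
    """Least-squares 2D similarity transform from matched points.

    Points are (x, y) tuples. Each point is treated as a complex number
    z = x + iy, so the transform is dst ~= a * src + b, where abs(a) is the
    scale, the phase of a is the rotation, and b is the translation.
    Returns the complex pair (a, b).
    """
    if len(src_pts) != len(dst_pts) or len(src_pts) < 2:
        raise ValueError("need at least two matched point pairs")

    p = [complex(x, y) for x, y in src_pts]
    q = [complex(x, y) for x, y in dst_pts]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)

    # Complex linear regression on the centred points.
    num = sum((qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    if den == 0:
        raise ValueError("source points are all identical")

    a = num / den
    b = q_mean - a * p_mean
    return a, b


def apply_similarity(a, b, pt):
    """Map one (x, y) point through the fitted transform."""
    z = a * complex(pt[0], pt[1]) + b
    return (z.real, z.imag)
```

With noise-free matches the fit recovers the transform exactly up to floating-point error; with noisy matches it returns the similarity transform that minimizes the summed squared distance between the mapped source points and their destinations.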