ISBN: (print) 9789819985487; 9789819985494
Land use classification using optical and Synthetic Aperture Radar (SAR) images is a crucial task in remote sensing image interpretation. Recently, deep multi-modal fusion models have significantly enhanced land use classification by integrating multi-source data. However, existing approaches rely solely on simple fusion methods to leverage the complementary information from each modality and disregard the inter-modal correlation during feature extraction, which leads to inadequate integration of the complementary information. In this paper, we propose FASONet, a novel multi-modal fusion network consisting of two key modules that tackle this challenge from different perspectives. Firstly, the feature alignment module (FAM) facilitates cross-modal learning by aligning high-level features from both modalities, thereby enhancing the feature representation for each modality. Secondly, we introduce the multi-modal squeeze and excitation fusion module (MSEM) to adaptively fuse discriminative features by weighting each modality and removing irrelevant parts. Our experimental results on the WHU-OPT-SAR dataset demonstrate the superiority of FASONet over other fusion-based methods, with a 5.1% improvement in MIoU over the state-of-the-art MCANet method.
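The MIoU figure quoted above is a mean intersection-over-union. As a rough illustration of how such a score is usually computed, here is a minimal sketch in plain Python. The function name `mean_iou`, the flattened label lists, and the choice to skip classes absent from both maps are assumptions made for this example; the abstract does not describe the paper's exact evaluation protocol.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over the classes that occur in the ground truth or the prediction."""
    ious = []
    for c in range(num_classes):
        # Pixels where both maps agree on class c.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels where at least one map says class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened label maps (hypothetical data, three land use classes).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
miou = mean_iou(y_true, y_pred, num_classes=3)
print(f"mIoU = {miou:.3f}")
```

Per-class IoU penalizes both missed pixels and false alarms, which is why it is commonly preferred over plain pixel accuracy when class frequencies are unbalanced.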
The Internet of Things (IoT) provides a collaborative infrastructure for connecting smart devices with cloud-edge healthcare applications, medical devices, wearable biosensors, etc. Crowd counting, a computer vision approach, is an emerging topic for detecting static or moving objects in IoT environments. Smart crowd counting enables pattern recognition for many intelligent applications such as microbiology, surveillance, healthcare systems, crowdedness estimation, and other environmental case studies. Because capturing systems in IoT environments are complicated, crowd counting methods can influence the performance of object detection in critical case studies that use Artificial Intelligence (AI)-based approaches such as machine learning, deep learning, collaborative learning, fuzzy logic, and meta-heuristic algorithms. This paper provides a new comprehensive technical analysis of existing AI-based crowd counting approaches in healthcare and medical systems, biotechnology, and IoT environments. It also discusses the existing case studies, analyzing their technical aspects and the algorithms applied to enhance pattern prediction factors. Finally, it presents new research directions, challenges, and open issues.
With the development of networks, many fields now demand higher quality in specific image areas, such as main characters in photos, lesion areas in medical images, and features in remote sensing. At the same time, the...
ISBN: (digital) 9781665487399
ISBN: (print) 9781665487399
Yield forecasting has been a central task in computational agriculture because of its impact on agricultural management from the individual farmer to the government level. With advances in remote sensing technology, computational processing power, and machine learning, the ability to forecast yield has improved substantially over the past years. However, most previous work has leveraged low-resolution satellite imagery and forecast yield at the region, county, or occasionally farm level. In this work, we use high-resolution aerial imagery and output from high-precision harvesters to predict in-field harvest values for corn-raising farms in the US Midwest. By using the harvester information, we are able to cast yield forecasting as a density estimation problem and predict a harvest rate, in bushels/acre, at every pixel in the field image. In addition to improving the prediction of total yield, this approach gives farmers a detailed view of which areas of the farm may be performing poorly so that they can make the appropriate management decisions. We evaluate both traditional machine learning approaches with hand-crafted features and deep learning methods. We demonstrate the superiority of our pixel-level approach based on an encoder-decoder framework, which produces a 5.41% MAPE at the field level.
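To make the density-estimation framing concrete, the sketch below shows how a per-pixel harvest-rate map (bushels/acre) could be aggregated into a field total, and how a field-level MAPE is computed. The helper names `field_yield` and `mape`, the `acres_per_pixel` parameter, and all numbers are hypothetical; the abstract does not state the pixel footprint or the exact aggregation used.

```python
def field_yield(rate_map, acres_per_pixel):
    """Total bushels: sum over pixels of rate (bu/acre) times pixel area (acres)."""
    return sum(rate for row in rate_map for rate in row) * acres_per_pixel


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical 2x2 predicted rate map, each pixel covering 0.25 acre.
rate_map = [[180.0, 200.0],
            [160.0, 220.0]]
total = field_yield(rate_map, acres_per_pixel=0.25)

# Hypothetical measured vs. predicted totals for two fields.
err = mape(actual=[200.0, 100.0], predicted=[190.0, 105.0])
print(f"field total = {total:.1f} bu, MAPE = {err:.2f}%")
```

Because the total is an integral of the per-pixel rate, a model trained at pixel level yields both the spatial map and the field-level figure from the same output.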
ISBN: (print) 9798350349399
The proceedings contain 593 papers. The topics discussed include: MDBFUSION: a visible and infrared image fusion framework capable for motion deblurring; prune channel and distill: discriminative knowledge distillation for semantic segmentation; imbalanced data robust online continual learning based on evolving class aware memory selection and built-in contrastive representation learning; privacy-preserving visual cues communication for hearing-impaired people using deep learning; transformer-based clipped contrastive quantization learning for unsupervised image retrieval; attention enhancement with parallel groups for remote sensing object detection; cross-domain few-shot in-context learning for enhancing traffic sign recognition; and recurrent 3-D multi-level visual transformer for joint classification of heterogeneous 2-D and 3-D radiographic data.
With the development of generative adversarial networks, super-resolution techniques that reconstruct a high-resolution image from a low-resolution one have achieved excellent results. However, small, low-resolution images are widespread, such as images taken by a thermal camera or with a lens far from the target. Super-resolution of extremely small target images is a challenging problem, mainly because a small infrared target has few pixels and weak features. Current optimization methods for tiny targets are mainly based on multi-scale feature fusion or super-resolution enhancement. During training, the low-resolution images characterizing small targets are usually obtained by downsampling high-resolution images, so their style differs from that of tiny targets in actual detection applications, resulting in poor resolution. To solve this problem, we propose a new super-resolution network: Style Transformation Super-Resolution Generative Adversarial Network (STSRGAN). It contains two sub-networks: a style transformation GAN that converts the style of the image, and a super-resolution GAN. STSRGAN first transforms a blurry infrared small target into a clear target with a distribution similar to the training set. The resolution can then be increased to obtain a better enhancement effect. The discriminator distinguishes whether the input comes from the generator or from an actual image, which helps generate a better super-resolution image. We also produced an infrared Unmanned Aerial Vehicle (UAV) small target dataset with target sizes below 16 x 16 pixels. Experiments show that our method achieves better resolution enhancement of small IR targets and outperforms other methods.
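The training-pair construction that the abstract criticizes, where low-resolution inputs are synthesized by downsampling high-resolution images, can be sketched as follows. This is a minimal illustration that assumes block-average downsampling on a grayscale image stored as nested lists; the function name `downsample` and the averaging kernel are assumptions, since the abstract does not say which downsampling operator is used.

```python
def downsample(image, factor):
    """Block-average downsampling of a 2-D grayscale image (list of rows).

    Rows and columns that do not fill a complete block are dropped.
    """
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - h % factor, factor):
        row = []
        for j in range(0, w - w % factor, factor):
            block = [image[i + di][j + dj]
                     for di in range(factor)
                     for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Hypothetical 4x4 high-resolution patch and its 2x downsampled counterpart.
hr = [[0, 2, 4, 6],
      [2, 4, 6, 8],
      [8, 8, 0, 0],
      [8, 8, 0, 0]]
lr = downsample(hr, factor=2)
print(lr)
```

An image produced this way is a clean, smoothed version of the original, whereas a real distant infrared target also carries sensor blur and noise. That mismatch is the style gap the abstract describes.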
The proceedings contain 52 papers. The topics discussed include: improvement of remote sensing image target detection algorithm based on YOLO V5; A Study of Chan-Vese model with the introduction of edge information; real-time monitoring algorithm of muscle state based on sEMG signal; lane detection network with direction context; anomaly pixel detection via dual-branch uncertainty metrics; high precision license plate recognition algorithm in open scene; implementation and design of metro process quality inspection system based on image processing technology; the research on remote sensing image change detection based on deep learning; research on aircraft wheel hub pose detection method based on machine vision; lunar dome detection method based on few-shot object detection; and image enhancement algorithm of foggy sky with sky based on sky segmentation.
ISBN: (digital) 9781665469463
ISBN: (print) 9781665469463
Geospatial semantic segmentation on remote sensing images suffers from large intra-class variance in both foreground and background classes. First, foreground objects are tiny in remote sensing images and are represented by only a few pixels, which leads to large foreground intra-class variance and undermines the discrimination between foreground classes (an issue first considered in this work). Second, the background class contains complex context, which results in false alarms due to large background intra-class variance. To alleviate these two issues, we construct a sparse and complete latent structure via prototypes. In particular, to enhance the sparsity of the latent space, we design a prototypical contrastive learning scheme in which prototypes of the same category cluster together and prototypes of different categories are pushed far apart. We also strengthen the completeness of the latent space by modeling all foreground categories and the hardest (nearest) background objects. We further design a patch shuffle augmentation for remote sensing images with complicated contexts. Our augmentation encourages the semantic information of an object to be correlated only with the limited context within the patch that is specific to its category, which further reduces large intra-class variance. We conduct extensive evaluations on a large-scale remote sensing dataset, showing that our approach outperforms state-of-the-art methods by a large margin.
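One plausible reading of the patch shuffle augmentation is sketched below: the image is cut into non-overlapping square patches whose positions are randomly permuted, so an object keeps only the context inside its own patch. This is an assumption-laden illustration, not the authors' implementation. The function name `patch_shuffle`, the requirement that the image size be divisible by the patch size, and the use of a shared seed to keep the label mask aligned are all choices made for this example.

```python
import random


def patch_shuffle(image, patch, seed=None):
    """Randomly permute non-overlapping patch x patch blocks of a 2-D grid."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    coords = [(r, c) for r in range(h // patch) for c in range(w // patch)]
    shuffled = coords[:]
    random.Random(seed).shuffle(shuffled)  # seeded so image and mask can match
    out = [[None] * w for _ in range(h)]
    # Copy source block (sr, sc) into destination block (dr, dc).
    for (dr, dc), (sr, sc) in zip(coords, shuffled):
        for i in range(patch):
            for j in range(patch):
                out[dr * patch + i][dc * patch + j] = image[sr * patch + i][sc * patch + j]
    return out


# Hypothetical 4x4 image whose pixel values double as position ids.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
mask = [[value % 2 for value in row] for row in image]

aug_image = patch_shuffle(image, patch=2, seed=0)
aug_mask = patch_shuffle(mask, patch=2, seed=0)  # same seed, same permutation
```

Reusing the seed for the segmentation mask applies the identical permutation, so every pixel keeps its original label after the shuffle.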
Stereo estimation has made many advancements in recent years with the introduction of deep learning. However, the traditional supervised approach to deep learning requires the creation of accurate and plentiful ground-...
ISBN: (print) 9781665403696
Jose Manuel Bioucas-Dias was an outstanding expert in many different IEEE-related areas, including inverse problems in imaging, signal and image processing, pattern recognition, optimization, and remote sensing. He authored or co-authored more than 250 publications, including more than 100 journal papers (66 of which were published in IEEE journals) and over 200 peer-reviewed international conference papers and book chapters. His contributions have been extremely influential in many different fields, namely phase estimation and unwrapping, convex optimization, and Bayesian inference for imaging inverse problems, with a special emphasis on remote sensing, including synthetic aperture radar (SAR), hyperspectral unmixing, fusion, super-resolution, classification, and segmentation. In this paper, we provide an overview of his outstanding contributions to remote sensing image processing.