ISBN: (print) 9798350353006
This paper aims at achieving fine-grained building attribute segmentation in a cross-view scenario, i.e., using satellite and street-view image pairs. The main challenge lies in overcoming the significant perspective differences between street views and satellite views. In this work, we introduce SG-BEV, a novel approach for satellite-guided BEV fusion for cross-view semantic segmentation. To overcome the limitations of existing cross-view projection methods in capturing complete building facade features, we incorporate a Bird's Eye View (BEV) method to establish a spatially explicit mapping of street-view features. Moreover, we leverage the advantages of multiple perspectives by introducing a novel satellite-guided reprojection module, which mitigates the uneven feature distribution associated with traditional BEV methods. Our method demonstrates significant improvements on four cross-view datasets collected from multiple cities, including New York, San Francisco, and Boston. On average across these datasets, our method increases mIoU by 10.13% and 5.21% compared with the state-of-the-art satellite-based and cross-view methods, respectively. The code and datasets of this work will be released at https://***/yejy53/SG-BEV.
This paper deals with the problem of video-based face recognition. Facial recognition methods have made great progress, but video-based recognition, with its poor image quality, difficult lighting conditions, and real-time requirements, is still a difficult and unfinished task. This paper uses convolutional networks for the various stages of processing: for capturing and detecting a face, for constructing a feature vector, and finally for recognition. All algorithms are implemented and studied in the Matlab environment to simplify their further export to embedded applications.
With the development of remote sensing technology, remote sensing images of buildings are of great significance in urban planning, disaster response, and other directions. When we use a neural network containing batch...
Unmanned aerial vehicle remote sensing images suffer from problems such as arbitrary object orientation and dense arrangement of small targets, which makes horizontal box object detection difficult. To address these i...
Computational topology has shortened the time taken for image recognition while maintaining good accuracy, and has therefore boosted the performance of computer vision. This paper uses computational topology in different...
In recent years, the success of large-scale vision-language models (VLMs) such as CLIP has led to their increased usage in various computer vision tasks. These models enable zero-shot inference through carefully craft...
In recent years, the research on remote sensing image information output has developed rapidly, including not only remote sensing technology, image processing, pattern recognition, etc., but also land use, environment...
Graph Convolutional Network (GCN) has emerged as a new technique for hyperspectral image (HSI) classification. However, in current GCN-based methods, the graphs are usually constructed with manual effort and are thus separate from the classification task, which could limit the representation power of GCN. Moreover, the employed graphs often fail to encode the global contextual information in HSI. Hence, we propose a Multi-level Graph Learning Network (MGLN) for HSI classification, where the graph structural information at both local and global levels can be learned in an end-to-end fashion. First, MGLN employs an attention mechanism to adaptively characterize the spatial relevance among image regions. Then, localized feature representations are produced and further used to encode the global contextual information. Finally, the prediction is obtained with the help of both local and global contextual information. Experiments on three real-world hyperspectral datasets reveal the superiority of our MGLN when compared with the state-of-the-art methods. (c) 2022 Elsevier Ltd. All rights reserved.
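The idea of deriving a graph from attention over image regions, rather than building it by hand, can be illustrated with a minimal sketch. This is not the MGLN implementation: the function names `attention_adjacency` and `propagate` are hypothetical, the attention score is assumed to be a plain dot product, and there are no trainable parameters, whereas the abstract states that MGLN learns the graph end to end.

```python
import math
from typing import List

Matrix = List[List[float]]


def attention_adjacency(features: Matrix) -> Matrix:
    """Build a soft adjacency matrix from region features.

    Assumption: relevance between two regions is the dot product of their
    feature vectors, normalized per row with a softmax.
    """
    adjacency = []
    for f_i in features:
        scores = [sum(a * b for a, b in zip(f_i, f_j)) for f_j in features]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency


def propagate(adjacency: Matrix, features: Matrix) -> Matrix:
    """One propagation step: each region becomes an attention-weighted
    average of all region features (no learned weight matrix here)."""
    dim = len(features[0])
    return [
        [sum(w * f[d] for w, f in zip(row, features)) for d in range(dim)]
        for row in adjacency
    ]


# Three regions: the first two are similar, the third is different.
regions = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
adj = attention_adjacency(regions)
updated = propagate(adj, regions)
```

In this toy input, the first region attends equally to itself and its twin and less to the third region, so similar regions are linked more strongly without any hand-built graph.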
Poleward moving auroral forms (PMAFs) are a common dayside auroral phenomenon, and the study of PMAFs has important implications for the exploration of near-Earth space physical processes in the geosciences. In all-sky imager (ASI) image sequences, PMAFs show a tendency to move northward in the northern hemisphere. Therefore, this particular motion pattern can be used for PMAF recognition. Previous works on automatic recognition of PMAFs tend to rely on optical flow. However, both traditional and deep learning-based optical flow estimation methods are time- and memory-expensive. In view of the large number of auroral images generated every year, it is impractical to estimate the optical flow for all auroral data with limited computational resources. In this letter, a poleward-motion aware network (PA-Net) is proposed to extract the motion features directly from ASI images. PA-Net computes the correlation between each point in an image and the points in the poleward direction in the following image by means of a poleward-motion aware operation (PA-Operation), to verify whether the point under consideration has undergone poleward motion. In addition, a channel attention mechanism is applied to the features obtained by PA-Operation to suppress information less helpful for recognizing PMAFs. PA-Net achieves the best performance on the PMAF recognition dataset among commonly used action recognition models, validating the superiority of our approach. More importantly, the complicated optical flow estimation is avoided, making it possible to apply the proposed method to large-scale auroral data.
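The correlation step described above can be sketched as follows. This is only an illustration of the idea, not the PA-Operation from the letter: the function name `poleward_correlation`, the parameter `max_shift`, the use of a dot product as the correlation, and the convention that row 0 is the poleward edge of the image are all assumptions.

```python
from typing import List

# A frame is indexed as frame[row][col][channel].
Frame = List[List[List[float]]]


def poleward_correlation(curr: Frame, nxt: Frame, max_shift: int = 2) -> Frame:
    """Correlate each pixel of `curr` with poleward pixels of `nxt`.

    For pixel (r, c) and each shift k in 1..max_shift, the output holds the
    dot product between curr[r][c] and nxt[r - k][c].
    Assumption: poleward means a smaller row index. Positions that fall
    outside the image keep a correlation of 0.0.
    """
    rows, cols = len(curr), len(curr[0])
    out = [[[0.0] * max_shift for _ in range(cols)] for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for k in range(1, max_shift + 1):
                if r - k >= 0:
                    out[r][c][k - 1] = sum(
                        a * b for a, b in zip(curr[r][c], nxt[r - k][c])
                    )
    return out


# A single bright point moves one row poleward between two frames.
frame_t = [[[0.0]], [[0.0]], [[1.0]]]
frame_t1 = [[[0.0]], [[1.0]], [[0.0]]]
corr = poleward_correlation(frame_t, frame_t1, max_shift=2)
```

In this toy example the response for the bright point is high at a shift of one row and zero at a shift of two, which is the kind of cue that indicates poleward motion without computing a dense optical flow field.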
Urban subsurface infrastructures, e.g., pipelines and roads, are aging with the expansion of modern cities. Benefiting from its capability for nondestructive detection, ground penetrating radar (GPR) has been widely applied to the detection of underground objects and disasters, and GPR B-scan images are typically interpreted manually. Manual interpretation is highly subjective and uncertain, which inevitably results in missed detections. Meanwhile, the shortage of labeled images greatly impedes the automation of GPR-based underground disaster detection. Many data simulation techniques, e.g., forward modeling, have been used to augment images for training; however, the generated forward images are not similar enough to real B-scan data, which makes recognition a challenging task. To address this problem, we propose a novel B-scan image simulation method based on a generative adversarial network to generate synthetic images for training detection networks. Our network uses DenseNet as the backbone of the generator to extract image features, and a weighted total variation regularization term to regularize the loss function of the network. The comparison and ablation experiments verify that our network can generate simulated images with high similarity to real GPR B-scan images. We believe that this work contributes to the intelligent processing and analysis of GPR data and improves the efficiency of underground disaster detection.
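A weighted total variation term of the kind mentioned above can be sketched as follows. The abstract does not give the exact form used in the paper, so this assumes an anisotropic variant with forward differences and a per-pixel weight map; the name `weighted_total_variation` and the `weights` argument are hypothetical.

```python
from typing import List

Image = List[List[float]]


def weighted_total_variation(img: Image, weights: Image) -> float:
    """Sum over pixels of weights[r][c] * (|dx| + |dy|).

    dx and dy are forward differences to the right and downward neighbors.
    Assumption: differences that would cross the image border are zero.
    """
    rows, cols = len(img), len(img[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            dx = img[r][c + 1] - img[r][c] if c + 1 < cols else 0.0
            dy = img[r + 1][c] - img[r][c] if r + 1 < rows else 0.0
            total += weights[r][c] * (abs(dx) + abs(dy))
    return total


# A vertical edge between two columns, with uniform and non-uniform weights.
edge = [[0.0, 1.0], [0.0, 1.0]]
tv_uniform = weighted_total_variation(edge, [[1.0, 1.0], [1.0, 1.0]])
tv_weighted = weighted_total_variation(edge, [[2.0, 1.0], [0.5, 1.0]])
```

Added to a generator loss with a scaling coefficient, such a term penalizes intensity changes more heavily where the weight is large, which is one way to control smoothness in generated images.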