ISBN: (print) 9789819784929; 9789819784936
Fusion of a low-resolution hyperspectral image (LR-HSI) and a high-resolution multispectral image (HR-MSI) has become an effective technique for HSI super-resolution. Deep learning based fusion methods have achieved significant success in this field. However, they often show limited ability to capture the complex spatial and spectral information of HSIs, resulting in the loss of details. In this paper, we develop a novel multi-scale feature fusion based network (MSFNet) for HSI super-resolution, which consists of a multi-scale feature extraction block and a multi-scale feature fusion block. In the former block, we fully consider the spatial and spectral correlations and develop two modules, i.e., global-local attention and channel self-attention, to capture the complex structure of HSIs at different scales. In the fusion stage, we adopt a U-Net-like architecture to gradually fuse the extracted multi-scale features, resulting in restored HSIs at different scales. We also develop a new loss function that trains the proposed network by minimizing the restoration errors at different scales in both the raw domain and the frequency domain, which helps preserve high-frequency details. Our experimental results demonstrate that the proposed model outperforms state-of-the-art methods.
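The abstract describes a loss that sums restoration errors over several scales in both the raw and the frequency domain, but it does not give the formula. The following is only a minimal sketch of that idea, assuming an L1 error in each domain, a 2-D FFT over the spatial axes, and a hypothetical weight `freq_weight`; the function name and all parameters are illustrative and not taken from the paper.

```python
import numpy as np

def multiscale_raw_freq_loss(preds, targets, freq_weight=0.1):
    """Sum, over scales, of an L1 error in the raw domain plus a weighted
    L1 error between 2-D Fourier spectra (freq_weight is an assumption)."""
    total = 0.0
    for pred, target in zip(preds, targets):
        # Raw-domain error: mean absolute difference of the image cubes.
        raw_err = np.mean(np.abs(pred - target))
        # Frequency-domain error: FFT over the two spatial axes (H, W)
        # of a (bands, H, W) cube, then mean magnitude of the difference.
        pred_f = np.fft.fft2(pred, axes=(-2, -1))
        target_f = np.fft.fft2(target, axes=(-2, -1))
        freq_err = np.mean(np.abs(pred_f - target_f))
        total += raw_err + freq_weight * freq_err
    return float(total)

# Toy data: three scales of a 4-band cube (shapes are illustrative only).
rng = np.random.default_rng(0)
targets = [rng.random((4, s, s)) for s in (8, 16, 32)]
noisy = [t + 0.05 * rng.standard_normal(t.shape) for t in targets]

loss_same = multiscale_raw_freq_loss(targets, targets)
loss_noisy = multiscale_raw_freq_loss(noisy, targets)
```

In this sketch a perfect reconstruction gives a loss of zero, and any perturbation raises both the raw and the spectral term; how the actual MSFNet loss weights the two domains is not stated in the abstract.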
ISBN: (print) 9798350353006
This paper aims to achieve fine-grained building attribute segmentation in a cross-view scenario, i.e., using satellite and street-view image pairs. The main challenge lies in overcoming the significant perspective differences between street views and satellite views. In this work, we introduce SG-BEV, a novel approach for satellite-guided BEV fusion for cross-view semantic segmentation. To overcome the limitations of existing cross-view projection methods in capturing complete building facade features, we incorporate a Bird's Eye View (BEV) method to establish a spatially explicit mapping of street-view features. Moreover, we fully leverage the advantages of multiple perspectives by introducing a novel satellite-guided reprojection module, which mitigates the uneven feature distribution associated with traditional BEV methods. Our method demonstrates significant improvements on four cross-view datasets collected from multiple cities, including New York, San Francisco, and Boston. On average across these datasets, our method increases mIoU by 10.13% and 5.21% compared with the state-of-the-art satellite-based and cross-view methods, respectively. The code and datasets of this work will be released at https: //***/yejy53/SG-BEV.
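The gains above are reported in mean intersection over union (mIoU). As a reminder of what that metric measures, here is a minimal sketch of the standard definition on flattened label lists; it is a generic illustration, not the authors' evaluation code, and the function name and toy labels are hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps (intersection) or in either (union).
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example with three classes on six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

Skipping classes that appear in neither map is one common convention; other implementations average over all classes regardless.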
This paper deals with the problem of video-based face recognition. Facial recognition methods have made a big step forward, but video-based recognition, with its poor image quality, difficult lighting conditions, and real-time requirements, is still difficult and unresolved. The paper uses convolutional networks for the various stages of processing: capturing and detecting a face, constructing a feature vector, and finally recognition. All algorithms are implemented and studied in the Matlab environment to simplify their further export to embedded applications.
With the development of remote sensing technology, remote sensing images of buildings are of great significance in urban planning, disaster response, and other directions. When we use a neural network containing batch...
Wind velocities approximated via velocity azimuth display (VAD) have been found to be contaminated by birds and enabled by insects. However, the widely used VAD wind profile (VWP) does not account for taxa, largely due to the challenge of distinguishing bird from insect echoes. This problem has been addressed by the recently developed bird-insect ridge classifier (BIRC). Hence, this work proposes an intelligent VAD (IVAD) that leverages BIRC to improve clear-air wind estimates by generating three new products: the insects-birds ratio, bird-only VAD, and insect-only VAD. These products are analyzed for one-month periods containing nocturnal and diurnal bird migration. Wind bias is used as the evaluation metric, defined as the deviation of the predicted VAD from reference wind measurements obtained from the rapid refresh (RAP) model. Results show an inverse relationship between biases and the insects-birds ratio, such that an increasing bird population (decreasing insect population) was accompanied by larger biases. Furthermore, contaminated VADs showed improvements when insect-only signals were used instead of all biological echoes. We recommend that these products be incorporated into the VWP. First, the insects-birds ratio can be used to identify whether a given height is bird dominated, mixed, or insect dominated. For the mixed case, improved wind estimates can be obtained from insect-only VAD. Otherwise, bird-only VADs can be obtained from bird-dominated heights, while insect-only VADs are obtained from insect-dominated heights. The former can be used to track birds while the latter tracks insects and the wind.
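The recommendation at the end of the abstract amounts to a per-height decision rule driven by the insects-birds ratio. A minimal sketch of that rule follows; the abstract gives no thresholds, so `low` and `high` are purely hypothetical placeholders, the ratio is assumed to be insect echoes divided by bird echoes, and the function names are illustrative.

```python
def select_vad_product(insects_birds_ratio, low=0.5, high=2.0):
    """Classify a height level by its insects-birds ratio and choose a VAD product.

    The thresholds `low` and `high` are hypothetical; the abstract does not
    specify how the three regimes are delimited.
    """
    if insects_birds_ratio < low:
        # Few insects relative to birds: use the bird-only VAD to track birds.
        return ("bird dominated", "bird-only VAD")
    if insects_birds_ratio > high:
        # Mostly insects: the insect-only VAD tracks insects and the wind.
        return ("insect dominated", "insect-only VAD")
    # Mixed case: the abstract suggests insect-only VAD for better wind estimates.
    return ("mixed", "insect-only VAD")

def wind_bias(vad_wind, reference_wind):
    """Deviation of a VAD wind estimate from a reference value (e.g. from RAP)."""
    return vad_wind - reference_wind

regimes = [select_vad_product(r) for r in (0.1, 1.0, 5.0)]
```

The `wind_bias` helper simply restates the evaluation metric described in the abstract as a signed difference; whether the study uses a signed or absolute deviation is not stated.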
Unmanned aerial vehicle remote sensing images suffer from problems such as arbitrary object orientation and dense arrangement of small targets, which makes horizontal box object detection difficult. To address these i...
Computational topology has consequently shortened the time taken for image recognition with good accuracy and has therefore boosted the performance of computer vision. This paper uses computational topology in different...
In recent years, the success of large-scale vision-language models (VLMs) such as CLIP has led to their increased usage in various computer vision tasks. These models enable zero-shot inference through carefully craft...
In recent years, the research on remote sensing image information output has developed rapidly, including not only remote sensing technology, image processing, pattern recognition, etc., but also land use, environment...
Perceptual image hashing is pivotal in various image processing applications, including image authentication, content-based image retrieval, tampered image detection, and copyright protection. This paper proposes a no...