Search Results - Inner Mongolia University Library

7th Chinese Conference on Pattern Recognition and Computer Vision

Authors: Dong, Shuai; Huang, Shaoguang; Zhang, Jinhan; Zhang, Hongyan. Affiliation: China Univ Geosci Sch Comp Sci Wuhan Peoples R China

ISBN: (print) 9789819784929; 9789819784936

Fusion of a low-resolution hyperspectral image (LR-HSI) and a high-resolution multispectral image (HR-MSI) has become an effective technique for HSI super-resolution. Deep learning based fusion methods have achieved significant success in this field. However, they often show limited ability to capture the complex spatial and spectral information of HSI, resulting in the loss of details. In this paper, we develop a novel multi-scale feature fusion based network (MSFNet) for HSI super-resolution, which consists of a multi-scale feature extraction block and a multi-scale feature fusion block. In the former block, we fully consider the spatial and spectral correlations and develop two modules, i.e., global-local attention and channel self-attention, to capture the complex structure of HSI at different scales. In the fusion stage, we adopt a U-Net like architecture to gradually fuse the extracted multi-scale features, resulting in restored HSIs at different scales. We also develop a new loss function to train the proposed neural network by minimizing the restoration errors at different scales in both the raw domain and the frequency domain, which helps preserve the high-frequency details. Our experimental results demonstrate that the proposed model outperforms the state-of-the-art.

Keywords: Hyperspectral remote sensing Multispectral Deep learning Fusion

SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ye, Junyan; Luo, Qiyan; Yu, Jinhua; Zhong, Huaping; Zheng, Zhimeng; He, Conghui; Li, Weijia. Affiliations: Sun Yat Sen Univ Guangzhou Guangdong Peoples R China; Shanghai AI Lab Shanghai Peoples R China; SenseTime Res Shanghai Peoples R China; Zhejiang Univ Hangzhou Zhejiang Peoples R China

ISBN: (print) 9798350353006

This paper aims at achieving fine-grained building attribute segmentation in a cross-view scenario, i.e., using satellite and street-view image pairs. The main challenge lies in overcoming the significant perspective differences between street views and satellite views. In this work, we introduce SG-BEV, a novel approach for satellite-guided BEV fusion for cross-view semantic segmentation. To overcome the limitations of existing cross-view projection methods in capturing the complete building facade features, we incorporate a Bird's Eye View (BEV) method to establish a spatially explicit mapping of street-view features. Moreover, we fully leverage the advantages of multiple perspectives by introducing a novel satellite-guided reprojection module, which mitigates the uneven feature distribution associated with traditional BEV methods. Our method demonstrates significant improvements on four cross-view datasets collected from multiple cities, including New York, San Francisco, and Boston. On average across these datasets, our method achieves an increase in mIOU of 10.13% and 5.21% compared with the state-of-the-art satellite-based and cross-view methods, respectively. The code and datasets of this work will be released at https: //***/yejy53/SG-BEV.

Keywords: BEV Fusion Cross-View remote sensing Semantic Segmentation

Real-Time Person Identification by Video Image Based on YOLOv2 and VGG 16 Networks

Automation and Remote Control, 2022, vol. 83, no. 10, pp. 1567-1575

Authors: Bobkov, A. V.; Aung, Kh. Affiliation: Bauman Moscow State Tech Univ Moscow 105005 Russia

This paper deals with the problem of video-based face recognition. Nowadays, facial recognition methods have made a big step forward, but video-based recognition with its poor quality, difficult lighting conditions, and real-time requirements is still a difficult and unfinished *** paper uses the apparatus of convolutional networks for various stages of processing: for capturing and detecting a face, for constructing a feature vector, and finally for recognition. All algorithms are implemented and studied in the Matlab environment to simplify their further export to embedded applications.

Keywords: VGG16 convolutional neural network face recognition YOLOv2 object detection algorithm deep learning face database

Remote Sensing Images Semantic Segmentation Method Based on Improved Nested UNet

2022 International Conference on Geographic Information and Remote Sensing Technology, GIRST 2022

Authors: Li, Zhongyu; Liu, Yang; Kuang, Yin; Wang, Huajun; Liu, Cheng. Affiliations: College of Geophysics Chengdu University of Technology Chengdu 610059 China; College of Computer Science Chengdu Normal University Chengdu 611130 China; Key Laboratory of Interior Layout Optimization and Security Institutions of Higher Education of Sichuan Province Sichuan Chengdu 611130 China; Key Laboratory of Pattern Recognition and Intelligent Information Processing of Sichuan Chengdu University Chengdu 610106 China; Artificial Intelligence Key Laboratory of Sichuan Province Zigong 643000 China

ISBN: (print) 9781510662186

With the development of remote sensing technology, remote sensing images of buildings are of great significance in urban planning, disaster response, and other directions. When we use a neural network containing batch normalization layers for semantic segmentation, the neural network is sensitive to batch size and has low segmentation accuracy for occluded and dense buildings. This paper proposes a method for building segmentation in remote sensing images based on Nested UNet (UNet++) deep neural network. First, the UNet++ network is used to extract features, and the Group Normalization (GN) method is used instead of Batch Normalization (BN) to alleviate the model's sensitivity to batch size. Then the weighted combination of Cross-Entropy Loss (CELoss) and DiceLoss is used as the loss function to improve the feature extraction ability of the neural network for unbalanced buildings. Finally, experiments are carried out on the WHUBuilding dataset. The experimental results show that the improved model (UNet++-GN) improves Mean Intersection over Union (MIoU) and Mean Pixel Accuracy (Macc) by 12.16% and 2.92%, respectively, compared with the original model (UNet++-BN). © 2023 SPIE.
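
The weighted combination of cross-entropy and Dice loss mentioned in this abstract can be illustrated with a minimal pure-Python sketch for the binary (building vs. background) case. The weights `w_ce` and `w_dice`, the smoothing constant, and all function names are assumptions for illustration; the abstract does not state the values or the exact formulation the authors used.

```python
import math

def cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy; probs are predicted foreground probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2*overlap + smooth) / (sum(pred) + sum(truth) + smooth)."""
    overlap = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)

def combined_loss(probs, labels, w_ce=0.5, w_dice=0.5):
    """Weighted sum of the two losses; the 0.5/0.5 weights are hypothetical."""
    return w_ce * cross_entropy(probs, labels) + w_dice * dice_loss(probs, labels)

if __name__ == "__main__":
    truth = [1, 0, 1, 0]
    print(combined_loss([0.9, 0.1, 0.8, 0.2], truth))  # small loss for a good prediction
    print(combined_loss([0.1, 0.9, 0.2, 0.8], truth))  # larger loss for a poor prediction
```

The Dice term depends on the overlap between prediction and ground truth rather than on per-pixel counts, which is why such combinations are commonly used when foreground pixels are a small share of the image.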

Keywords: Feature extraction

Intelligent Velocity Azimuth Display (IVAD) Wind Estimation in Clear-Air With S-Band Polarimetric Weather Radars

IEEE Transactions on Geoscience and Remote Sensing, 2025, vol. 63

Authors: Jatau, Precious; Yu, Tian-You; Melnikov, Valery; Kelly, Jeffrey. Affiliations: Univ Oklahoma Adv Radar Res Ctr ARRC Norman OK 73019 USA; Univ Oklahoma Sch Elect & Comp Engn Norman OK 73019 USA; Univ Oklahoma Cooperat Inst Severe & High Impact Weather Res & O Natl Inst Risk & Resilience Norman OK 73072 USA; Univ Oklahoma Sch Biol Sci Norman OK 73019 USA

Wind velocities approximated via velocity azimuth display (VAD) have been found to be contaminated by birds and enabled by insects. However, the widely used VAD wind profile (VWP) does not account for taxa, largely due to the challenge of distinguishing bird from insect echoes. This problem has been addressed by the recently developed bird-insect ridge classifier (BIRC). Hence, this work proposes an intelligent VAD (IVAD) that leverages BIRC to improve clear-air wind estimates by generating three new products, including the insects-birds ratio, bird-only VAD, and insect-only VAD. These products are analyzed for one-month periods containing nocturnal and diurnal bird migration. Wind bias is used as the evaluation metric, defined as the deviation of the predicted VAD from reference wind measurements obtained from the rapid refresh (RAP) model. Results show an inverse relationship between biases and the insects-birds ratio, such that increasing (decreasing) bird (insect) population was accompanied by larger biases. Furthermore, contaminated VADs showed improvements when insects-only signals were used instead of all biological echoes. We recommend that these products can be incorporated into the VWP. First, the insects-birds ratio can be used to identify whether a given height is bird dominated, mixed, or insect dominated. For the mixed case, improved wind estimates can be obtained from insect-only VAD. Otherwise, bird-only VADs can be obtained from bird-dominated heights, while insect-only VADs are obtained from insect-dominated heights. The former can be used to track birds while the latter tracks insects and the wind.
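
As background for the VAD products described in this abstract, the following is a minimal sketch of a basic VAD fit, plus an insect-only variant that simply discards gates not labelled as insects. It assumes the common simplified model v_r = cos(el) * (u * sin(az) + v * cos(az)), with azimuth measured clockwise from north, radial velocity positive away from the radar, and vertical motion neglected. The `labels` input stands in for a classifier output such as BIRC, which is not reproduced here; the function names and label strings are hypothetical.

```python
import math

def vad_fit(azimuths_deg, radial_velocities, elevation_deg):
    """Least-squares fit of (u, v) to radial velocities on one range ring."""
    s_ss = s_sc = s_cc = s_vs = s_vc = 0.0
    for az, vr in zip(azimuths_deg, radial_velocities):
        s = math.sin(math.radians(az))
        c = math.cos(math.radians(az))
        s_ss += s * s
        s_sc += s * c
        s_cc += c * c
        s_vs += vr * s
        s_vc += vr * c
    det = s_ss * s_cc - s_sc * s_sc
    if abs(det) < 1e-12:
        raise ValueError("azimuth coverage too poor for a VAD fit")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (s_vs * s_cc - s_vc * s_sc) / det
    b = (s_vc * s_ss - s_vs * s_sc) / det
    cos_el = math.cos(math.radians(elevation_deg))
    return a / cos_el, b / cos_el  # (u eastward, v northward)

def insect_only_vad(azimuths_deg, radial_velocities, labels, elevation_deg):
    """Fit only the gates whose (hypothetical) label is 'insect'."""
    kept = [(az, vr) for az, vr, lab in
            zip(azimuths_deg, radial_velocities, labels) if lab == "insect"]
    if not kept:
        raise ValueError("no insect-labelled gates")
    az, vr = zip(*kept)
    return vad_fit(az, vr, elevation_deg)

if __name__ == "__main__":
    az = list(range(0, 360, 10))
    el = 0.5
    vr = [math.cos(math.radians(el)) *
          (5.0 * math.sin(math.radians(a)) - 3.0 * math.cos(math.radians(a)))
          for a in az]
    print(vad_fit(az, vr, el))  # approximately (5.0, -3.0)
```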

Keywords: Birds Wind Insects Meteorology Biology Logic gates Doppler effect Azimuth Meteorological radar Wind speed Artificial intelligence classification data quality control data science machine learning model visualization and interpretation pattern recognition remote sensing velocity azimuth display weather radar signal processing

An Improved YOLOv8 Algorithm for Rotating Object Detection in Unmanned Aerial Vehicle Remote Sensing Images

7th International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2024

Authors: Song, Hongqing; Pu, Juntao; Zhang, Bin; Chen, Yu; Baima, Rongzhong; Zhu, Keyan; Ning, Jiajia. Affiliations: Civil Aviation Logistics Technology Co. Ltd Chengdu China; School of Information Engineering Southwest University of Science and Technology Mianyang China

ISBN: (print) 9798350350890

Unmanned aerial vehicle remote sensing images suffer from problems such as arbitrary object orientation and dense arrangement of small targets, which makes horizontal box object detection difficult. To address these issues, an improved YOLOv8 based remote sensing image rotating object detection algorithm (R-YOLOv8) is proposed. Firstly, the long edge definition method of the five-parameter method is used to represent the rotation box, which allows the bounding box to cover the small target at any angle. This can better fit the actual shape and arrangement of the target, which effectively solves the problem of accidental deletion caused by inaccurate horizontal box positioning. Then, ProbIOU Loss is introduced as the regression loss function to convert the rotation box into a two-dimensional Gaussian distribution, which solves the problem of angle periodicity caused by the representation of rotating targets. Finally, the BiFormer attention mechanism is introduced to improve the accuracy of target detection while reducing parameters and computational complexity. This paper conducts relevant experiments on the UAV-ROD dataset. Compared to classical networks and existing oriented object detection networks, the R-YOLOv8 algorithm performs better in the case of a model size of only 2.8 MB, mAP@0.5 reaches 98.7%, which verifies the effectiveness and progressiveness of the proposed method. © 2024 IEEE.
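
To make the five-parameter rotated box in this abstract concrete, here is a small sketch that converts a box given as (center x, center y, long edge, short edge, angle) into its four corner points. It assumes the angle is that of the long edge measured from the x-axis in degrees; the paper's exact angle range and sign convention are not stated in the abstract, and the function name is hypothetical.

```python
import math

def rotated_box_corners(cx, cy, long_edge, short_edge, theta_deg):
    """Return the four corners of a rotated box as (x, y) tuples."""
    t = math.radians(theta_deg)
    ux, uy = math.cos(t), math.sin(t)  # unit vector along the long edge
    vx, vy = -uy, ux                   # unit vector along the short edge
    hl, hs = long_edge / 2.0, short_edge / 2.0
    return [
        (cx + sl * hl * ux + ss * hs * vx, cy + sl * hl * uy + ss * hs * vy)
        for sl, ss in ((1, 1), (-1, 1), (-1, -1), (1, -1))
    ]

if __name__ == "__main__":
    print(rotated_box_corners(0.0, 0.0, 4.0, 2.0, 0.0))
    # Angles 30 and 210 degrees describe the same rectangle.
    a = {(round(x, 6), round(y, 6)) for x, y in rotated_box_corners(0, 0, 4, 2, 30)}
    b = {(round(x, 6), round(y, 6)) for x, y in rotated_box_corners(0, 0, 4, 2, 210)}
    print(a == b)
```

The second check shows why angle regression is periodic: two numerically distant angles can describe an identical box, which is the kind of ambiguity the abstract says the Gaussian-based ProbIOU loss is meant to avoid.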

Keywords: Unmanned aerial vehicles (UAV)

Applying Computational Topology for Enhanced Image Recognition and Computer Vision

2nd IEEE International Conference on Advances in Information Technology, ICAIT 2024

Authors: Siddhan, Saravanan; Moorthy, A.; Christy, S.; Sivasankar, C.; Bhaskar, K. Vijaya. Affiliations: Department of Computer Science and Engineering Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology Tamilnadu Chennai India; Saveetha School of Engineering SIMATS Saveetha University Thandalam Tamil Nadu Chennai India; Tamilnadu Chennai India

ISBN: (print) 9798350383867

Computational topology has shortened the time taken for image recognition while maintaining good accuracy, and has therefore boosted the performance of computer vision. This paper applies computational topology in different domains such as society, livestock, medicine, and remote sensing. The experiment reveals two primary outcomes. First, for tightly spaced pictures, topology increases accuracy by up to 5.0 percent compared with the conventional approach. Second, the edge processing of the topology-enhanced method slightly increases the calculation time, but yields a significant performance advantage. Additionally, edge, texture, shape, and pattern detection contribute to the overall classifier output, and the method's entity detection accuracy reaches that of prior methods. We develop and compare the two methods on the basis of accuracy, computational cost, scalability, and noise resistance. The results show that the improved technique benefits image identification and computer vision systems, making these technologies more mature and precise for real-world tasks. © 2024 IEEE.

Keywords: Machine vision

APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot Remote Sensing Image Generalization using CLIP

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Singha, Mainak; Jha, Ankit; Solanki, Bhupendra; Bose, Shirsha; Banerjee, Biplab. Affiliations: Indian Institute of Technology Bombay India; Technical University of Munich Germany

ISBN: (print) 9798350302493

In recent years, the success of large-scale vision-language models (VLMs) such as CLIP has led to their increased usage in various computer vision tasks. These models enable zero-shot inference through carefully crafted instructional text prompts without task-specific supervision. However, the potential of VLMs for generalization tasks in remote sensing (RS) has not been fully realized. To address this research gap, we propose a novel image-conditioned prompt learning strategy called the Visual Attention Parameterized Prompts Learning Network (APPLeNet). APPLeNet emphasizes the importance of multi-scale feature learning in RS scene classification and disentangles visual style and content primitives for domain generalization tasks. To achieve this, APPLeNet combines visual content features obtained from different layers of the vision encoder and style properties obtained from feature statistics of domain-specific batches. An attention-driven injection module is further introduced to generate visual tokens from this information. We also introduce an anti-correlation regularizer to ensure discrimination among the token embeddings, as this visual information is combined with the textual tokens. To validate APPLeNet, we curated four available RS benchmarks and introduced experimental protocols and datasets for three domain generalization tasks. Our results consistently outperform the relevant literature and code is available at https://***/mainaksingha01/APPLeNet © 2023 IEEE.

Keywords: Remote sensing

Realization Technology of Texture Feature Extraction Algorithm of Remote Sensing Satellite Image Based on FPGA

2nd IEEE International Conference on Data Science and Computer Application, ICDSCA 2022

Authors: Feng, Kai; Zhang, Rui. Affiliation: Shaanxi Academy of Aerospace Technology Application Co. Ltd. Xi'an China

ISBN: (digital) 9781665472005

ISBN: (print) 9781665472005

In recent years, research on remote sensing image information output has developed rapidly, covering not only remote sensing technology, image processing, and pattern recognition, but also land use, environmental monitoring, disaster forecasting, urban planning, and other fields. The purpose of this paper is to study the FPGA-based realization of a texture feature extraction algorithm for remote sensing satellite images. An FPGA-based hardware system implementation scheme is proposed. First, the overall structure of the system is established, and the software and hardware platforms are selected. Then, the algorithm is divided into two parts according to its difficulty, and the implementation plans based on hardware logic and on a soft-core processor are described in detail. An object modeling method based on Gabor texture blocks is introduced. This method is applied to the extraction of urban buildings from remote sensing images, and the final building extraction accuracy reaches 91.5%. © 2022 IEEE.

Keywords: Field programmable gate arrays (FPGA)

FG-PIH: A Fusion of Fresnelet Transform and Gradient Directional Pattern for Perceptual Image Hashing

2nd International Conference on Recent Trends in Microelectronics, Automation, Computing, and Communications Systems, ICMACC 2024

Authors: Meesala, Pavani; Thounaojam, Dalton Meitei. Affiliation: Computer Science and Engineering National Institute of Technology Silchar Computer Vision Laboratory Silchar India

ISBN: (print) 9798350366570

Perceptual image hashing is pivotal in various image processing applications, including image authentication, content-based image retrieval, tampered image detection, and copyright protection. This paper proposes a novel approach for perceptual image hashing by combining the Fresnelet Transform with Gradient Directional patterns. Using the FG-PIH technique, the proposed method achieves superior robustness against common image processing attacks while maintaining perceptual similarity for near-duplicate images. Experimental results on standard benchmark datasets demonstrate the effectiveness and efficiency of the proposed Fresnelet Transform-based perceptual image hashing scheme. Furthermore, comparative analysis against state-of-the-art methods underscores the competitiveness of our approach in terms of hash quality and computational complexity. © 2024 IEEE.
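
The abstract does not say how two hashes are compared, but the record's keyword is Hamming distance, the usual measure for binary perceptual hashes. The sketch below shows that comparison step only, on hashes written as bit strings. The 0.2 threshold and the function names are assumptions for illustration, and the Fresnelet and gradient-pattern feature extraction is not reproduced.

```python
def hamming_distance(hash_a, hash_b):
    """Count the bit positions at which two equal-length bit strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))

def is_near_duplicate(hash_a, hash_b, threshold=0.2):
    """Treat two images as perceptually similar when the normalized
    Hamming distance is at most the (hypothetical) threshold."""
    return hamming_distance(hash_a, hash_b) / len(hash_a) <= threshold

if __name__ == "__main__":
    original = "10110010"
    compressed = "10110110"  # one bit flipped, e.g. after mild compression
    different = "01001101"   # every bit differs
    print(hamming_distance(original, compressed))  # 1
    print(is_near_duplicate(original, compressed))  # True
    print(is_near_duplicate(original, different))   # False
```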

Keywords: Hamming distance
