Search Results - Inner Mongolia University Library

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ye, Junyan; Luo, Qiyan; Yu, Jinhua; Zhong, Huaping; Zheng, Zhimeng; He, Conghui; Li, Weijia

Affiliations: Sun Yat Sen Univ, Guangzhou, Guangdong, Peoples R China; Shanghai AI Lab, Shanghai, Peoples R China; SenseTime Res, Shanghai, Peoples R China; Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China

ISBN: (print) 9798350353006

This paper aims at achieving fine-grained building attribute segmentation in a cross-view scenario, i.e., using satellite and street-view image pairs. The main challenge lies in overcoming the significant perspective differences between street views and satellite views. In this work, we introduce SG-BEV, a novel approach for satellite-guided BEV fusion for cross-view semantic segmentation. To overcome the limitations of existing cross-view projection methods in capturing the complete building facade features, we innovatively incorporate a Bird's Eye View (BEV) method to establish a spatially explicit mapping of street-view features. Moreover, we fully leverage the advantages of multiple perspectives by introducing a novel satellite-guided reprojection module, optimizing the uneven feature distribution issues associated with traditional BEV methods. Our method demonstrates significant improvements on four cross-view datasets collected from multiple cities, including New York, San Francisco, and Boston. On average across these datasets, our method achieves an increase in mIOU by 10.13% and 5.21% compared with the state-of-the-art satellite-based and cross-view methods. The code and datasets of this work will be released at https://***/yejy53/SG-BEV.

Keywords: BEV Fusion; Cross-View; Remote Sensing; Semantic Segmentation
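The abstract above reports its gains in mIOU (mean intersection over union). As a reading aid, here is a minimal sketch of how that metric is commonly computed from flat label lists. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration, not details taken from the SG-BEV paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are equal-length sequences of integer class labels
    (e.g. a flattened segmentation map). Hypothetical helper for illustration.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: classes absent from both are skipped
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Averaging per-class IoU gives rare classes the same weight as frequent ones, which is one reason the metric is popular for fine-grained segmentation tasks.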


Real-Time Person Identification by Video Image Based on YOLOv2 and VGG 16 Networks


Automation and Remote Control, 2022, Vol. 83, No. 10, pp. 1567-1575

Authors: Bobkov, A. V.; Aung, Kh.

Affiliations: Bauman Moscow State Tech Univ, Moscow 105005, Russia

This paper deals with the problem of video-based face recognition. Nowadays, facial recognition methods have made a big step forward, but video-based recognition with its poor quality, difficult lighting conditions, and real-time requirements is still a difficult and unfinished *** paper uses the apparatus of convolutional networks for various stages of processing: for capturing and detecting a face, for constructing a feature vector, and finally for recognition. All algorithms are implemented and studied in the Matlab environment to simplify their further export to embedded applications.

Keywords: VGG16 convolutional neural network; face recognition; YOLOv2 object detection algorithm; deep learning; face database


Remote Sensing Images Semantic Segmentation Method Based on Improved Nested UNet


2022 International Conference on Geographic Information and Remote Sensing Technology, GIRST 2022

Authors: Li, Zhongyu; Liu, Yang; Kuang, Yin; Wang, Huajun; Liu, Cheng

Affiliations: College of Geophysics, Chengdu University of Technology, Chengdu 610059, China; College of Computer Science, Chengdu Normal University, Chengdu 611130, China; Key Laboratory of Interior Layout Optimization and Security, Institutions of Higher Education of Sichuan Province, Sichuan, Chengdu 611130, China; Key Laboratory of Pattern Recognition and Intelligent Information Processing of Sichuan, Chengdu University, Chengdu 610106, China; Artificial Intelligence Key Laboratory of Sichuan Province, Zigong 643000, China

ISBN: (print) 9781510662186

With the development of remote sensing technology, remote sensing images of buildings are of great significance in urban planning, disaster response, and other directions. When we use a neural network containing batch normalization layers for semantic segmentation, the neural network is sensitive to batch size and has low segmentation accuracy for occluded and dense buildings. This paper proposes a method for building segmentation in remote sensing images based on Nested UNet (UNet++) deep neural network. First, the UNet++ network is used to extract features, and the Group Normalization (GN) method is used instead of Batch Normalization (BN) to alleviate the model's sensitivity to batch size. Then the weighted combination of Cross-Entropy Loss (CELoss) and DiceLoss is used as the loss function to improve the feature extraction ability of the neural network for unbalanced buildings. Finally, experiments are carried out on the WHUBuilding dataset. The experimental results show that the improved model (UNet++-GN) improves Mean Intersection over Union (MIoU) and Mean Pixel Accuracy (Macc) by 12.16% and 2.92%, respectively, compared with the original model (UNet++-BN). © 2023 SPIE.

Keywords: Feature extraction
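The abstract above describes a loss formed as a weighted combination of cross-entropy and Dice loss. The sketch below shows one plausible form for binary (building vs. background) segmentation on flat probability lists. The mixing weight `alpha`, the smoothing constant, and all function names are assumptions; the abstract does not give the weights the authors used.

```python
import math


def cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy; probs are predicted foreground probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """1 - Dice coefficient, with a smoothing term for empty masks."""
    inter = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def combined_loss(probs, labels, alpha=0.5):
    """Weighted sum of the two losses; alpha is a hypothetical mixing weight."""
    return alpha * cross_entropy(probs, labels) + (1.0 - alpha) * dice_loss(probs, labels)


if __name__ == "__main__":
    print(combined_loss([0.9, 0.1, 0.8], [1, 0, 1]))
```

The Dice term depends on the overlap between prediction and mask rather than on per-pixel counts, which is why it is often paired with cross-entropy when foreground pixels are scarce.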


An Improved YOLOv8 Algorithm for Rotating Object Detection in Unmanned Aerial Vehicle Remote Sensing Images


7th International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2024

Authors: Song, Hongqing; Pu, Juntao; Zhang, Bin; Chen, Yu; Baima, Rongzhong; Zhu, Keyan; Ning, Jiajia

Affiliations: Civil Aviation Logistics Technology Co. Ltd, Chengdu, China; School of Information Engineering, Southwest University of Science and Technology, Mianyang, China

ISBN: (print) 9798350350890

Unmanned aerial vehicle remote sensing images suffer from problems such as arbitrary object orientation and dense arrangement of small targets, which makes horizontal box object detection difficult. To address these issues, an improved YOLOv8-based rotating object detection algorithm for remote sensing images (R-YOLOv8) is proposed. Firstly, the long edge definition method of the five-parameter method is used to represent the rotation box, which allows the bounding box to cover the small target at any angle. This can better fit the actual shape and arrangement of the target, which effectively solves the problem of accidental deletion caused by inaccurate horizontal box positioning. Then, ProbIOU Loss is introduced as the regression loss function to convert the rotation box into a two-dimensional Gaussian distribution, which solves the problem of angle periodicity caused by the representation of rotating targets. Finally, the BiFormer attention mechanism is introduced to improve the accuracy of target detection while reducing parameters and computational complexity. This paper conducts relevant experiments on the UAV-ROD dataset. Compared to classical networks and existing oriented object detection networks, the R-YOLOv8 algorithm performs better: with a model size of only 2.8 MB, mAP@0.5 reaches 98.7%, which verifies the effectiveness and progressiveness of the proposed method. © 2024 IEEE.

Keywords: Unmanned aerial vehicles (UAV)
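The abstract above represents rotated boxes with the long-edge definition of the five-parameter method. A minimal sketch of that normalization follows: a box (cx, cy, w, h, theta) is rewritten so that w is always the longer side and theta is the angle of that side. The angle range [-90, 90) degrees and the function name are assumptions; the abstract does not state which range R-YOLOv8 uses.

```python
def to_long_edge(cx, cy, w, h, theta_deg):
    """Normalize a rotated box so that w is the long edge.

    theta_deg is the angle (in degrees) of the side of length w relative
    to the x-axis. The returned angle lies in [-90, 90), an assumed range.
    """
    if w < h:
        # Swap the sides; the long edge is perpendicular to the old w side.
        w, h = h, w
        theta_deg += 90.0
    # A rectangle is unchanged by a 180-degree turn, so wrap into [-90, 90).
    theta_deg = (theta_deg + 90.0) % 180.0 - 90.0
    return cx, cy, w, h, theta_deg


if __name__ == "__main__":
    print(to_long_edge(10.0, 20.0, 2.0, 4.0, 30.0))
```

Because the wrap-around at the ends of the angle range still makes nearby boxes look far apart numerically, a plain regression loss on theta is discontinuous; that is the angle-periodicity problem the abstract says ProbIOU Loss is introduced to address.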


Applying Computational Topology for Enhanced Image Recognition and Computer Vision


2nd IEEE International Conference on Advances in Information Technology, ICAIT 2024

Authors: Siddhan, Saravanan; Moorthy, A.; Christy, S.; Sivasankar, C.; Bhaskar, K. Vijaya

Affiliations: Department of Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Tamilnadu, Chennai, India; Saveetha School of Engineering, SIMATS, Saveetha University, Thandalam, Tamil Nadu, Chennai, India; Tamilnadu, Chennai, India

ISBN: (print) 9798350383867

Computational topology has shortened the time taken for image recognition while maintaining good accuracy, and has thereby boosted the performance of computer vision. This paper applies computational topology in different domains, such as society, livestock, medicine, and remote sensing. The experiments reveal two primary outcomes. First, for tightly spaced pictures, topology increases accuracy by up to 5.0 percent compared with the conventional approach. Second, the topology-enhanced method slightly increases edge-processing computation time but offers a significant performance advantage. Additionally, edge, texture, shape, and pattern detection contribute to the overall classifier output, and the proposed method reaches an entity-detection accuracy comparable to that of prior methods. We develop and compare the two methods on the basis of accuracy, computational cost, scalability, and noise resistance. The results show that the improved technique benefits picture identification and computer vision systems, making the technologies more mature and precise for real-world tasks. © 2024 IEEE.

Keywords: Machine vision


APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot Remote Sensing Image Generalization using CLIP


2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Singha, Mainak; Jha, Ankit; Solanki, Bhupendra; Bose, Shirsha; Banerjee, Biplab

Affiliations: Indian Institute of Technology Bombay, India; Technical University of Munich, Germany

ISBN: (print) 9798350302493

In recent years, the success of large-scale vision-language models (VLMs) such as CLIP has led to their increased usage in various computer vision tasks. These models enable zero-shot inference through carefully crafted instructional text prompts without task-specific supervision. However, the potential of VLMs for generalization tasks in remote sensing (RS) has not been fully realized. To address this research gap, we propose a novel image-conditioned prompt learning strategy called the Visual Attention Parameterized Prompts Learning Network (APPLeNet). APPLeNet emphasizes the importance of multi-scale feature learning in RS scene classification and disentangles visual style and content primitives for domain generalization tasks. To achieve this, APPLeNet combines visual content features obtained from different layers of the vision encoder and style properties obtained from feature statistics of domain-specific batches. An attention-driven injection module is further introduced to generate visual tokens from this information. We also introduce an anti-correlation regularizer to ensure discrimination among the token embeddings, as this visual information is combined with the textual tokens. To validate APPLeNet, we curated four available RS benchmarks and introduced experimental protocols and datasets for three domain generalization tasks. Our results consistently outperform the relevant literature and code is available at https://***/mainaksingha01/APPLeNet © 2023 IEEE.

Keywords: Remote sensing


Realization Technology of Texture Feature Extraction Algorithm of Remote Sensing Satellite Image Based on FPGA


2nd IEEE International Conference on Data Science and Computer Application, ICDSCA 2022

Authors: Feng, Kai; Zhang, Rui

Affiliations: Shaanxi Academy of Aerospace Technology Application Co. Ltd., Xi'an, China

ISBN: (digital) 9781665472005

ISBN: (print) 9781665472005

In recent years, the research on remote sensing image information output has developed rapidly, including not only remote sensing technology, image processing, pattern recognition, etc., but also land use, environmental monitoring, disaster forecasting, urban planning and other fields. The purpose of this paper is to study the realization technology of remote sensing satellite image texture feature extraction algorithm based on FPGA. A hardware system implementation scheme based on FPGA is proposed. Firstly, the overall structure of the system is established, and the software and hardware platforms are selected. Then, according to the difficulty of the algorithm, it is proposed to divide the algorithm into two parts, and the implementation plans based on hardware logic and on a soft-core processor are described in detail. An object modeling method based on Gabor texture blocks is introduced. This method is applied to the extraction of urban buildings from remote sensing images, and the final building extraction accuracy reaches 91.5%. © 2022 IEEE.

Keywords: Field programmable gate arrays (FPGA)


Multi-level graph learning network for hyperspectral image classification


Pattern Recognition, 2022, Vol. 129

Authors: Wan, Sheng; Pan, Shirui; Zhong, Shengwei; Yang, Jie; Yang, Jian; Zhan, Yibing; Gong, Chen

Affiliations: Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, PCA Lab, Key Lab Intelligent Percept & Syst High Dimens Inf, Nanjing 210094, Jiangsu, Peoples R China; Monash Univ, Fac Informat Technol, Clayton, VIC 3800, Australia; Shanghai Jiao Tong Univ, Inst Image Proc & Pattern Recognit, Shanghai 200240, Peoples R China; JD Explore Acad, Beijing 100176, Peoples R China; Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Jiangsu Key Lab Image & Video Understanding Social, Nanjing 210094, Jiangsu, Peoples R China

Graph Convolutional Network (GCN) has emerged as a new technique for hyperspectral image (HSI) classification. However, in current GCN-based methods, the graphs are usually constructed with manual effort and thus are separate from the classification task, which could limit the representation power of GCN. Moreover, the employed graphs often fail to encode the global contextual information in HSI. Hence, we propose a Multi-level Graph Learning Network (MGLN) for HSI classification, where the graph structural information at both local and global levels can be learned in an end-to-end fashion. First, MGLN employs an attention mechanism to adaptively characterize the spatial relevance among image regions. Then localized feature representations can be produced and further used to encode the global contextual information. Finally, prediction can be acquired with the help of both local and global contextual information. Experiments on three real-world hyperspectral datasets reveal the superiority of our MGLN when compared with the state-of-the-art methods. (c) 2022 Elsevier Ltd. All rights reserved.

Keywords: Graph convolutional network; Graph-based machine learning; Hyperspectral image classification; Remote sensing; Graph structural learning


Poleward-Motion Aware Network for Poleward Moving Auroral Forms Recognition


IEEE Geoscience and Remote Sensing Letters, 2022, Vol. 19, p. 1

Authors: Tang, Yiping; Guo, Kaitai; Wei, Chen; Zheng, Yang; Ren, Shenghan; Liang, Jimin

Affiliations: Xidian Univ, Sch Elect Engn, Xian 710071, Shaanxi, Peoples R China; Xidian Univ, Sch Life Sci & Technol, Xian 710071, Shaanxi, Peoples R China

Poleward moving auroral forms (PMAFs) are a common dayside auroral phenomenon, and the study of PMAFs has important implications for the exploration of the near-earth space physical processes for geosciences. In the all-sky imager (ASI) image sequence, PMAFs show a tendency to move northward in the northern hemisphere. Therefore, this particular motion pattern can be used for PMAF recognition. Previous works for automatic recognition of PMAFs tend to rely on optical flow. However, both the traditional and the deep learning-based optical flow estimation methods are time- and memory-expensive. In view of the large number of auroral images generated every year, it is impractical to estimate the optical flow for all auroral data with limited computational resources. In this letter, a poleward-motion aware network (PA-Net) is proposed to extract the motion features directly from ASI images. PA-Net computes the correlation between each point in an image and the points at the poleward direction in the following image by means of a poleward-motion aware operation (PA-Operation), to verify whether the point under consideration has undergone poleward motion. In addition, a channel attention mechanism is applied to the features obtained by PA-Operation to suppress information less helpful for recognizing PMAFs. The PA-Net achieves the best performance on the PMAFs recognition dataset over other commonly used action recognition models, validating the superiority of our approach. More importantly, the complicated optical flow estimation is avoided, making it possible to apply the proposed method to large-scale auroral data.

Keywords: Optical imaging; Optical sensors; Estimation; Hidden Markov models; Magnetosphere; Ion radiation effects; Geoscience and remote sensing; Aurora; deep learning; poleward-motion aware network (PA-Net); poleward moving auroral forms (PMAFs)


Simulation of GPR B-Scan Data Based on Dense Generative Adversarial Network


IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023, Vol. 16, pp. 3938-3944

Authors: Wang, Bin; Chen, Peiyao; Zhang, Gong

Affiliations: Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China; Xidian Univ, Underground Space Res Inst, CNACG Underground Space Technol Dev Co Ltd, Xian 710071, Peoples R China

Urban subsurface infrastructures, e.g., pipelines and roads, are aging with the expansion of modern cities. Benefiting from the capability of nondestructive detection, ground penetrating radar (GPR) has been widely applied to underground objects or disasters detection, and GPR B-scan images are employed by manual interpretation. This way of high subjectivity and uncertainty inevitably results in failure of detection. Meanwhile, the shortage of labeled images greatly impedes the automatization and intelligentization of underground disaster detection based on GPR. Many data simulation techniques, e.g., forward modeling, were used to augment images for training; however, the generated forward images were not similar enough to the real B-scan data, which makes recognition a challenging task. To address this problem, we proposed a novel B-scan image simulation method based on a generative adversarial network to generate synthetic images for training detection networks. Our network utilizes DenseNet as the backbone network of the generator to extract image features, and a weighted total variation regularization term to regularize the loss function of the network. The comparison and ablation experiments verified that our network could generate simulation images with high similarity to real GPR B-scan images. We believe that this work contributes to the intelligent processing and analysis of GPR data and improves the efficiency of underground disaster detection.

Keywords: Generative adversarial networks; Feature extraction; Image edge detection; Convolution; Generators; Buried object detection; Urban areas; Data augmentation; generative adversarial network (GAN); ground penetrating radar (GPR); weighted total variation (w-TV)
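The abstract above regularizes the generator loss with a weighted total variation (w-TV) term. The sketch below shows a generic anisotropic form: the sum of absolute differences between neighboring pixels, scaled by a per-pixel weight. The weighting scheme, the anisotropic form, and the function name are assumptions; the abstract does not say how the authors define their weights.

```python
def weighted_tv(img, weights=None):
    """Anisotropic total variation of a 2D image given as a list of rows.

    weights, if provided, is a same-shaped grid of per-pixel weights
    (hypothetical scheme); None means every pixel has weight 1.
    """
    rows, cols = len(img), len(img[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            w = 1.0 if weights is None else weights[i][j]
            # Forward differences; zero at the right and bottom borders.
            dx = abs(img[i][j + 1] - img[i][j]) if j + 1 < cols else 0.0
            dy = abs(img[i + 1][j] - img[i][j]) if i + 1 < rows else 0.0
            total += w * (dx + dy)
    return total


if __name__ == "__main__":
    image = [[0.0, 1.0], [0.0, 1.0]]
    print(weighted_tv(image))
```

Adding such a term to a generator loss penalizes high-frequency noise in the synthetic image, and the weights allow the penalty to be relaxed where sharp transitions are expected.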

