ISBN: 9798400707032 (print)
The proceedings contain 103 papers. The topics discussed include: compressed sampling of LFM signals based on fractional Fourier domain; atmospheric correction of high resolution remote sensing images with automatic data acquisition by network; research on optimization of adaptive cache replacement algorithm strategy; the novel method about correlation calculation based on an inequation; research on the implementation of accurate delay algorithm based on DSP6000 series assembly language; an adaptive knowledge graph construction method for semi-structured data; research on the implementation and application of improved Canny algorithm for image edge detection based on FPGA; research on computer-generated brand advertising design and optimization; a maximum or minimum way of forming 2-D histogram for image optimal multiple value segmentation; and enhancing unsupervised few-shot medical image classification with weight-enhanced contrastive learning.
In the field of remote sensing, panchromatic sharpening technology integrates spatial data from panchromatic images with spectral data from multispectral images to generate high-resolution multispectral *** mapping fr...
ISBN: 9798350365474 (print)
We propose SAM-Road, an adaptation of the Segment Anything Model (SAM) [27] for extracting large-scale, vectorized road network graphs from satellite imagery. To predict graph geometry, we formulate it as a dense semantic segmentation task, leveraging the inherent strengths of SAM. The image encoder of SAM is fine-tuned to produce probability masks for roads and intersections, from which the graph vertices are extracted via simple non-maximum suppression. To predict graph topology, we design a lightweight transformer-based graph neural network, which leverages the SAM image embeddings to estimate the edge existence probabilities between vertices. Our approach directly predicts the graph vertices and edges for large regions without expensive and complex post-processing heuristics, and it can build complete road network graphs spanning multiple square kilometers in a matter of seconds. With its simple, straightforward, and minimalist design, SAM-Road achieves accuracy comparable to the state-of-the-art method RNGDet++ [57], while being 40 times faster on the City-scale dataset. We thus demonstrate the power of a foundational vision model when applied to a graph learning task. The code is available at https://***/htcr/sam_road.
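The vertex-extraction step described in the abstract (non-maximum suppression over a probability mask) can be illustrated with a minimal sketch. This is not the SAM-Road implementation: the function name `extract_vertices`, the `threshold` and `radius` parameters, and the toy mask are all assumptions made for illustration, and a greedy distance-based suppression is used as one common form of point NMS.

```python
from typing import List, Sequence, Tuple


def extract_vertices(
    prob_mask: Sequence[Sequence[float]],
    threshold: float = 0.5,   # hypothetical score cutoff
    radius: float = 2.0,      # hypothetical suppression radius, in pixels
) -> List[Tuple[int, int]]:
    """Greedy point NMS: keep the strongest pixels, drop weaker close neighbors."""
    # Collect every pixel whose probability passes the threshold.
    candidates = [
        (score, r, c)
        for r, row in enumerate(prob_mask)
        for c, score in enumerate(row)
        if score >= threshold
    ]
    # Visit highest scores first; break ties by position so the result is deterministic.
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))

    kept: List[Tuple[int, int]] = []
    for _score, r, c in candidates:
        # Keep a pixel only if it lies farther than `radius` from all kept vertices.
        if all((r - kr) ** 2 + (c - kc) ** 2 > radius ** 2 for kr, kc in kept):
            kept.append((r, c))
    return kept


# Toy 5x5 "intersection probability" mask with two separate peaks.
mask = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.6, 0.0, 0.0],
    [0.1, 0.6, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.7, 0.0],
]
vertices = extract_vertices(mask)
print(vertices)  # [(1, 1), (3, 4)]
```

In this toy case the weaker responses next to each peak are suppressed, leaving one vertex per peak. A real pipeline would operate on full-resolution masks and would likely use a vectorized or spatially indexed variant.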
ISBN: 9798350301298 (print)
In this paper we introduce the Temporo-Spatial Vision Transformer (TSViT), a fully-attentional model for general Satellite Image Time Series (SITS) processing based on the Vision Transformer (ViT). TSViT splits a SITS record into non-overlapping patches in space and time, which are tokenized and subsequently processed by a factorized temporo-spatial encoder. We argue that, in contrast to natural images, a temporal-then-spatial factorization is more intuitive for SITS processing, and we present experimental evidence for this claim. Additionally, we enhance the model's discriminative power by introducing two novel mechanisms for acquisition-time-specific temporal positional encodings and multiple learnable class tokens. The effect of all novel design choices is evaluated through an extensive ablation study. Our proposed architecture achieves state-of-the-art performance, surpassing previous approaches by a significant margin on three publicly available SITS semantic segmentation and classification datasets. All model, training and evaluation code can be found at https://***/michaeltrs/DeepSatModels.
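The patch-splitting step mentioned in the abstract can be sketched as a reshape over a time series array. This is only an illustration, not the TSViT code: the function name `tokenize_sits`, the `(T, H, W, C)` array layout, and the patch sizes are assumptions, and the learned linear projection, positional encodings, and encoder are omitted.

```python
import numpy as np


def tokenize_sits(x: np.ndarray, t: int, h: int, w: int) -> np.ndarray:
    """Split a (T, H, W, C) series into non-overlapping t x h x w patches.

    Returns an array of shape (T//t, H//h, W//w, t*h*w*C), one flattened
    token per space-time patch.
    """
    T, H, W, C = x.shape
    if T % t or H % h or W % w:
        raise ValueError("patch sizes must divide the series dimensions")
    # Expose the patch grid and the within-patch offsets as separate axes.
    x = x.reshape(T // t, t, H // h, h, W // w, w, C)
    # Bring the grid axes to the front and the within-patch axes to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # Flatten each patch into a single token vector.
    return x.reshape(T // t, H // h, W // w, t * h * w * C)


# Toy record: 4 acquisitions of a 4x4 image with 2 channels.
x = np.arange(4 * 4 * 4 * 2).reshape(4, 4, 4, 2)
tokens = tokenize_sits(x, t=1, h=2, w=2)
print(tokens.shape)  # (4, 2, 2, 8)
```

With the tokens laid out as (time, row, column, features), a temporal-then-spatial factorization would first attend along axis 0 for each spatial location and then across the spatial grid, which is the ordering the abstract argues for.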
ISBN: 9789819784929; 9789819784936 (print)
Hyperspectral image (HSI) change detection is a key research topic in the field of remote sensing. Existing HSI change detection methods often overlook the potential interactions among training samples. To address this issue, we develop a novel HSI change detection network, the Cross-Sample Slot attention-based Network (CSSNet). Building on the slot attention mechanism, this network can explicitly distinguish between changed and unchanged region representations and disentangle these representations into multiple independent concepts. These concepts are instrumental in capturing the uniformity and diversity of the representations among different samples during batch processing. Furthermore, we introduce a Dual Gated Feed-forward Network (DGFN) to effectively filter out redundant and irrelevant information. Experimental results on two different HSI datasets demonstrate that CSSNet outperforms several existing mainstream methods.
As a popular task in remote sensing, road extraction has attracted wide attention from researchers and has been widely applied, especially using deep learning methods. However, many methods ignore the properties of roads in remote sens...
As remote sensing image information rapidly grows more abundant, detecting tiny, densely distributed targets becomes a challenge. Therefore, a multi-scale rotating object detection model based on the impr...
ISBN: 9781510673892; 9781510673885 (print)
Machine learning algorithms are capable of processing image-based scenes, detecting and recognizing embedded targets. This has been demonstrated by data scientists and computer vision engineers, but performant algorithms must be robustly trained to successfully complete such a complex task. This typically requires a large set of training data on which the algorithm can base statistical predictions. Electro-optical infrared (EO/IR) remote sensing applications necessitate a substantial image database with suitable variation for adept learning to occur. For human detection/recognition applications, diversity in clothing ensembles, pose, season, times of day, sensor platform perspectives, scene backgrounds and weather conditions can be included in training image sets to ensure sufficient input variety. However, acquiring such a diverse image set from measured sources can be a challenge, especially in thermal infrared wavebands (e.g., MWIR and LWIR). Alternatively, generating synthetic imagery with appropriate features is possible and has been shown to perform well, but a careful methodology must be followed if robust training is to be accomplished. In this work, MuSES and CoTherm are used to generate synthetic EO/IR remote sensing imagery of various human dismounts with a range of clothing, poses and environmental factors. The performance of a YOLO ("you only look once") deep learning algorithm is studied, and sensitivity conclusions are discussed.
ISBN: 9798400707032 (print)
This article takes ancient architecture in Shaanxi Province as the research object. Based on laser 3D scanning technology, the seamless integration of point cloud data and GIS data of individual buildings is completed through unmanned aerial vehicles and RTK, forming a spatial data acquisition workflow for ancient architecture. The key technologies include oblique photography mesh model acquisition, laser point cloud model acquisition, and parameterized 3D model generation. Dynamic uniform light and color processing is applied to the oblique aerial photography images to improve their color quality, and a high-precision mesh model is generated using the ContextCapture system. At the same time, denoising methods are applied to the laser point cloud data to ensure the quality of the point cloud. In model generation, a texture compression algorithm is applied to optimize the OBJ data, improving texture mapping utilization and model loading efficiency. In the end, the fusion of multi-source data successfully produced real-life 3D models of five ancient buildings, including the Big Wild Goose Pagoda and Bell Tower, providing technical support for the digital protection and dissemination of cultural heritage.
ISBN: 9781510667563 (print)
The proceedings contain 140 papers. The topics discussed include: digital multi-scale visual planning model of spatial-geographical landscape pattern of smart parks; visual question answering model based on fusing global-local feature; image processing of the special sensor microwave/imager based on passive microwave remote sensing; graptolite image classification based on feature transfer and mixup data enhancement; an image classification method based on few-shot learning; fine-grained image recognition based on multi-branch and multi-scale learning; research on road extraction model of remote sensing image based on the fused convolutional module and attention mechanism; unsupervised aircraft detection in SAR images with image-level domain adaption from optical images; the role of echocardiography segmentation evaluation metrics in clinical diagnosis; and machine vision-based measurement of air compressor crankshaft journal dimensions.