Studies addressing the supervised extraction of geospatial elements from aerial imagery with semantic segmentation operations (including road surface areas) commonly feature tile sizes varying from 256 × 256 pixels to 1024 × 1024 pixels with no overlap. Relevant geo-computing works in the field often comment on prediction errors that could be attributed to the effect of tile size (number of pixels or the amount of information in the processed image) or to the overlap levels between adjacent image tiles (caused by the absence of continuity information near the borders). This study provides further insights into the impact of tile overlaps and tile sizes on the performance of deep learning (DL) models trained for road extraction. In this work, three semantic segmentation architectures were trained for the road surface area extraction task on data from the SROADEX dataset (orthoimages and their binary road masks), which contains approximately 700 million pixels of the positive "Road" class. First, a statistical analysis was conducted on the performance metrics achieved on unseen testing data featuring around 18 million pixels of the positive class. The goal of this analysis was to study the difference in mean performance and the main and interaction effects of the fixed factors on the dependent variables. The statistical tests showed that the impact on performance was significant for the main effects and for the two-way interactions between tile size and tile overlap and between tile size and DL architecture, at a significance level of 0.05. We also report insights and trends from an extensive qualitative analysis of the predictions of the best models at each tile size. The results indicate that training the DL models on larger tile sizes with a small percentage of overlap delivers better road representations and that testing different combinations of model and tile sizes can help achieve a better extraction performance.
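To make the two factors studied here concrete, the following is a minimal sketch of how an image can be cut into square tiles of a given size with a fractional overlap between neighbours. It is not the tiling procedure used for SROADEX; the function names `tile_origins` and `tile_boxes`, the rounding of the stride, and the choice to shift the last tile back so that it touches the image border are all assumptions made for illustration.

```python
def tile_origins(length, tile, overlap=0.0):
    """Start offsets of tiles of size `tile` along one axis of size `length`.

    `overlap` is the fraction of a tile shared with its neighbour (0 <= overlap < 1).
    Hypothetical helper: the last tile is shifted back to end exactly at the border.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if tile > length:
        raise ValueError("tile is larger than the image axis")
    stride = max(1, int(round(tile * (1.0 - overlap))))
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] != length - tile:
        origins.append(length - tile)  # cover the remaining border strip
    return origins


def tile_boxes(height, width, tile, overlap=0.0):
    """(top, left, bottom, right) boxes of all tiles covering the image."""
    return [
        (y, x, y + tile, x + tile)
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]
```

With no overlap, a 1024 × 1024 image yields sixteen 256 × 256 tiles; with a 12.5% overlap and 512 × 512 tiles, the stride drops to 448 pixels and a third tile per axis is needed to reach the border.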
Wildfires are common disasters that have long-lasting climate effects and serious ecological, social, and economic consequences due to climate change. Since Earth observation (EO) satellites were launched into space, remote sensing (RS) has become a more efficient technique that can be used in agriculture, environmental protection, geological exploration, and wildfire monitoring. The increasing number of EO satellites orbiting the Earth provides huge amounts of data, such as the imagery acquired by Sentinel-2 with its MultiSpectral Instrument (MSI) sensor. Using uni-temporal Sentinel-2 imagery, we proposed a workflow based on deep learning (DL) semantic segmentation models to detect wildfires. In particular, we created a new large wildfire dataset suitable for semantic segmentation models. We tested our dataset using DL models such as U-Net, LinkNet, DeepLabV3+, U-Net++, and Attention ResU-Net. The results are analysed and compared in terms of the F1 score, the intersection over union (IoU) score, the precision and recall metrics, and the training time each model requires. The best results were achieved using U-Net with the ResNet50 encoder, with an F1-score of 98.78% and an IoU of 97.38%, and we developed it into a pre-trained DL Package (DLPK) model that is able to detect and monitor wildfires from Sentinel-2 images automatically.
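The metrics named above can be illustrated with a short sketch that computes precision, recall, F1, and IoU for a binary mask from pixel counts. This is a generic textbook formulation, not the evaluation code of the study; the function name `binary_scores` and the flat-list mask representation are assumptions, and the study may aggregate scores differently (for example per image rather than over all pixels).

```python
def binary_scores(pred, truth):
    """Precision, recall, F1 and IoU for flat binary masks (1 = positive class).

    Hypothetical helper: scores are computed over all pixels at once and
    fall back to 0.0 when a denominator is zero.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


scores = binary_scores([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

When both are computed from the same pixel counts, F1 and IoU are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU.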
Accurate land-cover segmentation of very-high-resolution aerial images is essential for a wide range of applications, including urban planning and natural resource management. However, the automation of this process remains a challenge owing to the complexity of the images, variability in land surface features, and noise. In this study, a method for training convolutional neural networks and transformers to perform land-cover segmentation on very-high-resolution aerial images in a regional context was proposed. We assessed the U-Net-scSE, FT-U-NetFormer, and DC-Swin architectures, incorporating transfer learning and active contour loss functions to improve performance on semantic segmentation tasks. Our experiments, conducted using the OpenEarthMap dataset, which includes images from 44 countries, demonstrate the superior performance of U-Net-scSE models with the EfficientNet-V2-XL and MiT-B4 encoders, which achieved an mIoU of over 0.80 on a test dataset of urban and rural images from Peru.
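For readers unfamiliar with the mIoU figure quoted above, the sketch below shows one common way to compute it for multi-class label maps: build a confusion matrix, take the IoU of each class, and average over the classes that occur. It is an illustration only; the function name `mean_iou`, the flat-list label representation, and the decision to skip classes absent from both prediction and ground truth are assumptions, and the study's exact averaging convention is not stated here.

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection over union for flat integer label lists.

    Hypothetical helper: classes that appear in neither `pred` nor `truth`
    are left out of the average instead of counting as 0 or 1.
    """
    # confusion[t][p] counts pixels of true class t predicted as class p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, truth):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


miou = mean_iou([0, 1, 2, 2, 1, 0], [0, 1, 1, 2, 1, 0], num_classes=3)
```

In this toy example the per-class IoUs are 1, 2/3, and 1/2, so the mean is 13/18.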
As the core of the Transformer, the attention mechanism is crucial in model design. However, the performance of attention modules varies across different datasets. Additionally, hyperparameter settings significantly impact the performance of attention modules, complicating the selection of the appropriate module. To fill the current research gap regarding the performance of attention modules in crack recognition, this study investigates five highly cited attention modules across three datasets with different styles. By setting nine combinations of learning rates and batch sizes, we conducted 135 comparative experiments. Precision, Recall, F1-score, and mIoU were used as evaluation metrics to analyze the recognition accuracy and loss curve convergence of each attention module. Parameter counts, frames per second (FPS), and floating-point operations (FLOPs) were used to compare the computational efficiency of each module. The results indicate that channel attention (CA), bottleneck attention module (BAM), and convolutional block attention module (CBAM) outperform dual attention (DA) and self-attention (SA) in crack recognition and resistance to interference. The modules achieve good convergence with learning rate and batch size combinations of 1 × 10⁻⁴/4, 1 × 10⁻⁴/8, and 1 × 10⁻⁴/16. We recommend using 1 × 10⁻⁴/4 as the initial hyperparameter setting in future work. Although DA and SA have higher parameter counts than CA, BAM, and CBAM, the FPS and FLOPs values of each module show minimal differences when the batch size is the same.
The development of post-industrial landscapes at industrial sites plays an important role in supplementing urban green spaces. However, current research on the use and redevelopment of post-industrial sites has mainly focused on ecological restoration, and studies that combine objective and subjective data to quantify public preferences remain scarce. In this study, deep learning was used to semantically segment the post-industrial landscape, a multiple stepwise regression model was used to analyze the non-linear correlation between quantitative indicators and public "restorative-repressive" perception, and a structural equation model (SEM) between quantitative indicators and public perception data was established. The results showed the following. (1) Semantic segmentation models combined with principal component analysis (PCA) and non-metric multidimensional scaling (NMDS) analysis can categorize post-industrial parks into two groups, dominated by artificial elements and by natural elements, respectively. (2) Public perceptions varied more in the natural element-dominated group and less in the industrial element-dominated group. In addition, waterbodies in the post-industrial landscape acted as a destabilizing factor. (3) The correlation between quantitative indicators and subjective perceptions differed between the two categories of parks. (4) The height of industrial buildings (HIB), the function of industrial buildings (FIB), and vegetation succession (VS) significantly influenced public satisfaction. These findings suggest that public satisfaction with post-industrial landscapes can be enhanced by taking full account of the different uses of natural and artificial elements, and they enable researchers to analyze the redevelopment of post-industrial landscapes from a new perspective of evidence-based design.
Deep learning-based methods have become alternatives to traditional numerical weather prediction systems, offering faster computation and the ability to utilize large historical datasets. However, the application of deep learning to medium-range regional weather forecasting with limited data remains a significant challenge. In this work, we propose three key solutions: (1) motivated by the need to improve model performance in data-scarce regional forecasting scenarios, we apply semantic segmentation models to better capture spatiotemporal features and improve prediction accuracy; (2) recognizing the challenge of overfitting and the inability of traditional noise-based data augmentation methods to effectively enhance model robustness, we introduce a novel learnable Gaussian noise mechanism that allows the model to adaptively optimize perturbations for different locations, ensuring more effective learning; and (3) to address the issue of error accumulation in autoregressive prediction, as well as the learning difficulty and the lack of intermediate data utilization in one-shot prediction, we propose a cascade prediction approach that resolves these problems while significantly improving model forecasting performance. Our method achieves a competitive result in the East China Regional AI Medium Range Weather Forecasting Competition. Ablation experiments further validate the effectiveness of each component, highlighting their contributions to enhancing prediction performance. Deep learning is gradually replacing traditional numerical weather prediction (NWP) systems, but it still faces challenges in medium-range weather forecasting with limited data. To this end, this paper proposes three innovations: first, semantic segmentation models are introduced to strengthen the capture of spatiotemporal features and improve prediction accuracy; second, a learnable Gaussian noise mechanism is designed to address overfitting and overcome the limitations of traditional noise-based augmentation; finally, a cascade prediction method is proposed to balance prediction accuracy and error control and to mitigate error accumulation in autoregressive prediction. The method performed strongly in the East China Regional AI Medium Range Weather Forecasting Competition, and experiments validated the effectiveness of each module: semantic segmentation reduced the temperature prediction error by 9.3%, the noise mechanism improved the precipitation prediction F1-score by 6.8%, and the cascade strategy reduced the mean squared error of wind speed prediction by 12.5%. This study offers a new path for regional weather forecasting under data constraints.