ISBN: 9798350374407 (print)
The proceedings contain 114 papers. The topics discussed include: research on intelligent English-Chinese translation proofreading system based on gated feedback recurrent neural networks; intelligent recognition of structures in earth and rock dam images based on MASK-RCNN; a pupil diameter measurement system based on image processing; image enhancement and deep learning in predicting the Gleason score of transrectal ultrasound images of prostate cancer; research on video logo removal processing method based on MATLAB; analysis and evaluation of quality control throughout production of real scene 3D modeling based on oblique aerial photography; improving remote sensing image classification through stochastic bilevel optimization; cross-domain image translation algorithm based on self-cross auto-encoder; a fast image mosaic algorithm based on feature partition extraction; and video compression and action recognition in self-supervised learning.
Remote sensing technology plays an important role in many tasks such as natural disaster detection, weather and climate monitoring and military defense. Currently, remote sensing image processing predominantly relies ...
Target recognition in SAR images is a key issue in remote sensing image processing, which is widely used in many fields. Traditional recognition methods face many challenges due to the complexity of SAR images. In thi...
ISBN: 9789897585630 (print)
The representation of structural characteristics and their fine variations is crucial for recognizing different types of aircraft in remote sensing images. Classifying aircraft types across remote sensing images from different sensors, which differ in the spectral and spatial resolution of the objects in an image, involves identifying variable-length spatial patterns. In our proposed approach, we explore dynamic kernels to deal with variable-length spatial patterns of aircraft in remote sensing images. A Gaussian mixture model (GMM), referred to as the structure model (SM), is trained over aircraft scenes to implicitly learn the local structures using spatial scale-invariant feature transform (SIFT) features. The statistics of the SM are used to design a dynamic kernel, the mean interval kernel (MIK), which handles spatial changes globally within the same scene and preserves the similarities in local spatial structures. The efficacy of the proposed method is demonstrated on the multi-type aircraft remote sensing images (MTARSI) benchmark dataset (20 distinct kinds of aircraft) using MIK. We also compare the performance of the proposed approach with other dynamic kernels, such as the supervector kernel (SVK) and the intermediate matching kernel (IMK).
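To make the idea of a dynamic kernel over variable-length feature sets concrete, the following is a minimal sketch of an intermediate matching kernel (IMK), one of the comparison kernels named in the abstract. It is not the authors' MIK and not their code. The `centers` argument is a hypothetical stand-in for the component means of a trained GMM, the 2-D points stand in for SIFT descriptors, and the Gaussian base kernel with width `gamma` is an assumption.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def closest_feature(features, center):
    """Return the local feature in `features` nearest to `center`."""
    return min(features, key=lambda f: squared_distance(f, center))


def intermediate_matching_kernel(set_x, set_y, centers, gamma=0.5):
    """IMK sketch: for each virtual center, pick the nearest local feature
    from each set and sum a Gaussian base kernel over the selected pairs.

    `centers` and `gamma` are illustrative assumptions; in practice the
    centers would come from a model fitted to training descriptors.
    """
    total = 0.0
    for center in centers:
        x_star = closest_feature(set_x, center)
        y_star = closest_feature(set_y, center)
        total += math.exp(-gamma * squared_distance(x_star, y_star))
    return total


# Two images described by different numbers of local features.
centers = [(0.0, 0.0), (10.0, 10.0)]
image_a = [(0.0, 1.0), (9.0, 10.0), (5.0, 5.0)]
image_b = [(1.0, 0.0), (10.0, 9.0)]
similarity = intermediate_matching_kernel(image_a, image_b, centers)
```

The kernel value depends only on the features selected per center, so the two sets may contain different numbers of descriptors, which is the property the abstract relies on.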
In order to explore the feasibility of applying UAV remote sensing and object-oriented canopy extraction technology to forest areas with different canopy densities, 26 forest plots located in Gannan Plateau were selec...
ISBN: 9798350350890 (print)
The proceedings contain 177 papers. The topics discussed include: short-term power load forecasting model based on EEMD-SE-ERCNN; dynamic low-rank adaptation based pruning algorithm for large language models; dual-input dual-output underwater image enhancement with branch interaction; object detection model of YOLOv8-CSD for UAV images; SE-ResNet: recognition of ECG anomalies based on convolutional neural network with fused attention; predicting accuracy for quantized neural networks: an attention-based approach with single-image; improved self-attention for Spodoptera frugiperda larval instar stages identification; an improved YOLOv8 algorithm for rotating object detection in unmanned aerial vehicle remote sensing images; and a comprehensive data augmentation method for modern cigarette packaging defect detection.
ISBN: 9798400716607 (print)
The proceedings contain 59 papers. The topics discussed include: 3D point clouds simplification based on low-dimensional contour feature extraction; 3D human pose estimation using pressure images on a smart chair; combining doses from internal and external radiotherapies for cervical cancer with successive image registration; attention mechanism-based feature fusion generative network for infrared-visible person re-identification; a vision-based remote assistance method and its application in object transfer; research on model-free 6D object pose estimation based on vision 3D matching; active exploration of modality complementarity for multimodal sentiment analysis; self-attention-based multi-scale feature fusion network for road ponding segmentation; and low light image enhancement algorithm based on edge and color information.
Most of the current capsule network methods have good classification effects mainly on simple content datasets such as MNIST and CIFAR10. However, for remote sensing scene images with complex objects, these methods ca...
ISBN: 9798350353006 (print)
This paper aims at achieving fine-grained building attribute segmentation in a cross-view scenario, i.e., using satellite and street-view image pairs. The main challenge lies in overcoming the significant perspective differences between street views and satellite views. In this work, we introduce SG-BEV, a novel approach for satellite-guided BEV fusion for cross-view semantic segmentation. To overcome the limitations of existing cross-view projection methods in capturing complete building facade features, we incorporate a Bird's Eye View (BEV) method to establish a spatially explicit mapping of street-view features. Moreover, we leverage the advantages of multiple perspectives by introducing a novel satellite-guided reprojection module, which mitigates the uneven feature distribution associated with traditional BEV methods. Our method demonstrates significant improvements on four cross-view datasets collected from multiple cities, including New York, San Francisco, and Boston. On average across these datasets, our method increases mIoU by 10.13% and 5.21% compared with the state-of-the-art satellite-based and cross-view methods, respectively. The code and datasets of this work will be released at https: //***/yejy53/SG-BEV.
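For readers unfamiliar with the metric reported above, the following is a minimal sketch of mean intersection over union for a semantic segmentation result. It is a generic illustration, not the SG-BEV evaluation code. The flat label lists, the function name `mean_iou`, and the choice to skip classes absent from both prediction and ground truth are assumptions.

```python
def mean_iou(predicted, target, num_classes):
    """Average per-class intersection over union for flat label sequences.

    Classes that appear in neither sequence are skipped (an assumption;
    evaluation protocols differ on how to treat absent classes).
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    ious = []
    for cls in range(num_classes):
        intersection = sum(
            1 for p, t in zip(predicted, target) if p == cls and t == cls
        )
        union = sum(
            1 for p, t in zip(predicted, target) if p == cls or t == cls
        )
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in either sequence")
    return sum(ious) / len(ious)


# Six pixels, three hypothetical building-attribute classes.
predicted_labels = [0, 0, 1, 1, 2, 2]
target_labels = [0, 1, 1, 1, 2, 0]
score = mean_iou(predicted_labels, target_labels, num_classes=3)
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5.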
With the proliferation of a wide variety of sensors, accurate multi-source image registration is crucial for many remote sensing image processing tasks. However, the registration of multi-source images faces the challenges of rotations, scales, and domain transformations caused by significant differences in shooting time, viewing angle, and sensor imaging modes. To cope with this problem, we propose a deep learning-based registration method named TRFeat, which aims to comprehensively improve the rotation, scale, and cross-domain robustness of local features. First, we introduce a special circular sampling convolutional layer to replace the standard square convolutional layer, in order to enhance the rotational robustness of local features. Second, we design a scale pyramid backbone network architecture to improve the robustness of the network to scale transformations. Third, we use a hypercolumn domain alignment loss to extract cross-domain robust local descriptors for images from different sources. In addition, we develop a novel keypoint detection training framework based on iterative refinement supervision to obtain repeatable and reliable keypoint localization in multi-source images. Finally, we conduct thorough experiments on five multi-source datasets. Extensive experimental results validate that our TRFeat outperforms other state-of-the-art hand-crafted (e.g., RIFT) and deep learning-based (e.g., ASLFeat) methods. Specifically, our TRFeat achieves an MMA@3 of 76.08% on the HPatches dataset and an RMSE of 3.38 on the Xiang dataset. The code is available at https://***/vignywang/TRFeat.
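The two reported metrics can be illustrated with a short sketch. This is not the TRFeat evaluation code. It assumes that matches are scored by their reprojection error under a known ground-truth homography `H`, that matching accuracy at 3 pixels is the fraction of matches with error at or below that threshold (MMA would then be the mean of this value over image pairs), and that RMSE is taken over the same errors. The function names are hypothetical.

```python
import math


def project(point, H):
    """Map a 2-D point through a 3x3 homography given as nested lists."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


def reprojection_errors(points_a, points_b, H):
    """Pixel distance between each projected point and its claimed match."""
    errors = []
    for a, b in zip(points_a, points_b):
        px, py = project(a, H)
        errors.append(math.hypot(px - b[0], py - b[1]))
    return errors


def matching_accuracy(errors, threshold=3.0):
    """Fraction of matches whose error is within `threshold` pixels."""
    return sum(1 for e in errors if e <= threshold) / len(errors)


def rmse(errors):
    """Root mean square of the reprojection errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Toy image pair: the true transform is a 2-pixel shift along x.
H_true = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
keypoints_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
keypoints_b = [(2.0, 0.0), (3.0, 1.0), (8.0, 2.0), (5.0, 7.0)]
errors = reprojection_errors(keypoints_a, keypoints_b, H_true)
accuracy_at_3 = matching_accuracy(errors, threshold=3.0)
pair_rmse = rmse(errors)
```

Here two of the four matches are exact and two are off by 4 pixels, so the accuracy at 3 pixels is 0.5 and the RMSE is the square root of 8.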