ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Remote sensing image semantic segmentation is an essential aspect of the intelligent analysis of remote sensing, extensively applied in urban planning, economic assessment, and disaster monitoring. However, the expansive field of view and intricate backgrounds of remote sensing images cause numerous objects of varying sizes and categories to coexist, which results in incomplete object segmentation and makes it difficult to restore the spatial distribution of objects. In this paper, we present a Multilevel Object Aware Network (MOA-Net) that addresses these challenges from three perspectives. First, we establish a Progressive Multiscale Global-Local Decoder (PMGLD) that integrates global and local context of objects at varying scales through a progressive convolution strategy. Second, an Orientation-Aware Attention Mechanism (OAAM) provides orientation information and guides the restoration of interclass 2-D spatial relationships. Finally, a CNN stem extracts fine-grained features that improve edge segmentation. Experimental results on the Potsdam and Vaihingen datasets indicate that our method surpasses existing methods in both performance and efficiency.
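To make the global-local idea concrete, below is a minimal PyTorch sketch of how a decoder stage could progressively combine local context from increasingly dilated convolutions with pooled global context. The module name, channel sizes, and fusion rule are illustrative assumptions and are not taken from the MOA-Net paper.

```python
# Hypothetical sketch of a global-local fusion step such as a PMGLD stage might use.
# Local context comes from progressively applied dilated convolutions; global
# context comes from image-level pooled statistics. All names are assumptions.
import torch
import torch.nn as nn

class GlobalLocalFusion(nn.Module):
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        # Progressive local branches: each convolution sees a larger receptive field.
        self.local_branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # Global branch: image-level context from global average pooling.
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Conv2d(channels * (len(dilations) + 1), channels, 1)

    def forward(self, x):
        feats = []
        out = x
        for branch in self.local_branches:
            out = branch(out)          # progressively enlarge the local context
            feats.append(out)
        g = self.global_branch(x)
        feats.append(g.expand_as(x))   # broadcast global context over the map
        return self.project(torch.cat(feats, dim=1))

# e.g. fuse a 64-channel decoder feature map
fused = GlobalLocalFusion(64)(torch.randn(1, 64, 32, 32))
```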
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Remote sensing image semantic segmentation has widespread applications in urban planning and land monitoring. In recent years, U-Net and its variants have almost dominated research on semantic segmentation. However, many models pay little attention to computational efficiency, rendering them ineffective in scenarios with computational-resource and timeliness constraints, such as autonomous driving and disaster monitoring. To address this issue, we propose USA-Net (UNet-like with Shifted Axial), a lightweight hybrid model based on convolution and MLPs (Multi-Layer Perceptrons). Specifically, we design the ST Block (Shift Tokenized Block), which introduces local features into the global MLP operations through spatial shift, and then use the ELCM (Efficient Large-kernel Convolution Module) to enlarge the model's receptive field and learn the shape features of objects. Additionally, we propose a new semi-supervised learning framework to further improve the model's generalization performance. On the ISPRS Vaihingen and ISPRS Potsdam datasets, USA-Net significantly outperforms most state-of-the-art methods in terms of segmentation accuracy and efficiency.
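The spatial-shift mechanism the ST Block is described as building on can be illustrated with a short PyTorch sketch: four channel groups are shifted one pixel in four directions before a per-pixel channel MLP, so the MLP mixes information from neighboring positions. The group split, zero padding, and surrounding normalization here are assumptions rather than the paper's exact design.

```python
# Minimal sketch of a spatial-shift MLP block (not the ST Block implementation).
import torch
import torch.nn as nn

def spatial_shift(x: torch.Tensor) -> torch.Tensor:
    """Shift four channel groups left/right/up/down by one pixel (zero padded)."""
    out = torch.zeros_like(x)
    g = x.shape[1] // 4
    out[:, 0 * g:1 * g, :, 1:] = x[:, 0 * g:1 * g, :, :-1]   # shift right
    out[:, 1 * g:2 * g, :, :-1] = x[:, 1 * g:2 * g, :, 1:]   # shift left
    out[:, 2 * g:3 * g, 1:, :] = x[:, 2 * g:3 * g, :-1, :]   # shift down
    out[:, 3 * g:, :-1, :] = x[:, 3 * g:, 1:, :]             # shift up
    return out

class ShiftMLPBlock(nn.Module):
    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels)
        # 1x1 convolutions act as a per-pixel (channel-mixing) MLP.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels * expansion, 1),
            nn.GELU(),
            nn.Conv2d(channels * expansion, channels, 1),
        )

    def forward(self, x):
        # Shift first so the channel MLP also sees neighboring pixels.
        return x + self.mlp(spatial_shift(self.norm(x)))

y = ShiftMLPBlock(64)(torch.randn(1, 64, 32, 32))
```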
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Click-based interactive image segmentation aims to segment an object from the background under user click guidance. Recently, Vision Transformers have made significant strides in interactive image segmentation. However, previous studies 1) overlook the differing contributions of individual clicks to the segmentation result, and 2) suffer from inconsistency across feature scales in multi-scale structures. In this paper, we propose a new interactive segmentation framework, named RCFormer, with two novel components: a reconstruct click patch embedding (RCPE) that encodes the importance of clicks, and multi-scale adaptive fusion (MSAF) that adaptively fuses feature maps across different scales. RCPE enhances the effectiveness of click interactions by spatially distinguishing the importance of clicks. MSAF adaptively fuses useful spatial information and filters redundant features across scales. Experiments on several benchmarks show that our approach achieves state-of-the-art performance. Notably, our method achieves 2.31 NoC@90 on the Berkeley dataset, improving on the previous best result by 8.6%.
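For intuition, here is a hedged sketch of one common way to fuse multi-scale feature maps adaptively: resize all scales to the finest resolution and combine them with learned, softmax-normalized weights. The gating scheme and the class name AdaptiveScaleFusion are assumptions for illustration and are not claimed to match the MSAF module described above.

```python
# Illustrative multi-scale fusion with learned per-scale weights (an assumption,
# not the RCFormer MSAF design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveScaleFusion(nn.Module):
    def __init__(self, channels: int, num_scales: int):
        super().__init__()
        # One learnable logit per scale, normalized with softmax at run time.
        self.scale_logits = nn.Parameter(torch.zeros(num_scales))
        self.project = nn.Conv2d(channels, channels, 1)

    def forward(self, feats):
        # feats: list of (B, C, Hi, Wi) maps ordered from coarse to fine.
        target = feats[-1].shape[-2:]
        weights = torch.softmax(self.scale_logits, dim=0)
        fused = sum(
            w * F.interpolate(f, size=target, mode="bilinear", align_corners=False)
            for w, f in zip(weights, feats)
        )
        return self.project(fused)

fusion = AdaptiveScaleFusion(channels=96, num_scales=3)
pyramid = [torch.randn(1, 96, 8, 8), torch.randn(1, 96, 16, 16), torch.randn(1, 96, 32, 32)]
out = fusion(pyramid)  # (1, 96, 32, 32)
```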
Recent advances in 3D point cloud analysis have brought a diverse set of network architectures to the field. However, the lack of a unified framework for interpreting those networks makes systematic comparison, contrast, or analysis challenging and, in practice, limits the healthy development of the field. In this paper, we take the initiative to explore and propose a unified framework, called PointMeta, into which popular 3D point cloud analysis approaches can fit. This brings three benefits. First, it allows us to compare different approaches fairly and to verify, with quick experiments, any empirical observations or assumptions drawn from the comparison. Second, the big picture provided by PointMeta enables us to think across components and revisit common beliefs and key design decisions made by popular approaches. Third, based on the learnings from the previous two analyses and simple tweaks to existing approaches, we derive a basic building block, termed PointMetaBase. It shows very strong efficiency and effectiveness in extensive experiments on challenging benchmarks, verifying the necessity and benefits of high-level interpretation, contrast, and comparison such as PointMeta. In particular, PointMetaBase surpasses the previous state-of-the-art method by 0.7%/1.4%/2.1% mIoU with only 2%/11%/13% of the computational cost on the S3DIS datasets. The code and models are available at https://***/linhaojia13/PointMetaBase.
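As an illustration of what a unified, "meta" local-aggregation block for point clouds can look like, the sketch below keeps neighbor grouping, a per-neighbor update, a permutation-invariant aggregation, and a per-point update as separate pluggable pieces. The function split, names, and k-NN grouping are illustrative assumptions and are not claimed to be PointMeta's actual decomposition.

```python
# Hypothetical meta building block: swap the neighbor update, aggregation, or
# point update to instantiate different point cloud architectures.
import torch
import torch.nn as nn

def knn_group(points: torch.Tensor, k: int) -> torch.Tensor:
    """Return indices of the k nearest neighbors for every point. points: (N, 3)."""
    dists = torch.cdist(points, points)          # (N, N) pairwise distances
    return dists.topk(k, largest=False).indices  # (N, k)

class MetaBlock(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, k: int = 16):
        super().__init__()
        self.k = k
        # Neighbor update: applied to each (feature, relative position) pair.
        self.neighbor_update = nn.Sequential(nn.Linear(in_dim + 3, out_dim), nn.ReLU())
        # Point update: applied to each point after aggregation.
        self.point_update = nn.Sequential(nn.Linear(out_dim, out_dim), nn.ReLU())

    def forward(self, points: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # points: (N, 3) coordinates, feats: (N, C) per-point features.
        idx = knn_group(points, self.k)                     # (N, k)
        rel_pos = points[idx] - points[:, None, :]          # (N, k, 3) positional term
        grouped = torch.cat([feats[idx], rel_pos], dim=-1)  # (N, k, C + 3)
        updated = self.neighbor_update(grouped)             # (N, k, out_dim)
        aggregated = updated.max(dim=1).values              # max pooling over neighbors
        return self.point_update(aggregated)                # (N, out_dim)

pts, f = torch.randn(1024, 3), torch.randn(1024, 32)
out = MetaBlock(32, 64)(pts, f)  # (1024, 64)
```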