Author Affiliations: Hunan Normal Univ, Sch Informat Sci & Engn, Changsha 410081, Hunan, Peoples R China; Hunan Normal Univ, Sch Phys & Elect Sci, Changsha 410081, Hunan, Peoples R China; Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Hubei, Peoples R China
Publication: Neurocomputing
Year/Volume: 2025, Vol. 624
Subject Classification: 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (degrees conferrable in Engineering or Science)]
Funding: National Natural Science Foundation of China; Natural Science Foundation of Hunan Province, China [2020JJ4057]; Key Research and Development Program of Changsha Science and Technology Bureau, China [kq2004050]; Scientific Research Foundation of Education Department of Hunan Province of China [21A0052]
Keywords: Scene text detection; Arbitrary shape text; Feature shuffle; Attention mechanism
Abstract: Natural scene text detection has made significant progress in the era of deep learning. However, existing methods still exhibit deficiencies when faced with challenges such as complex backgrounds, extreme aspect ratios, and arbitrary-shaped text. To address these issues, we propose a segmentation-based Feature Shuffle Attention Network (FSANet) designed to enhance high-resolution feature extraction and multi-scale feature enhancement for robust detection of arbitrary-shaped text. FSANet is composed of two principal modules: (1) the High-Resolution Feature Extraction Network (FEN), which employs two Group Shuffle Blocks (GSBs) to maintain high-resolution details and promote feature interaction and information flow, and (2) the Adaptive Channel Attention Module (ACAM), which reduces background noise and redundant features by adaptively learning inter-feature correlations across scales, assigning weights that prioritize local features within a global context. Extensive experiments on four public benchmark datasets show that, compared with the baseline, the proposed method improves the F-measure on the ICDAR2015, Total-Text, MSRA-TD500, and ICDAR2017-MLT datasets by an average of 1.68%, and consistently improves recall by an average of 2.0%. Notably, the proposed method achieves the highest F-measures of 85.4% and 75.9% on the ICDAR2015 and ICDAR2017-MLT datasets, respectively. On the other two datasets, FSANet also surpasses most existing methods, indicating that it outperforms the majority of state-of-the-art approaches. The code will be publicly released at https://***/runminwang/FSANet.
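The abstract does not give the exact internals of the GSB or ACAM modules, so the following is only a minimal illustrative sketch: a ShuffleNet-style channel shuffle with grouped convolution standing in for a Group Shuffle Block, and a squeeze-and-excitation-style channel reweighting standing in for the Adaptive Channel Attention Module. All class names, channel counts, group sizes, and the reduction ratio are assumptions, not the paper's actual design.

```python
# Hedged sketch only: GroupShuffleBlock and AdaptiveChannelAttention below are
# assumed stand-ins, not the published FSANet modules.
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups to promote cross-group information flow."""
    n, c, h, w = x.shape
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)


class GroupShuffleBlock(nn.Module):
    """Grouped conv + channel shuffle with a residual connection (assumed structure)."""

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, groups=groups, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)
        self.groups = groups

    def forward(self, x):
        out = self.act(self.bn(self.conv(x)))
        # Residual addition preserves high-resolution detail from the input.
        return channel_shuffle(out, self.groups) + x


class AdaptiveChannelAttention(nn.Module):
    """SE-style channel attention: global pooling, bottleneck MLP, sigmoid gating."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global context per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        # Reweight channels to suppress background noise and redundant features.
        return x * weights


if __name__ == "__main__":
    feat = torch.randn(1, 64, 32, 32)
    feat = GroupShuffleBlock(64)(feat)
    feat = AdaptiveChannelAttention(64)(feat)
    print(feat.shape)  # torch.Size([1, 64, 32, 32])
```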