Focus Entirety and Perceive Environment for Arbitrary-Shaped Text Detection

Authors: Han, Xu; Gao, Junyu; Yang, Chuang; Yuan, Yuan; Wang, Qi

Affiliations: North Western Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China; Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China

Publication: IEEE TRANSACTIONS ON MULTIMEDIA (IEEE Trans Multimedia)

Year/Volume/Issue: 2025, Vol. 27

Pages: 287-299

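Subject classification: 0810 [Engineering - Information and Communication Engineering]; 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be conferred)]

Funding: National Natural Science Foundation of China [U21B2041, 62471394, 62306241]

Subjects: Feature extraction; Kernel; Finite element analysis; Noise; Text detection; Head; Data mining; Predictive models; Ions; Image segmentation; Scene text detection; arbitrary-shaped text; real-time detection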

Abstract: Due to the diversity of scene text in aspects such as font, color, shape, and size, accurately and efficiently detecting text is still a formidable challenge. Among the various detection approaches, segmentation-based approaches have emerged as prominent contenders owing to their flexible pixel-level predictions. However, these methods typically model text instances in a bottom-up manner, which is highly susceptible to noise. In addition, each pixel is predicted in isolation, without pixel-feature interaction, which also degrades detection performance. To alleviate these problems, we propose a multi-information-level arbitrary-shaped text detector consisting of a focus entirety module (FEM) and a perceive environment module (PEM). The former extracts instance-level features and adopts a top-down scheme to model texts, reducing the influence of noise. Specifically, it assigns consistent entirety information to pixels within the same instance to improve their cohesion. In addition, it emphasizes scale information, enabling the model to distinguish texts of varying scales effectively. The latter extracts region-level information and encourages the model to focus on the distribution of positive samples in the vicinity of a pixel, thereby perceiving environment information. It treats the kernel pixels as positive samples and helps the model differentiate text and kernel features. Extensive experiments demonstrate the FEM's ability to efficiently support the model in handling texts of different scales and confirm that the PEM can assist in perceiving pixels more accurately by focusing on pixel vicinities. Comparisons show that the proposed model outperforms existing state-of-the-art approaches on four public datasets.
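The abstract's notion of "the distribution of positive samples in the vicinity of a pixel" can be made concrete with a small illustration. The sketch below is not the paper's PEM; it only assumes that kernel pixels are given as a binary mask and computes, for each pixel, the fraction of kernel pixels in a square window around it. The function name `local_positive_density`, the `radius` parameter, and the toy mask are all hypothetical.

```python
from typing import List

Mask = List[List[int]]


def local_positive_density(kernel_mask: Mask, radius: int = 1) -> List[List[float]]:
    """Illustrative only (an assumption, not the paper's PEM).

    For each pixel, return the fraction of kernel (positive) pixels inside
    the (2 * radius + 1)-wide square window centred on it. Windows are
    clipped at the image borders.
    """
    height = len(kernel_mask)
    width = len(kernel_mask[0]) if height else 0
    density = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Clip the window to the image bounds.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            positives = sum(
                kernel_mask[j][i] for j in range(y0, y1) for i in range(x0, x1)
            )
            area = (y1 - y0) * (x1 - x0)
            density[y][x] = positives / area
    return density


# Hypothetical 4x4 kernel mask: 1 marks a kernel pixel, 0 marks background.
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
density = local_positive_density(toy_mask, radius=1)
```

In this toy case a pixel inside the kernel sees a denser neighbourhood of positives than a corner pixel does, which is the kind of region-level cue the abstract describes; how the actual module extracts and uses such information is not stated in this record.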
