Details
ISBN (digital): 9798350387780
ISBN (print): 9798350387797
Text image super-resolution is widely used as a preprocessing step for tasks such as scene text recognition, where it improves the readability of text images for human readers. To support downstream tasks and applications such as OCR, this paper proposes a method for improving the super-resolution of text images. The specific research work is as follows:
1. A text image super-resolution algorithm based on the Deformable Attention Transformer is proposed. It addresses three issues: the high memory and computational cost of dense attention in text image super-resolution, the influence on features of irrelevant regions outside the region of interest, and text deformation and distortion. The algorithm uses the Deformable Attention Transformer as a key part of the backbone network to generate super-resolution (SR) images [1].
2. Layout analysis from the field of intelligent document processing, which recognizes common layout elements in document images such as text and titles, is introduced as the Text Image Layout Label Module (TILL). The paper uses the Image processing Challenge dataset and the TextZoom dataset, and the layout of text images in the synthesized dataset can be recognized. In subsequent operations, the super-resolution scale can then be adjusted as needed, which can improve the accuracy and effectiveness of image super-resolution.
3. Finally, a stroke position perception module (SPPM) and a stroke content perception module (SCPM) are introduced [2]. Guided by character position information, the SPPM reduces character adhesion in images reconstructed by the text image super-resolution network and improves the model's feature extraction, so that the model focuses better on stroke position information. The stroke content per