Compared with other video semantic cues, such as gestures and motions, video text generally provides highly useful and fairly precise semantic information, and its analysis can greatly facilitate video and scene understanding. It can be observed that video text shows stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, multi-directional expansion that preserves the edges and silhouettes of text characters well. This paper therefore proposes a new NSCT-based approach to video text detection. First, the 8 directional coefficients of the NSCT are combined to build a directional edge map (DEM), which keeps the horizontal, vertical, and diagonal edge features and suppresses edge features in other directions. Then the directional pixels of the DEM are integrated into a single binary image (BE). Based on the BE, text frame classification determines whether a video frame contains text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate video text from non-text regions. Experimental evaluations on our collected TV video data set demonstrate that our method significantly outperforms three other video text detection algorithms in both detection speed and accuracy, especially under challenging conditions such as video text of various sizes, languages, colors, and fonts, and short or long text lines.
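As a rough illustration of the DEM and BE steps, the sketch below assumes the NSCT directional subbands have already been computed by some external routine and are given as equally sized 2-D lists. The combination rule (a per-pixel sum of absolute coefficients over the kept directions), the `keep` indices, and the fixed threshold are all assumptions made for illustration; the abstract does not specify them.

```python
def directional_edge_map(subbands, keep):
    """Sum absolute coefficients of the kept directional subbands, pixel by pixel.

    subbands: list of 2-D lists (one per direction), all the same size.
    keep: indices of the directions to retain (hypothetical choice).
    """
    rows, cols = len(subbands[0]), len(subbands[0][0])
    dem = [[0.0] * cols for _ in range(rows)]
    for k in keep:
        for r in range(rows):
            for c in range(cols):
                dem[r][c] += abs(subbands[k][r][c])
    return dem


def binarize(dem, threshold):
    """Turn the edge map into a binary image: 1 where the response exceeds threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in dem]


# Toy example: three 2x2 "subbands"; the third direction is suppressed.
subbands = [
    [[1.0, -2.0], [0.0, 0.5]],
    [[-1.0, 0.0], [3.0, 0.5]],
    [[9.0, 9.0], [9.0, 9.0]],
]
dem = directional_edge_map(subbands, keep=[0, 1])
be = binarize(dem, threshold=1.5)
```

A frame-level decision could then be based on simple statistics of `be`, such as the fraction of edge pixels, though that rule is likewise not given in the abstract.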
ISBN: 9781424447053 (print)
Text in video frames provides brief and important content information that is helpful for video scene understanding, annotation, and search. This paper proposes a new method for detecting text in video frames. First, a small overlapping sliding window is scanned over the frame, and hybrid features are extracted from each window. An SVM classifier is then employed to distinguish text from background. Finally, a voting mechanism and a morphological filter are applied to locate the text region precisely. The new method is expected to outperform existing strategies because of two improvements. The first is the selection of robust features that distinguish both scene text and overlay text from complex backgrounds. The second is multilingual capability throughout the whole processing chain. The proposed algorithm has been evaluated on four different kinds of videos, and the experiments show its high performance.
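The sliding-window and voting steps can be sketched as follows. This is a minimal illustration, not the paper's implementation: `is_text` is a hypothetical stand-in for the trained SVM applied to a window's hybrid features, and the window size, step, and vote threshold are assumed values.

```python
def vote_map(height, width, win, step, is_text):
    """Accumulate one vote per pixel for every overlapping window classified as text.

    is_text(top, left, win) is a placeholder for the SVM decision on that window.
    """
    votes = [[0] * width for _ in range(height)]
    for top in range(0, height - win + 1, step):
        for left in range(0, width - win + 1, step):
            if is_text(top, left, win):
                for r in range(top, top + win):
                    for c in range(left, left + win):
                        votes[r][c] += 1
    return votes


def text_mask(votes, min_votes):
    """Keep pixels that received at least min_votes text votes."""
    return [[1 if v >= min_votes else 0 for v in row] for row in votes]


# Toy classifier: windows whose top-left corner lies in the upper-left 2x2 block are "text".
votes = vote_map(4, 4, win=2, step=1, is_text=lambda t, l, w: t <= 1 and l <= 1)
mask = text_mask(votes, min_votes=2)
```

Because the windows overlap, pixels in the middle of a text area collect more votes than pixels at its border, which is what lets a vote threshold tighten the located region before any morphological clean-up.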
ISBN: 9781479918058 (print)
We present a text detection method for natural scene images based on two-stage nontext filtering. First, we detect multi-channel maximally stable extremal regions (MSERs) as character candidates. To reduce the number of repeated components, we merge overlapping MSERs by keeping the most character-like ones. Nontext components are then filtered out by a two-stage labeling procedure in which we combine random forests with a CRF. Finally, components labeled as text are grouped into words by an edge-cut strategy, and false positives are eliminated by a HOG-based classifier. Experimental results on the ICDAR2013 database show the effectiveness of the proposed method.
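One plausible way to keep only the most character-like candidate among overlapping regions is a greedy, score-ordered suppression over bounding boxes, sketched below. The use of bounding-box IoU as the overlap measure, the `overlap` threshold, and the per-candidate `scores` (standing in for a "character-likeness" measure) are assumptions; the abstract does not describe the actual merging rule.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_candidates(boxes, scores, overlap=0.5):
    """Return indices of candidates kept after dropping lower-scored overlapping ones."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a candidate only if it does not overlap a better-scored one already kept.
        if all(iou(boxes[i], boxes[j]) < overlap for j in kept):
            kept.append(i)
    return sorted(kept)


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.6, 0.9, 0.8]  # hypothetical character-likeness scores
kept = merge_candidates(boxes, scores)
```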
ISBN: 9781538621820 (print)
This paper reviews two noteworthy objectives in computer vision and digital image processing research: text detection and text recognition. Both are popular research fields. The results of the study are useful for visually impaired people, because they may help them choose and buy their preferred products in the market. This study applied a Multiple Phase (MP) method to text detection and text recognition. Text detection passed through several phases: detection of Maximally Stable Extremal Regions (MSER), Canny edge detection, region filtering, and Optical Character Recognition (OCR). OCR was used for the text recognition process. In our experiments the proposed method achieved a performance of 80.88%, compared with only 69% for previous research that used a two-stage classifier.
ISBN: 9781479903566 (print)
This paper presents a new framework for detecting text in webpage and email images. The original image is split into multiple layer images based on maximum gradient difference (MGD) values, so that text with both strong and weak contrast can be detected. Connected component processing and text detection are performed in each layer image. A novel texture descriptor named T-LBP is proposed to further filter out non-text candidates with a trained SVM classifier. The ICDAR 2011 born-digital image dataset is used to evaluate and demonstrate the performance of the proposed method. Under the same performance evaluation criteria, the proposed method outperforms the winning algorithm of the ICDAR 2011 Robust Reading Competition Challenge 1.
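For readers unfamiliar with the MGD, the sketch below uses a common formulation: the difference between the largest and smallest horizontal gradient values inside a short 1-D window centred on each pixel. This definition, the central-difference gradient, and the window length are assumptions here; the abstract does not state how the paper computes MGD or how the layers are thresholded.

```python
def horizontal_gradient(row):
    """Central-difference gradient along one image row; border pixels are set to 0."""
    n = len(row)
    g = [0] * n
    for x in range(1, n - 1):
        g[x] = row[x + 1] - row[x - 1]
    return g


def mgd_row(row, window=5):
    """Maximum gradient difference per pixel: max minus min gradient in a local window."""
    g = horizontal_gradient(row)
    half = window // 2
    out = []
    for x in range(len(row)):
        seg = g[max(0, x - half): x + half + 1]
        out.append(max(seg) - min(seg))
    return out


# A single bright stroke on a dark background yields a strong MGD response around it.
row = [10, 10, 10, 200, 10, 10, 10]
response = mgd_row(row, window=3)
```

Text strokes produce closely spaced positive and negative gradients, so the MGD is large near text and small in flat regions; splitting pixels into layers by MGD range would then separate strong-contrast from weak-contrast text.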
ISBN: 9781728111902 (print)
Focusing on scene text detection, this paper adopts an improved detection algorithm based on the YOLOv3 network. First, for a single detection target, the Darknet53 backbone used by YOLOv3 trains slowly because it has too many layers, so this paper proposes replacing it with Darknet19. Second, the multi-scale detection of the original network is retained, and three anchors of different sizes are used to predict the bounding boxes. The experimental results show that the training speed of the algorithm is greatly improved and that detection is accurate and stable.
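The role of the anchors can be made concrete with the standard YOLOv3 box decoding, in which the network predicts offsets relative to a grid cell and an anchor box. The sketch below follows that published parameterization; the grid size, image size, and the anchor used in the example are illustrative values, not the settings of this paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, anchor, grid_size, image_size):
    """Decode raw outputs (tx, ty, tw, th) into a box centre and size in pixels.

    cell: (cx, cy) grid-cell indices; anchor: (pw, ph) anchor size in pixels.
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = anchor
    stride = image_size / grid_size          # pixels per grid cell
    bx = (sigmoid(tx) + cx) * stride         # centre x stays inside the cell
    by = (sigmoid(ty) + cy) * stride         # centre y stays inside the cell
    bw = pw * math.exp(tw)                   # width scales the anchor width
    bh = ph * math.exp(th)                   # height scales the anchor height
    return bx, by, bw, bh


# Zero offsets place the box at the cell centre with exactly the anchor's size.
box = decode_box((0.0, 0.0, 0.0, 0.0), cell=(6, 6), anchor=(116, 90),
                 grid_size=13, image_size=416)
```

Since text lines are often wide and short, anchors for this task would typically be chosen to match such shapes, for example by clustering the training boxes, though the abstract does not say how the three anchors were selected.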
ISBN: 9781424441990 (print)
A method for detecting text regions in images that combines grayscale decomposition and stroke extraction is proposed. By checking the consistency of the two text features, text-like connected components are grouped together to generate text line regions in the processed image. The method performs well at efficiently detecting image text rendered against relatively complex backgrounds.
ISBN: 9781479962723 (print)
In this paper we implement a two-stage framework to remove unwanted text from images: the first stage detects text in the image, and the second stage removes that text by inpainting. To detect text, text localization and extraction are carried out; inpainting then fills the resulting holes in the image using the surrounding region. Smoothing improves the efficiency of text detection. Text detection is performed using feature extraction, stroke filtering, and centroid processing. Color histogram processing is applied to separate text components more cleanly from other image components. In the last stage, the text holes are filled with appropriate information from the same image using nearest-matching-neighborhood inpainting. To generate visually plausible region-filling results, smoothing is applied to the selected filled patches. We tested the implementation on different images and compared the results with smoothing against the results without smoothing. Experimental results show improved PSNR due to smoothing.
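PSNR, the quality measure reported above, is computed from the mean squared error between a reference image and a processed one. The sketch below uses the standard definition for grayscale images stored as 2-D lists; the 8-bit peak value of 255 is an assumption about the image format.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_error = 0.0
    count = 0
    for row_ref, row_test in zip(reference, test):
        for a, b in zip(row_ref, row_test):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [[100, 100], [100, 100]]
inpainted = [[110, 90], [110, 90]]
score = psnr(original, inpainted)
```

A higher PSNR means the filled image is numerically closer to the reference, which is how the smoothed and unsmoothed inpainting results can be compared.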
ISBN: 9781467376822 (print)
This paper proposes a novel method for automatically detecting text embedded in medical images. Local features specific to text in medical images, such as local edge density, local intensity contrast, and connectivity, are defined and extracted to find candidate text regions. The histograms of oriented gradients (HOG) of all candidate regions are then calculated. Using both the HOG features and the aforementioned local features, an adaptive boosting (AdaBoost) classifier discriminates text from non-text structures. Experimental results show that the proposed method achieves better text detection performance than previous methods. It can preserve the text information and eliminate the obstruction caused by different sources. The detected text can provide additional information in many applications, such as medical image retrieval.
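The AdaBoost decision itself is a weighted vote of weak classifiers. The sketch below shows the standard discrete AdaBoost weight formula and the final sign rule, using threshold stumps on a feature vector. The stump representation, the feature indices, and the example values are hypothetical; the abstract does not describe the weak learners actually used.

```python
import math


def stump_weight(error):
    """AdaBoost weight of a weak classifier with weighted error in (0, 1)."""
    return 0.5 * math.log((1.0 - error) / error)


def strong_classify(features, stumps):
    """Weighted vote of threshold stumps; returns +1 (text) or -1 (non-text).

    stumps: list of (feature_index, threshold, polarity, alpha) tuples.
    """
    total = 0.0
    for index, threshold, polarity, alpha in stumps:
        vote = polarity if features[index] > threshold else -polarity
        total += alpha * vote
    return 1 if total >= 0 else -1


# Hypothetical stumps over two features, e.g. edge density and intensity contrast.
stumps = [(0, 0.5, 1, 1.0), (1, 0.2, 1, 0.4)]
label_a = strong_classify([0.9, 0.1], stumps)
label_b = strong_classify([0.1, 0.9], stumps)
```

In a setting like the one described, the feature vector would concatenate the HOG descriptor with the local features, and boosting would select the most discriminative entries.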
ISBN: 9781510612501 (digital)
ISBN: 9781510612501; 9781510612495 (print)
In recent years, text detection in imagery has become increasingly important because of the large number of applications developed for mobile devices. Text detection becomes complicated when backgrounds are complex or capture conditions are not controlled. This work proposes a method for text detection in natural scenes. The method is based on the phase congruency approach, obtained via the scale-space monogenic signal framework. The proposed method is robust to geometric distortions and to changes in resolution, illumination, and noise degradation. Finally, experimental results on a natural scene dataset are presented.