OCR errors hurt retrieval performance to a great extent. Research has been done on modelling and correction of OCR errors. However, most of the existing systems use language dependent resources or training texts for s...
Text detection in shaky and non-shaky videos is challenging because of variations caused by day and night videos. In addition, moving objects, vehicles, and humans in the video make the text detection problems more ch...
Answering a query such as when a particular document was printed is helpful in practice, especially for forensic purposes. This study attempts to develop a general framework that makes use of image processing and pat...
The ability to learn from a single instance is unique to the human species, and one-shot learning algorithms try to mimic this capability. On the other hand, despite the fantastic performance of Deep Lear...
Most countries use bi-script documents, because each country uses its own national language alongside English as a second or foreign language. Therefore, bi-lingual document with one language being the English an...
ISBN: 9781479952106 (print)
This article deals with the binarization of degraded document images. In the proposed approach, the Canny edge image of the input degraded document image is obtained after blurring it with a Gaussian filter. Next, the gray values of the two input-image pixels to the left and right of each edge pixel are collected into a histogram. This histogram has two distinct peaks, and the lowest valley between them gives the global threshold. Each pixel with a gray value greater than this threshold is labelled as background. A small square window is then considered around each non-background pixel, and simple statistics computed on the gray values within this window decide whether the pixel is labelled background or foreground. This local thresholding stage can efficiently handle various degradations in the document. The binarized image is finally subjected to common post-processing operations. The proposed method has been compared with a few existing binarization techniques.
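The two thresholding stages described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `valley_threshold` and `binarize`, the `min_sep` and `half` parameters, and the tie-breaking rule are all hypothetical. The edge map is assumed to be computed elsewhere (for example by a Canny detector applied after Gaussian blurring), and because the abstract does not say which window statistics are used, the window mean stands in for them here.

```python
def valley_threshold(gray, edges, min_sep=32):
    """Global threshold from the gray values left and right of each edge pixel.

    gray  : 2D list of ints in 0..255
    edges : 2D list of bools, same shape (assumed to come from an edge detector)
    min_sep is a hypothetical minimum distance between the two histogram peaks.
    """
    hist = [0] * 256
    for y, row in enumerate(gray):
        for x in range(1, len(row) - 1):
            if edges[y][x]:
                hist[row[x - 1]] += 1  # left neighbour of the edge pixel
                hist[row[x + 1]] += 1  # right neighbour of the edge pixel

    # First peak: the tallest bin. Second peak: the tallest bin far enough away.
    p1 = max(range(256), key=hist.__getitem__)
    far = [i for i in range(256) if abs(i - p1) >= min_sep]
    p2 = max(far, key=hist.__getitem__)

    # Lowest valley strictly between the peaks; ties resolved to the middle bin.
    lo, hi = sorted((p1, p2))
    between = range(lo + 1, hi)
    floor = min(hist[i] for i in between)
    ties = [i for i in between if hist[i] == floor]
    return ties[len(ties) // 2]


def binarize(gray, edges, half=1):
    """Return a 2D list with 1 for foreground and 0 for background."""
    t = valley_threshold(gray, edges)
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if gray[y][x] > t:
                continue  # global stage: brighter than the threshold is background
            # Local stage: gather the small square window around the pixel.
            win = [gray[j][i]
                   for j in range(max(0, y - half), min(h, y + half + 1))
                   for i in range(max(0, x - half), min(w, x + half + 1))]
            # Stand-in local rule (an assumption): keep the pixel as foreground
            # when it is no brighter than the mean of its window.
            if gray[y][x] <= sum(win) / len(win):
                out[y][x] = 1
    return out
```

On real scans the histogram would likely need smoothing before the peaks and valley are located, and the post-processing step mentioned in the abstract is not covered by this sketch.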
A modified Genetic Algorithm (GA) based search strategy is presented here that is computationally more efficient than the conventional GA. Here the idea is to start a GA with the chromosomes of small length. Such chro...
Achieving a high recognition rate for license plate images is challenging due to multi-type images. We present new symmetry features based on stroke width for classifying each input license image as private, taxi, cursive text, when they expand the symbols by writing and non-text such that an appropriate optical character recognition (OCR) can be chosen for enhancing recognition performance. The proposed method explores gradient vector flow (GVF) for defining symmetry features, namely, GVF opposite direction, stroke width distance, and stroke pixel direction. Stroke pixels in the Canny and Sobel edge images that satisfy the above symmetry features are called local candidate stroke pixels. Common stroke pixels of the local candidate stroke pixels are considered as the global candidate stroke pixels. The spatial distribution of stroke pixels in local and global symmetry is explored by generating a weighted proximity matrix to extract statistical features, namely, mean, standard deviation, median, and standard deviation with respect to the median. The feature matrix is finally fed to a support vector machine (SVM) classifier for classification. Experimental results on large datasets for classification show that the proposed method outperforms the existing methods. The usefulness and effectiveness of the proposed classification is demonstrated by conducting recognition experiments before and after classification.
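The four statistical features named in this abstract could be computed from a proximity matrix along the following lines. This is a minimal sketch under stated assumptions: the function name `proximity_stats` is hypothetical, the matrix is assumed to be already built, population (not sample) deviations are used, and "standard deviation with respect to the median" is interpreted as the root mean squared deviation from the median, which the abstract does not confirm.

```python
import math
import statistics


def proximity_stats(matrix):
    """Return [mean, std, median, std about the median] for a 2D list of numbers."""
    values = [v for row in matrix for v in row]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    med = statistics.median(values)
    # Assumed definition: root mean squared deviation from the median.
    std_med = math.sqrt(sum((v - med) ** 2 for v in values) / len(values))
    return [mean, std, med, std_med]
```

A feature vector of this kind, one per input image, is the sort of input an SVM classifier would then be trained on.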
Keyword spotting in video document images is challenging due to low resolution and complex background of video images. We propose the combination of Texture-Spatial-Features (TSF) for keyword spotting in video images ...