Text detection in shaky and non-shaky videos is challenging because of variations caused by day and night videos. In addition, moving objects, vehicles, and humans in the video make the text detection problems more ch...
Answering a query such as when a particular document was printed is quite helpful in practice, especially for forensic purposes. This study attempts to develop a general framework that makes use of image processing and pat...
The ability to learn from a single instance is unique to the human species, and one-shot learning algorithms try to mimic this special capability. On the other hand, despite the fantastic performance of Deep Lear...
Most countries use bi-script documents, because each country uses its own national language alongside English as a second or foreign language. Therefore, bi-lingual document with one language being the English an...
ISBN: 9781479952106 (print)
This article deals with the binarization of degraded document images. In the proposed approach, the Canny edge image of the input degraded document image is obtained after blurring it with a Gaussian filter. Next, the gray values of the two pixels of the input image to the left and right of each edge pixel are collected into a histogram. This histogram has two distinct peaks, and the lowest valley between them provides the global threshold value. Each pixel with a gray value greater than this threshold is set as a background pixel. A small square window is then considered around each non-background pixel, and certain simple statistics are computed on the gray values of the pixels in this window, based on which the pixel is assigned to either the background or the foreground. Such a local thresholding method at the latter stage can efficiently handle various degradations in the document. The binarized image so obtained is finally subjected to certain common post-processing operations. The proposed method has been compared with a few existing binarization techniques.
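The global-threshold step described in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the edge mask is assumed to have been computed elsewhere (for example by a Canny detector applied to the blurred image), and the function names, the moving-average smoothing, and the peak-picking heuristic are all assumptions introduced here. The local window stage is not covered, since the abstract does not specify its statistics.

```python
def edge_neighbour_histogram(gray, edges):
    """Histogram of gray values found to the left and right of each edge pixel.

    gray  : list of rows of ints in 0..255
    edges : same-shaped list of rows of booleans (assumed precomputed edge mask)
    """
    hist = [0] * 256
    for y, row in enumerate(edges):
        width = len(row)
        for x, is_edge in enumerate(row):
            if not is_edge:
                continue
            for nx in (x - 1, x + 1):  # left and right neighbours only
                if 0 <= nx < width:
                    hist[gray[y][nx]] += 1
    return hist


def smooth(hist, radius=2):
    """Moving average; the smoothing itself is an assumption of this sketch."""
    out = []
    for g in range(len(hist)):
        lo, hi = max(0, g - radius), min(len(hist), g + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out


def valley_threshold(hist, radius=2):
    """Gray level of the lowest valley between the two dominant peaks."""
    sm = smooth(hist, radius)
    levels = range(len(sm))
    # First peak: the tallest bin. Second peak: tall and far from the first
    # (a common heuristic, assumed here rather than taken from the paper).
    p1 = max(levels, key=lambda g: sm[g])
    p2 = max(levels, key=lambda g: sm[g] * (g - p1) ** 2)
    lo, hi = sorted((p1, p2))
    # Ties resolve to the lowest gray level in the valley.
    return min(range(lo, hi + 1), key=lambda g: sm[g])


def global_background_mask(gray, threshold):
    """True where a pixel is brighter than the threshold, i.e. background."""
    return [[value > threshold for value in row] for row in gray]
```

For example, with dark strokes on a light page, `valley_threshold(edge_neighbour_histogram(gray, edges))` should return a gray level lying between the stroke and background intensities, and `global_background_mask` then marks the pixels that the first stage discards.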
A modified Genetic Algorithm (GA) based search strategy is presented here that is computationally more efficient than the conventional GA. Here the idea is to start a GA with the chromosomes of small length. Such chro...
Achieving a high recognition rate for license plate images is challenging due to multi-type images. We present new symmetry features based on stroke width for classifying each input license image as private, taxi, cursive text, when they expand the symbols by writing and non-text such that an appropriate optical character recognition (OCR) can be chosen for enhancing recognition performance. The proposed method explores gradient vector flow (GVF) for defining symmetry features, namely, GVF opposite direction, stroke width distance, and stroke pixel direction. Stroke pixels in the Canny and Sobel edge images which satisfy the above symmetry features are called local candidate stroke pixels. Common stroke pixels of the local candidate stroke pixels are considered as the global candidate stroke pixels. The spatial distribution of stroke pixels in local and global symmetry is explored by generating a weighted proximity matrix to extract statistical features, namely, mean, standard deviation, median, and standard deviation with respect to the median. The feature matrix is finally fed to a support vector machine (SVM) classifier for classification. Experimental results on large datasets for classification show that the proposed method outperforms the existing methods. The usefulness and effectiveness of the proposed classification are demonstrated by conducting recognition experiments before and after classification.
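The four statistical features named in this abstract can be illustrated with a short sketch. It assumes that "standard deviation with respect to the median" means the root-mean-square deviation from the median, and that the weighted proximity matrix is already available as a list of rows; both points, along with the function name, are assumptions made here and are not confirmed by the abstract.

```python
import math
import statistics


def proximity_features(matrix):
    """Return (mean, std about mean, median, std about median) of all entries.

    matrix : list of rows of numbers (a hypothetical weighted proximity matrix)
    """
    values = [v for row in matrix for v in row]
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    # Population standard deviation about the mean.
    std_mean = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Same formula with the median as the centre (assumed interpretation).
    std_median = math.sqrt(sum((v - median) ** 2 for v in values) / n)
    return mean, std_mean, median, std_median
```

Because the mean minimises the squared deviation, the deviation about the median is never smaller than the one about the mean; the gap between the two grows with the skew of the matrix entries, which may be why both are used as features.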
Keyword spotting in video document images is challenging due to low resolution and complex background of video images. We propose the combination of Texture-Spatial-Features (TSF) for keyword spotting in video images ...
ISBN: 9781479918065 (print)
There are many scripts in the world, several of which are used by hundreds of millions of people. Handwritten character recognition studies of several of these scripts are found in the literature. Different hand-crafted feature sets have been used in these recognition studies. However, the convolutional neural network (CNN) has recently been used as an efficient unsupervised feature vector extractor. Although such a network can be used as a unified framework for both feature extraction and classification, it is more efficient as a feature extractor than as a classifier. In the present study, we performed a certain amount of training of a 5-layer CNN for a moderately large class character recognition problem. We then used this CNN, trained for a larger class recognition problem, for feature extraction from samples of several smaller class recognition problems. In each case, a distinct Support Vector Machine (SVM) was used as the corresponding classifier. In particular, the CNN of the present study is trained using samples of a standard 50-class Bangla basic character database, and features have been extracted for 5 different 10-class numeral recognition problems of English, Devanagari, Bangla, Telugu and Oriya, each of which is an official Indian script. Recognition accuracies are comparable with the state-of-the-art.