In this article, we present a robust scheme for detection of Devanagari or Bangla texts in scene images. These are the two most popular scripts in India. The proposed scheme is primarily based on two major characteris...
The goal of this project is to build a Contextual Suggestion system that will recommend to the user a ranked list of suggestions depending on the user's context as well as her preferences. In this context we present an alg...
In this paper we introduce a stroke based lexicon reduction technique in order to reduce the search space for recognition of handwritten words. The principle of this technique involves mainly two aspects of a word ima...
Automatic identification of an individual based on his/her handwriting characteristics is an important forensic tool. In a computational forensic scenario, the presence of a large amount of text in a questioned document cannot be ensured, and this lack of data threatens system reliability. We propose a writer identification system for Oriya script that performs reasonably well even with a small amount of text. Experiments with a curvature feature are reported, using a Support Vector Machine (SVM) as the classifier. We obtained promising results: 94.00% writer identification accuracy for the top choice and 99% when the top three choices are considered.
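The two accuracy figures above are top-1 and top-3 identification rates. As a minimal sketch of how such rates are commonly computed (not the authors' evaluation code), the function below assumes each test sample comes with a list of candidate writers ordered from most to least likely. The name `top_k_accuracy` and the writer labels are hypothetical.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Return the fraction of samples whose true writer is among the first k candidates.

    ranked_predictions: one list per sample, candidates ordered best first
                        (hypothetical input format, assumed for illustration).
    true_labels:        the correct writer label for each sample.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy data: four samples, all written by "w1", with different candidate rankings.
ranked = [
    ["w1", "w2", "w3"],
    ["w2", "w1", "w3"],
    ["w3", "w2", "w1"],
    ["w2", "w3", "w1"],
]
truth = ["w1", "w1", "w1", "w1"]

top1 = top_k_accuracy(ranked, truth, 1)
top3 = top_k_accuracy(ranked, truth, 3)
```

With an SVM, the ranked list would typically come from sorting per-class decision scores, which is an assumption about the setup rather than something the abstract states.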
In recent years, researchers have proposed many techniques for the recognition of Persian/Arabic handwritten documents. To test the promise of different feature extraction and classification methods and to provide a new benchmark for future research, this paper presents a comparative study of Persian/Arabic handwritten character recognition using different feature sets and classifiers. The feature sets used in this study are computed from gradient, directional chain code, shadow, under-sampled bitmap, intersection/junction/endpoint, and line-fitting information. Support Vector Machines (SVMs), Nearest Neighbour (NN), and k-Nearest Neighbour (k-NN) are used as classifiers. We evaluated the proposed systems on a standard dataset of Persian handwritten characters. Using 36682 samples for training, we tested the proposed recognition systems on another 15338 samples, and detailed results are reported. The best recognition rate obtained in this comparative study is 96.91%.
ISBN: (print) 9781467322164
Font can be used as a notion of similarity amongst multiple documents written in the same script. Document images with a specific font could then be retrieved automatically from a huge digital document repository. Optical font recognition could therefore be a useful pre-processing step in an automated questioned document analysis system for sorting documents with similar fonts. We propose a scheme to identify 10 different fonts for an Indic script (Bangla). Curvature-based features are extracted from segmented characters and fed to a Support Vector Machine (SVM) classifier. The classifier determines the font type for each segmented character obtained from a document. The font of the document is then identified by majority voting amongst the 10 fonts over all characters. Using a Multiple Kernel SVM classifier, we obtained 98.5% accuracy on 400 test documents (40 documents for each font type).
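The document-level decision described above, majority voting over per-character font predictions, can be sketched as follows. This is an illustration only: the function name `identify_document_font` and the font labels are hypothetical, and the tie-breaking rule (the label seen first wins) is an assumption, since the abstract does not say how ties are handled.

```python
from collections import Counter


def identify_document_font(character_predictions):
    """Pick the document font by majority vote over per-character predictions.

    character_predictions: font labels, one per segmented character
                           (assumed to come from a per-character classifier).
    Returns the winning label and its share of the votes.
    """
    if not character_predictions:
        raise ValueError("at least one character prediction is required")
    counts = Counter(character_predictions)
    # most_common orders equal counts by first occurrence, so ties go to
    # the label encountered first (an assumed tie-breaking rule).
    font, votes = counts.most_common(1)[0]
    return font, votes / len(character_predictions)


# Toy example: five characters, three of them classified as "font_a".
predictions = ["font_a", "font_b", "font_a", "font_c", "font_a"]
font, share = identify_document_font(predictions)
```

Voting makes the document-level result tolerant of individual character errors: the document is labelled correctly as long as the correct font receives more votes than any other.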
Misclassification of similar shaped characters largely affects OCR accuracy. Sometimes occlusion or insertion of a part of a character (due to inferior scanning quality) also makes it look like another character type. In such adverse situations a part-based character recognition system could be more effective. To counter this adverse scenario we propose a new feature encoding technique, based on the amalgamation of Gabor filter-based features with SURF features (G-SURF). Features generated from a character are provided to a Support Vector Machine (SVM) classifier. We obtained an encouraging accuracy on similar shaped characters from three different scripts.
Image re-ranking aims at improving the precision of keyword-based image retrieval, mainly by introducing visual features to re-rank. Many existing approaches require offline training for every keyword, which are unsui...
In this article, we present a novel set of features for detection of text in images of natural scenes using a multi-layer perceptron (MLP) classifier. One of our features is an estimate of the uniformity in stroke thickness, which we obtain using only a subset of the distance transform values of the concerned region. Estimating the uniformity in stroke thickness from a sparse sampling of the distance transform values is a novel approach. Another feature is the distance between the foreground and background colors computed in a perceptually uniform and illumination-invariant color space. The remaining features include two ratios of anti-parallel edge gradient orientations, a regularity measure between the skeletal representation and the Canny edge map of the object, average edge gradient magnitude, variation in the foreground gray levels, and five others. We present the results of the proposed approach on the ICDAR 2003 database and on another database of scene images consisting of text in Indian scripts.
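To make the stroke-thickness idea concrete, here is a minimal sketch under stated assumptions; it is not the authors' feature. It computes a city-block distance transform of a binary region, samples only the local maxima of that transform (an assumed choice of "subset"), and reports the coefficient of variation of the samples, where a lower value suggests more uniform stroke thickness. All function names are hypothetical.

```python
from statistics import mean, pstdev


def city_block_distance_transform(mask):
    """Two-pass city-block distance from each foreground pixel (1) to the
    nearest background pixel (0). Pixels outside the image are not treated
    as background, so the mask should contain some background."""
    rows, cols = len(mask), len(mask[0])
    inf = rows + cols  # larger than any possible city-block distance
    dist = [[0 if mask[r][c] == 0 else inf for c in range(cols)]
            for r in range(rows)]
    # Forward pass: propagate distances from the top and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if dist[r][c]:
                if r > 0:
                    dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
                if c > 0:
                    dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            if dist[r][c]:
                if r < rows - 1:
                    dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
                if c < cols - 1:
                    dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


def stroke_thickness_variation(mask):
    """Coefficient of variation of the distance transform sampled at its
    local maxima (4-neighbourhood). The sampling rule is an assumption."""
    dist = city_block_distance_transform(mask)
    rows, cols = len(dist), len(dist[0])
    samples = []
    for r in range(rows):
        for c in range(cols):
            d = dist[r][c]
            if d == 0:
                continue
            neighbours = [dist[rr][cc]
                          for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                          if 0 <= rr < rows and 0 <= cc < cols]
            if all(d >= n for n in neighbours):
                samples.append(d)
    if not samples:
        raise ValueError("mask contains no foreground pixels")
    return pstdev(samples) / mean(samples)


def make_mask(rows, cols, boxes):
    """Build a binary mask with foreground inside (r0, r1, c0, c1) boxes, inclusive."""
    mask = [[0] * cols for _ in range(rows)]
    for r0, r1, c0, c1 in boxes:
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                mask[r][c] = 1
    return mask


# A bar of constant thickness versus a shape with a thin and a thick part.
uniform_bar = make_mask(7, 9, [(2, 4, 1, 7)])
mixed_shape = make_mask(9, 12, [(4, 4, 1, 5), (2, 6, 6, 10)])
```

On these toy masks, `stroke_thickness_variation(uniform_bar)` comes out lower than `stroke_thickness_variation(mixed_shape)`, which is the behaviour one would want from a uniformity cue for text strokes.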