In document image analysis, and especially in handwritten document image recognition, standard datasets play a vital role in evaluating the performance of algorithms and comparing results obtained by different research groups. In this paper, an unconstrained Persian handwritten text dataset (PHTD) is introduced. The PHTD contains 140 handwritten documents in three different categories written by 40 individuals. The dataset contains 1787 text-lines and 27073 words/subwords in total. Most PHTD documents contain overlapping or touching text-lines. The average number of text-lines per document is 13. Two types of ground truth, based on pixel information and content information, are generated for the dataset. With these two types of ground truth, the PHTD can be used in many areas of document image processing, such as sentence recognition/understanding, text-line segmentation, word segmentation, word recognition, and character segmentation. To provide a framework for other researchers, recent text-line segmentation results on this dataset are also reported.
This paper proposes a new symmetry property, based on the proximity of median moments in the wavelet domain, for classifying video frames as text or non-text. The method divides a given frame into 16 equally sized blocks. The average of the high-frequency subbands of a block is used to compute median moments, which brighten the text pixels in that block. K-means clustering with K=2 is then applied to the median moments of the block to classify it as a probable text block. For the classified blocks, average wavelet median moments are computed over a sliding window. We introduce Max-Min clustering to classify the probable text pixels in each probable text block. Four quadrants are formed around the centroid of the probable text pixels. A new concept called symmetry is introduced to identify a true text block based on the proximity between probable text pixels in each quadrant. If the frame produces at least one true text block, it is considered a text frame; otherwise it is considered a non-text frame. The method is tested on three datasets to evaluate its robustness in classifying text frames in terms of recall and precision.
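As a rough illustration of the K=2 clustering step, the Python sketch below splits 16 per-block scores into a low and a high cluster and flags the high cluster as probable text. It assumes one scalar score per block and clusters across blocks, which is only one possible reading of the abstract. The function name two_means and the sample scores are hypothetical, and the wavelet median-moment computation itself is not shown.

```python
from typing import List, Sequence


def two_means(values: Sequence[float], iterations: int = 100) -> List[bool]:
    """1-D K-means with K=2; returns True for members of the higher cluster."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All scores identical: nothing stands out as probable text.
        return [False] * len(values)
    labels: List[bool] = []
    for _ in range(max(1, iterations)):
        # Assignment step: each value joins the nearer of the two centers.
        labels = [abs(v - hi) < abs(v - lo) for v in values]
        high = [v for v, is_high in zip(values, labels) if is_high]
        low = [v for v, is_high in zip(values, labels) if not is_high]
        # Update step: centers move to the mean of their members.
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels


# Hypothetical scores for the 16 blocks of one frame (assumed values).
block_scores = [0.1, 0.2, 0.15, 0.1, 0.9, 1.1, 1.0, 0.2,
                0.1, 0.95, 1.05, 0.15, 0.1, 0.2, 0.1, 0.15]
probable_text_blocks = [i for i, flag in enumerate(two_means(block_scores)) if flag]
print(probable_text_blocks)
```

Seeding the two centers with the minimum and maximum score keeps both clusters non-empty in the one-dimensional case, so no random restarts are needed for this small example.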
Segmentation of a text document into lines, words and characters, which is considered a crucial preprocessing stage in Optical Character Recognition (OCR), is traditionally carried out on uncompressed documents, although most real-life documents are available in compressed form for reasons such as transmission and storage efficiency. This implies that the compressed image must first be decompressed, which requires additional computing resources. This limitation has motivated us to take up research in document image analysis using compressed documents. In this paper, we propose a new way to carry out segmentation at the line, word and character level in run-length compressed printed text documents. We extract the horizontal projection profile curve from the compressed file and perform line segmentation using its local minima points. However, tracing the vertical information needed to track words and characters in a run-length compressed file is not straightforward. Therefore, we propose a novel technique for carrying out simultaneous word and character segmentation by popping out column runs from each row in an intelligent sequence. The proposed algorithms have been validated with 1101 text-lines, 1409 words and 7582 characters from a dataset of 35 noise-free and skew-free compressed documents in Bengali, Kannada and English scripts.
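The line-segmentation idea can be sketched in Python as follows. The sketch assumes each image row is stored as a list of run lengths that alternate white, black, white, and so on, starting with a white run (possibly of length 0). That layout, the function names and the toy data are assumptions, not the paper's file format. Lines are split wherever the profile drops to the threshold, which is the simplest case of a local-minimum rule and fits noise-free printed text.

```python
from typing import List, Tuple


def horizontal_profile(rle_rows: List[List[int]]) -> List[int]:
    """Black-pixel count per row, read directly from the run lengths.

    Runs at odd positions are black under the assumed layout, so the
    row never has to be expanded back into pixels.
    """
    return [sum(row[1::2]) for row in rle_rows]


def segment_lines(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each band whose profile exceeds threshold."""
    lines: List[Tuple[int, int]] = []
    start = None
    for y, value in enumerate(profile):
        if value > threshold and start is None:
            start = y                      # entering a text-line
        elif value <= threshold and start is not None:
            lines.append((start, y - 1))   # leaving a text-line
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines


# Toy 10-pixel-wide document with two text-lines (assumed data).
rows = [[10], [2, 3, 5], [1, 6, 3], [10], [10], [4, 2, 4], [0, 10], [10]]
profile = horizontal_profile(rows)
print(profile)
print(segment_lines(profile))
```

For documents with touching or noisy lines, the threshold test would need to be replaced by a search for true local minima of a smoothed profile.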
Because writing styles differ between individuals, some text-lines may be curved in shape. To recognize such text-lines, they must first be properly aligned. In this paper, we propose a text-line alignment technique based on a painting algorithm. First, the Piece-wise Painting Algorithm (PPA) is used to obtain a number of black and white rectangular patches along the text-line. After identifying the degree of oscillation of the input text-line, candidate pixels are obtained based on the horizontal projection and the center points of the black patches. Using the degree of oscillation of the input text image and the candidate pixels, a curve or straight line is fitted to trace the baseline. Subsequently, all components of the text-line are deskewed, based on the characteristics of the fitted curve or line, to align them with respect to an imaginary horizontal baseline. The proposed algorithm was evaluated on 128 Persian handwritten text-lines containing 4317 sub-words. Experimental analysis showed that 92.31% of the sub-words were accurately aligned. Further, the proposed algorithm was tested on another Persian handwritten text-line dataset [6] and remarkable results were achieved.
ISBN: 9781424445004 (print)
In this paper, we propose two types of feature sets based on modified chain-code direction frequencies in the contour pixels of the input image and on modified transition features (horizontal and vertical). A multi-level support vector machine (SVM) is proposed as the classifier to recognize isolated Persian digits. In the first level, we combine similarly shaped numerals into a single group and, as a result, obtain 7 classes instead of 10. We compute 196-dimensional chain-code direction frequencies as features to discriminate between the 7 classes. In the second level, classes that contain more than one numeral because of the high resemblance of their shapes are considered. We use modified transition features (horizontal and vertical) to discriminate between the two overlapping classes (0 and 1). To separate the other overlapping group, which contains the three numerals 2, 3 and 4, we first eliminate the common part of these digits (the tail) and then compute chain-code features. We employ an SVM classifier for the classification and evaluate our scheme on 80,000 handwritten samples of Persian numerals [10]. Using 60,000 samples for training, we tested our scheme on the other 20,000 samples and obtained 99.02% accuracy.
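To make the idea of transition features concrete, the Python sketch below counts background-to-foreground transitions along every row and every column of a binary image. This is the plain, commonly used form of the feature and not the paper's modified variant, whose details the abstract does not give. The function names and the toy glyph are assumptions.

```python
from typing import List, Sequence, Tuple


def count_transitions(line: Sequence[int]) -> int:
    """Number of 0 -> 1 changes along one scan line.

    A virtual background pixel is prepended so that a stroke touching
    the image border is counted as well.
    """
    padded = [0] + list(line)
    return sum(1 for a, b in zip(padded, padded[1:]) if a == 0 and b == 1)


def transition_features(image: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Return (per-row counts, per-column counts) for a binary image."""
    horizontal = [count_transitions(row) for row in image]
    vertical = [count_transitions(col) for col in zip(*image)]
    return horizontal, vertical


# Toy 3x5 glyph with a hole in the middle (assumed data, 1 = ink).
glyph = [
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
]
print(transition_features(glyph))
```

A filled blob and a ring of the same size give different counts on the scan lines that cross the hole, which is the kind of difference such features are meant to capture.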
Only a few studies have been undertaken on developing automatic assessment systems using handwriting recognition, even though a successful system would undoubtedly benefit the education system, as schools and universities in many countries still employ paper-based examinations. To the best of the authors' knowledge, there is no existing work on an automatic off-line short answer assessment system that includes a student identification component. Hence, in this paper, the authors propose such a system, in which a new feature extraction technique called the Enhanced Water Reservoir, Loop and Gaussian Grid Feature, as well as other enhanced feature extraction techniques, were utilised. Artificial Neural Networks and Support Vector Machines were employed as the classifiers, and the recognition and accuracy rates of the proposed systems, as well as of the feature extraction techniques, were compared. The proposed assessment system achieved a recognition rate of 87.12% with 91.12% assessment accuracy, and the student identification component obtained a recognition rate of 99.52% with a 100% identification accuracy rate.
ISBN: 9781467361262 (print)
This paper describes a linguistic approach to the development of a Bengali Noun Morphological Analyzer. It is first implemented on a semi-manually created database of 87697 inflected word-list tokens, i.e. Input2 for Linguistic Resource Creation, comprising Noun, Pronoun and Adjective roots with and without their suffixes. After this first implementation, the developed Linguistic Resource knowledge is applied to an unknown Bengali corpus database containing 6157 tokens. At the initial stage of this research, a linguistic analysis is carried out, which leads to the framing of the nominal suffix list that is later used in nominal suffix extraction. This linguistic knowledge is used to develop the finite-state transducer (FST) grammar for the Linguistic Resource, which in turn leads to the Bengali Noun Morphological Analyzer. The final accuracy obtained is around 44%. This accuracy can be improved over time by continuing to add nominal roots to the FST grammar file.
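The interplay between a nominal suffix list and a root lexicon can be illustrated with the short Python sketch below. It uses plain longest-suffix matching against a root set rather than a finite-state transducer, and the transliterated roots and suffixes are a small illustrative sample chosen for this example, not the paper's resource. The function name analyze is hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

# Illustrative, romanized samples only (assumed, not taken from the paper).
ROOTS: Set[str] = {"chhele", "boi", "manush"}
SUFFIXES = ["gulo", "der", "ra", "ke", "ti", "ta", "er", "te", "r"]


def analyze(word: str,
            roots: Set[str] = ROOTS,
            suffixes: Iterable[str] = SUFFIXES) -> Optional[Tuple[str, str]]:
    """Split word into (root, suffix) if the root is known, else return None."""
    if word in roots:
        return word, ""                      # bare root, no suffix
    # Try longer suffixes first so that "der" is preferred over "r".
    for suffix in sorted(suffixes, key=len, reverse=True):
        if suffix and word.endswith(suffix):
            stem = word[:-len(suffix)]
            if stem in roots:
                return stem, suffix
    return None                              # unknown root: analysis fails


for token in ["chhelera", "chheleder", "boigulo", "nodite"]:
    print(token, analyze(token))
```

The last token fails only because its root is missing from the sample lexicon, which mirrors the abstract's point that coverage grows as more nominal roots are added.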
ISBN: 9781509027491 (print)
Fine-grained Visual Categorization (FGVC) is an open problem in computer vision due to the subtle differences between categories. The present paper demonstrates that Collaborative Representation based Classification (CRC) can address this problem successfully. Instead of the traditional discriminative approach to classification, CRC takes a co-operative approach by representing the query image as a weighted collaboration of training images across all classes in the feature space. The superior performance of CRC compared to some other modern classifiers, including SVM, is shown in this work using several popular descriptors such as GIST+Color, SIFT and CNN features, with species recognition chosen as the representative FGVC problem. Besides experiments on the Oxford 102 Flowers and CUB200-2011 Bird benchmarks, the present work also introduces a new challenging dataset, NZ Birds v1.0, with 600 images of 30 New Zealand endemic and native bird species.
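The collaborative idea can be sketched in a few lines of dependency-free Python. The sketch follows the commonly cited regularized least-squares formulation of CRC: the query is coded over all training samples with an L2 penalty, and the class whose samples give the smallest residual relative to the size of their coefficients wins. The regularization value, function names and toy feature vectors are assumptions, and the paper's exact variant may differ.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(u, v))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def crc_classify(train: List[List[float]], labels: List[str],
                 query: List[float], lam: float = 0.01) -> str:
    """Return the label chosen by collaborative representation."""
    n = len(train)
    # Ridge-regularized normal equations: (X^T X + lam * I) alpha = X^T y.
    gram = [[dot(train[i], train[j]) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, [dot(sample, query) for sample in train])
    best_label, best_score = labels[0], float("inf")
    for label in sorted(set(labels)):
        members = [i for i, l in enumerate(labels) if l == label]
        # Reconstruct the query using only this class's samples and weights.
        recon = [sum(alpha[i] * train[i][d] for i in members)
                 for d in range(len(query))]
        residual = sum((q - r) ** 2 for q, r in zip(query, recon)) ** 0.5
        weight = sum(alpha[i] ** 2 for i in members) ** 0.5
        score = residual / weight if weight > 0 else float("inf")
        if score < best_score:
            best_label, best_score = label, score
    return best_label


# Toy 2-D features for two classes (assumed data).
train = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["a", "a", "b", "b"]
print(crc_classify(train, labels, [1.0, 0.05]))
```

With real descriptors the linear system would normally be handled by a numerical library, and the part of the solve that does not depend on the query can be computed once and reused for all queries.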
ISBN: 9781479919611 (print)
Off-line automatic assessment systems can aid teachers in the marking process. There has been no recent work on developing off-line automatic assessment systems using handwriting recognition, even though such systems would clearly benefit the education sector, because many schools and universities in many parts of the world still use paper-based examinations. This research proposes the use of a newly developed feature extraction technique called the Modified Water Reservoir, Loop and Gaussian Grid Feature, as well as other feature extraction techniques. These techniques were investigated using artificial neural networks and support vector machines as classifiers to develop an automatic assessment system for marking short answer questions. The system achieves high assessment accuracy (up to 94.75% for hand-printed, 96.09% for cursive handwritten, and 95.71% for combined hand-printed and cursive handwritten answers). The proposed system also includes assessment criteria to augment its accuracy.
A trend towards capturing or filming images using cellphone and sharing images on social media is a part and parcel of day to day activities of humans. When an image is forwarded several times in social media it may b...