This paper proposes an adaptive method for separating foreground from background in low-quality color document images. Connected component labelling is first applied to capture spatially connected pixels of similar color. Next, the dominant background components are determined and used to divide the entire image into a number of grids, each representing local uniformity in illumination, background, etc. Finally, the foreground parts are located using the local information around them. Several color images of old historical documents, including manuscripts of high importance, are used in the experiment. Apart from a qualitative evaluation, the results are quantitatively compared with one popular foreground/background separation technique.
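The first step described above, labelling spatially connected pixels of similar color, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 4-connectivity, the per-channel tolerance `tol`, and the function names `label_components` and `largest_component` are all assumptions made for the example.

```python
from collections import Counter, deque


def label_components(image, tol=30):
    """Label 4-connected regions of similar color.

    `image` is a list of rows of (R, G, B) tuples. Two neighbouring pixels
    join the same component when every channel differs by at most `tol`
    (an assumed similarity rule, not taken from the paper).
    """
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]  # 0 means "not yet labelled"
    count = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx]:
                continue
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and not labels[ny][nx]:
                        similar = all(
                            abs(a - b) <= tol
                            for a, b in zip(image[y][x], image[ny][nx])
                        )
                        if similar:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def largest_component(labels):
    """Return the label covering the most pixels (a stand-in for 'dominant background')."""
    sizes = Counter(label for row in labels for label in row)
    return sizes.most_common(1)[0][0]


# Toy 3x3 "document": paper-colored pixels surrounding one ink pixel.
PAPER, INK = (230, 220, 200), (40, 30, 30)
image = [[PAPER, PAPER, PAPER], [PAPER, INK, PAPER], [PAPER, PAPER, PAPER]]
labels, count = label_components(image)
background = largest_component(labels)
```

In this toy case the eight paper pixels form one component and the ink pixel forms another; picking the largest component is only one plausible way to identify a dominant background region.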
ISBN (print): 9781424445004
Under the three-language formula, the destination address block of a postal document in an Indian state is generally written in three languages: English, Hindi and the state's official language. Because these scripts are intermixed in postal addresses, it is very difficult to identify the script in which a pin-code is written. Also, depending on individual writing style, some digits in a pin-code string may touch their neighboring digits, and accurately segmenting such touching components into individual digits is difficult. To avoid these difficulties, in this paper we propose a tri-lingual (English, Hindi and Bangla) 6-digit full pin-code string recognition approach. The proposed system achieved 99.01% reliability with error and rejection rates of 0.83% and 15.27%, respectively.
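To make the relationship between reliability, error and rejection concrete, the sketch below uses a definition that is common in recognition work: reliability is the share of correct results among the accepted (non-rejected) samples. The abstract does not state its formula, so this definition and the function name `reliability` are assumptions.

```python
def reliability(error_rate, rejection_rate):
    """Reliability in percent, assuming all rates are percentages of the full test set.

    Assumed definition: correct / (correct + error), where
    correct = 100 - rejection - error.
    """
    accepted = 100.0 - rejection_rate   # samples the system did not reject
    correct = accepted - error_rate     # accepted samples that were recognised correctly
    return 100.0 * correct / accepted


value = reliability(error_rate=0.83, rejection_rate=15.27)
```

Under this assumed definition the reported rates give roughly 99.02%, close to the 99.01% stated above; the small gap may come from rounding or from a slightly different definition in the paper.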
In this paper we introduce a stroke-based lexicon reduction technique that reduces the search space for recognition of handwritten words. The technique builds a feature vector from two aspects of a word image: the word length and the word shape. The length is represented by the number of specific vertical strokes present in the word image, while the shape is captured by the combination of horizontal and vertical strokes. The experiment has been carried out with a database of 35,700 off-line handwritten Bangla word images. Although the proposed lexicon reduction technique was developed for recognition of Bangla handwritten words, its generality means it can readily be applied to handwriting in other scripts.
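The idea of pruning a lexicon with stroke-based features can be sketched as below. This is only an illustration under stated assumptions: each word is reduced to a hypothetical pair (vertical stroke count, horizontal stroke count), the counts are assumed to be extracted elsewhere, and the tolerances and the name `reduce_lexicon` are invented for the example rather than taken from the paper.

```python
def reduce_lexicon(query, lexicon, length_tol=1, shape_tol=1):
    """Keep lexicon entries whose stroke features are close to the query's.

    `query` is (vertical_strokes, horizontal_strokes) measured on the word image;
    `lexicon` maps each candidate word to its expected feature pair.
    """
    q_vertical, q_horizontal = query
    return [
        word
        for word, (vertical, horizontal) in lexicon.items()
        if abs(vertical - q_vertical) <= length_tol
        and abs(horizontal - q_horizontal) <= shape_tol
    ]


# Hypothetical lexicon with placeholder entries and made-up stroke counts.
lexicon = {
    "word_a": (3, 1),
    "word_b": (4, 2),
    "word_c": (9, 4),
    "word_d": (2, 5),
}
candidates = reduce_lexicon(query=(3, 2), lexicon=lexicon)
```

Only the surviving candidates would then be passed to the full word recognizer, which is where the search-space saving comes from.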
Social media has become an essential means for people to reflect their day-to-day activities, including emotions, feelings, threats and so on. This paper presents a new method for the automatic classification of beha...
The relevance of machine learning (ML) in our daily lives is closely intertwined with its explainability. Explainability can allow end-users to have a transparent and humane reckoning of a ML scheme's capability a...
ISBN (digital): 9781728199665
ISBN (print): 9781728199672
This paper describes the results of the competition on Short answer ASsessment and Thai student SIGnature and Name COMponents recognition and Verification (SASIGCOM 2020), held in conjunction with the 17th International Conference on Frontiers in Handwriting Recognition (ICFHR 2020). The competition aimed to automate the evaluation of short-answer examinations, to record developments in this area, and to draw attention to such systems. The competition contains three elements: short answer assessment (recognizing and marking the answers to short-answer questions taken from examination papers), student name components (first and last names), and signature verification and recognition. Signature and name component data were collected from 100 volunteers. The Thai signature dataset contains 30 genuine signatures, 12 skilled forgeries and 12 simple forgeries for each writer. The Thai name components dataset contains 30 genuine and 12 skilfully forged name components for each writer. The short answer assessment dataset contains 104 exam papers, 52 written in cursive handwriting and the remaining 52 in printed handwriting. Each exam paper contains ten questions, and the answers were designed to be a few words per question. Three teams from distinguished labs submitted their systems. For short answer assessment, a word spotting task was also performed. This paper analyses the results produced by the submitted algorithms using a performance measure and defines a way forward for this subject of research. Both datasets, along with some of the accompanying ground truth/baseline masks, will be made freely available for research purposes via TC10/TC11.
Document image enhancement is a fundamental and important stage for attaining the best performance in any document analysis assignment because there are many degradation situations that could harm document images, mak...
Different systems for handwriting recognition use different features to represent the input text. Even after decades of research, no consensus on best practice exists, and many features are carefully hand-crafted. To facilitate the design phase for on-line handwriting systems, in this paper we propose an unsupervised feature generation approach based on dissimilarity space embedding (DSE) of local neighborhoods around the points along the trajectory. DSE offers a highly discriminative representation and is hence beneficial for classification. We compare the approach with a state-of-the-art feature extraction method and demonstrate its superiority.
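As a rough illustration of dissimilarity space embedding, the sketch below describes each trajectory point by the distances between its local neighborhood and a fixed set of prototype neighborhoods. The window radius, the point-wise Euclidean dissimilarity, the hand-picked prototypes and all function names are assumptions for the example; the abstract does not specify how the paper chooses any of them.

```python
import math


def neighborhoods(trajectory, radius=1):
    """Cut a window of 2*radius+1 points around each interior point, centered on that point."""
    windows = []
    for i in range(radius, len(trajectory) - radius):
        cx, cy = trajectory[i]
        window = trajectory[i - radius : i + radius + 1]
        windows.append([(x - cx, y - cy) for x, y in window])
    return windows


def dissimilarity(a, b):
    """Sum of Euclidean distances between corresponding points (an assumed measure)."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


def embed(trajectory, prototypes, radius=1):
    """Map each neighborhood to its vector of dissimilarities to the prototypes."""
    return [
        [dissimilarity(window, proto) for proto in prototypes]
        for window in neighborhoods(trajectory, radius)
    ]


# Two hypothetical prototypes: a horizontal and a vertical stroke fragment.
prototypes = [
    [(-1, 0), (0, 0), (1, 0)],
    [(0, -1), (0, 0), (0, 1)],
]
trajectory = [(0, 0), (1, 0), (2, 0), (3, 0)]  # a short horizontal pen stroke
features = embed(trajectory, prototypes)
```

Each point of the horizontal stroke ends up with zero dissimilarity to the horizontal prototype and a positive dissimilarity to the vertical one, so the feature vector length equals the number of prototypes regardless of how the neighborhoods themselves are represented.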
Zero-Shot Learning (ZSL) techniques can classify a completely unseen class, one never encountered during training. This makes them well suited to real-life classification problems, where it is not possible to train a system with annotated data for every possible class. This work investigates recognition of word images written in Bengali script in a ZSL framework. The proposed approach performs zero-shot word recognition by coupling deep learned features obtained from various CNN architectures with 13 basic shapes/stroke primitives commonly observed in Bengali script characters. Following the ZSL framework, these 13 basic shapes are termed “Signature/Semantic Attributes”. The results, obtained in a five-fold cross-validation setup with samples from 250 word classes, are promising.
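One common way to use such attribute signatures at test time is nearest-signature matching: a model predicts a score for each of the 13 attributes, and the unseen class whose signature is closest wins. The sketch below shows only that matching step and is an assumption about how attributes might be used, not the paper's pipeline; the class names, signatures, scores and the function `predict_unseen` are all hypothetical.

```python
NUM_ATTRIBUTES = 13  # number of basic shape/stroke primitives mentioned above


def predict_unseen(scores, signatures):
    """Return the class whose attribute signature is nearest to the predicted scores.

    `scores` are per-attribute predictions in [0, 1] (e.g. from a CNN, not shown);
    `signatures` maps each unseen class name to its 13-dimensional binary signature.
    """
    def squared_distance(signature):
        return sum((s - a) ** 2 for s, a in zip(scores, signature))

    return min(signatures, key=lambda name: squared_distance(signatures[name]))


# Hypothetical signatures for two unseen word classes.
signatures = {
    "word_x": [1, 0, 1, 0] + [0] * (NUM_ATTRIBUTES - 4),
    "word_y": [0, 1, 0, 1] + [0] * (NUM_ATTRIBUTES - 4),
}
# Hypothetical attribute scores predicted for one test word image.
scores = [0.9, 0.1, 0.8, 0.2] + [0.0] * (NUM_ATTRIBUTES - 4)
predicted = predict_unseen(scores, signatures)
```

Because the classifier only needs a signature per class, new word classes can be added without any training images for them, which is the property the ZSL setting relies on.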
Clustering algorithms have regained momentum with recent popularity of data mining and knowledge discovery approaches. To obtain good clustering in reasonable amount of time, various meta-heuristic approaches and thei...