Automatic identification of an individual based on his/her handwriting characteristics is an important forensic tool. In a computational forensic scenario, presence of huge amount of text/information in a questioned d...
详细信息
ISBN:
(纸本)9781424475421
Automatic identification of an individual based on his/her handwriting characteristics is an important forensic tool. In a computational forensic scenario, presence of huge amount of text/information in a questioned document cannot be always ensured. Also, compromising in terms of systems reliability under such situation is not desirable. We here propose a system to encounter such adverse situation in the context of Bengali script. Experiments with discrete directional feature and gradient feature are reported here, along with Support Vector Machine (SVM) as classifier. We got promising results of 95.19% writer identification accuracy at first top choice and 99.03% when considering first three top choices.
All Han-based scripts (Chinese, Japanese, and Korean) possess similar visual characteristics. Hence system development for identification of Chinese, Japanese and Korean scripts from a single document page is quite ch...
详细信息
All Han-based scripts (Chinese, Japanese, and Korean) possess similar visual characteristics. Hence system development for identification of Chinese, Japanese and Korean scripts from a single document page is quite challenging. It is noted that a Han-based document page might also have Roman script in them. A multi-script OCR system dealing with Chinese, Japanese, Korean, and Roman scripts, demands identification of scripts before execution of respective OCR modules. We propose a system to address this problem using directional features along with a Gaussian Kernel-based Support Vector Machine. We got promising results of 98.39% script identification accuracy at character level and 99.85% at block level, when no rejection was considered.
Because of multi-lingual behavior destination address block of a postal document of an Indian state may be written in two or more scripts. From a statistical analysis of Indian postal document we noted that about 22.0...
详细信息
ISBN:
(纸本)9781424475421
Because of multi-lingual behavior destination address block of a postal document of an Indian state may be written in two or more scripts. From a statistical analysis of Indian postal document we noted that about 22.04% of Indian postal documents are written in two scripts. Because of inter-mixing of these scripts in postal address writings, it is very difficult to identify the script by which a city name is written. To avoid such identification difficulties, in this paper we proposed a lexicon-driven bi-lingual (English and Bangla) city name recognition scheme for Indian postal automation. We obtained 93.19% accuracy when tested on 11875 city name samples.
Most of the countries use bi-script documents. This is because every country uses its own national language and English as second/foreign language. Therefore, bi-lingual document with one language being the English an...
详细信息
Most of the countries use bi-script documents. This is because every country uses its own national language and English as second/foreign language. Therefore, bi-lingual document with one language being the English and other being the national language is very common. Postal documents are a very good example of such bi-lingual/script document. This paper deals with word-wise handwritten script identification from bi-script documents written in Persian and Roman. In the proposed scheme, simple but fast computable set of 12 features based on fractal dimension, position of small component, topology etc. are used and a set of classifiers are employed for script identification experiments. We tested our scheme on a dataset of 5000 handwritten Persian and English words and 99.20% of correct script identification is obtained.
With the increasing popularity of digital cameras attached with various handheld devices, many new computational challenges have gained significance. One such problem is extraction of texts from natural scene images c...
详细信息
Automatic Text/symbols retrieval in graphical documents (map, engineering drawing) involves many challenges because they are not usually parallel to each other. They are multi-oriented and curve in nature to annotate ...
详细信息
ISBN:
(纸本)9789898111692
Automatic Text/symbols retrieval in graphical documents (map, engineering drawing) involves many challenges because they are not usually parallel to each other. They are multi-oriented and curve in nature to annotate the graphical curve lines and hence follow a curvi-linear way too. Sometimes, text and symbols frequently touch/overlap with graphical components (river, street, border line) which enhances the problem. For OCR of such documents we need to extract individual text lines and their corresponding words/characters. In this paper, we propose a methodology to extract individual text lines and an approach for recognition of the extracted text characters from such complex graphical documents. The methodology is based on the foreground and background information of the text components. To take care of background information, water reservoir concept and convex hull have been used. For recognition of multi-font, multi-scale and multi-oriented characters, Support Vector Machine (SVM) based classifier is applied. Circular ring and convex hull have been used along with angular information of the contour pixels of the characters to make the feature rotation and scale invariant.
This paper presents a pioneering effort towards machine authentication of security documents like bank cheques, legal deeds, certificates, etc. that fall under the same class as far as security is concerned. The propo...
详细信息
In this paper, we propose two types of feature sets based on modified chain-code direction frequencies in the contour pixels of input image and modified transition features (horizontally and vertically). A multi-level...
详细信息
In some Thai documents, a single text line of a printed document page may contain words of both Thai and Roman scripts. For the Optical Character recognition (OCR) of such a document page it is better to identify, at ...
详细信息
In recent years research towards Indian handwritten character recognition is getting increasing attention. Many approaches have been proposed by the researchers towards handwritten Indian character recognition and man...
详细信息
暂无评论