Determining when a particular document was printed is useful in practice, especially for forensic purposes. This study attempts to develop a general framework that uses image processing and pattern recognition principles for ink age determination in printed documents. The approach first computationally extracts a set of suitable color features and then analyzes them to associate them with ink age. Finally, a neural network is designed and trained to determine the ages of unknown samples. The dataset used for the present experiment consists of the cover pages of LIFE magazines published between the 1930s and the 1970s (five decades). Test results show that this is a viable framework for involving machines in assisting human experts in determining the age of printed documents.
ISBN: 9781424475421 (print)
Because of multilingual usage, the destination address block of a postal document of an Indian state may be written in two or more scripts. From a statistical analysis of Indian postal documents, we noted that about 22.04% are written in two scripts. Because these scripts are intermixed in postal addresses, it is very difficult to identify the script in which a city name is written. To avoid such identification difficulties, in this paper we propose a lexicon-driven bilingual (English and Bangla) city name recognition scheme for Indian postal automation. We obtained 93.19% accuracy when testing on 11875 city name samples.
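To illustrate what "lexicon-driven" can mean in its simplest form, here is a minimal sketch that snaps a noisy recognizer output to the closest entry of a city-name lexicon using edit distance. This is a swapped-in, simplified technique and not the scheme of the paper, whose details the abstract does not give; the function names and the sample lexicon are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (works per Unicode code point)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_city(raw, lexicon):
    """Return (lexicon entry, distance) for the entry nearest to the raw string."""
    return min(((name, edit_distance(raw, name)) for name in lexicon),
               key=lambda pair: pair[1])


# Hypothetical lexicon; a bilingual lexicon would simply mix English and Bangla entries.
lexicon = ["DELHI", "KOLKATA", "MUMBAI"]
print(closest_city("KOLKATTA", lexicon))
```

Because the lexicon may contain entries in both scripts, a lookup of this kind does not need to decide the script of the input beforehand, which is the difficulty the abstract aims to avoid.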
With the increasing popularity of digital cameras attached to various handheld devices, many new computational challenges have gained significance. One such problem is extraction of texts from natural scene images c...
ISBN: 9789898111692 (print)
Automatic retrieval of text and symbols in graphical documents (maps, engineering drawings) involves many challenges because the text lines in such documents are usually not parallel to each other. They are multi-oriented and curved, since they annotate graphical curve lines and hence follow a curvilinear path as well. Text and symbols also frequently touch or overlap graphical components (rivers, streets, border lines), which compounds the problem. For OCR of such documents we need to extract individual text lines and their corresponding words and characters. In this paper, we propose a methodology to extract individual text lines and an approach for recognizing the extracted text characters from such complex graphical documents. The methodology is based on the foreground and background information of the text components. To handle background information, the water reservoir concept and the convex hull are used. For recognition of multi-font, multi-scale and multi-oriented characters, a Support Vector Machine (SVM) based classifier is applied. Circular rings and the convex hull are used along with angular information of the contour pixels of the characters to make the features rotation and scale invariant.
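The following minimal sketch shows why features computed over concentric circular rings can be rotation and scale invariant. It is an illustration under simplifying assumptions, not the feature set of the paper: it only counts the fraction of points per ring around the centroid, with ring widths normalized by the largest radius, and omits the convex hull and the angular information of contour pixels. The function name `ring_histogram` and the sample shape are hypothetical.

```python
import math


def ring_histogram(points, n_rings=4):
    """Fraction of points in each of n_rings concentric rings around the centroid.

    Radii are divided by the largest radius, so uniform scaling does not change
    the result; only distances to the centroid are used, so rotation does not
    change it either.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    r_max = max(radii)
    if r_max == 0:  # degenerate case: all points coincide
        return [1.0] + [0.0] * (n_rings - 1)
    counts = [0] * n_rings
    for r in radii:
        # The outermost point (r == r_max) is clamped into the last ring.
        idx = min(int(r / r_max * n_rings), n_rings - 1)
        counts[idx] += 1
    return [c / len(points) for c in counts]


# A small hypothetical shape, and the same shape rotated by 90 degrees and doubled in size.
shape = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2), (2, 3), (2, 1), (1, 2), (3, 2)]
transformed = [(-2 * y, 2 * x) for x, y in shape]
print(ring_histogram(shape))
print(ring_histogram(transformed))
```

In a full pipeline, a vector like this would be one part of the input to the SVM classifier mentioned in the abstract.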
The Fisher kernel is a generic framework which combines the benefits of generative and discriminative approaches to pattern classification. In this contribution, we propose to apply this framework to handwritten word-...
In this paper, we propose two types of feature sets based on modified chain-code direction frequencies in the contour pixels of input image and modified transition features (horizontally and vertically). A multi-level...
This paper presents a pioneering effort towards machine authentication of security documents like bank cheques, legal deeds, certificates, etc. that fall under the same class as far as security is concerned. The propo...
In some Thai documents, a single text line of a printed document page may contain words of both Thai and Roman scripts. For the Optical Character Recognition (OCR) of such a document page it is better to identify, at ...
We propose a new method for handwritten word-spotting which does not require prior training or gathering examples for querying. More precisely, a model is trained "on the fly" with images rendered from the s...