This paper presents a system for script recognition of text appearing in video frames. Textual content in videos is generally extracted and recognized to build text-based indexing and retrieval systems. If the text in a video appears in a single script, the output of the text detector is fed directly to a video Optical Character Recognition (OCR) system. However, where text may appear in multiple scripts, a script recognition module is required to identify the script of the text so that it can be processed by the respective OCR. We propose a video script recognition system that treats text in each script as a unique texture. A number of texture measures are extracted from text blocks, and an artificial neural network is trained to distinguish between different scripts. The system, evaluated on video text blocks in five scripts (Arabic, English, Urdu, Hindi and Chinese), reported promising recognition rates. In addition to the performance of individual textural features, different combinations of texture measures were investigated and yielded interesting results.
ISBN: 9781509009824 (print)
Predicting gender and other demographic attributes of individuals from handwriting samples is an interesting problem for both basic and applied research. The correlation between gender and the visual appearance of handwriting has been validated by a number of studies, and the present study builds on the same idea. We exploit textural measurements as the discriminating attribute between male and female writing. The textural information in a writing sample is captured by applying a bank of Gabor filters to the handwriting image. The mean and standard deviation of the filter responses are collected in a matrix, and the Fourier transform of this matrix is used as the feature. Classification is carried out using a feed-forward neural network. The proposed technique, evaluated on a subset of the QUWI database, achieved promising results under different experimental settings.
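The feature pipeline described in this abstract (Gabor filter bank, matrix of response means and standard deviations, Fourier transform of that matrix) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the wavelengths, the number of orientations, the kernel size rule, the use of circular FFT convolution, and taking the magnitude of the transform are all assumptions.

```python
import numpy as np

def gabor_kernel(wavelength, theta, sigma=None):
    """Real-valued Gabor kernel (assumed form: isotropic Gaussian times cosine)."""
    sigma = 0.5 * wavelength if sigma is None else sigma
    half = int(3 * sigma)
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate along the filter orientation.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def filter_fft(image, kernel):
    """Circular convolution of image and kernel via the FFT."""
    kh, kw = kernel.shape
    if kh > image.shape[0] or kw > image.shape[1]:
        raise ValueError("image must be at least as large as the kernel")
    padded = np.zeros_like(image, dtype=float)
    padded[:kh, :kw] = kernel
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def gabor_texture_feature(image, wavelengths=(4, 8, 16), n_orient=4):
    """Mean/std matrix of filter responses, then the magnitude of its 2-D FFT."""
    image = np.asarray(image, dtype=float)
    stats = np.zeros((len(wavelengths), 2 * n_orient))
    for i, lam in enumerate(wavelengths):
        for j in range(n_orient):
            resp = filter_fft(image, gabor_kernel(lam, j * np.pi / n_orient))
            stats[i, 2 * j] = resp.mean()
            stats[i, 2 * j + 1] = resp.std()
    # Assumption: the magnitude spectrum is used as the final feature vector.
    return np.abs(np.fft.fft2(stats)).ravel()
```

With the default parameters the sketch yields a 24-dimensional vector per image (3 wavelengths, 4 orientations, 2 statistics), which could then be passed to any feed-forward classifier.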
Handwritten text may appear in many video images, so handwritten scene text detection in video is useful for applications such as efficient indexing and retrieval. In many video frames the text lines may also be multi-oriented. To the best of our knowledge, there is no existing work on detecting multi-oriented handwritten text in video. In this paper, we present a new method based on maximum color difference and boundary growing for detection of multi-oriented handwritten scene text in video. The method computes the maximum color difference on the average of the R, G and B channels of the original frame to enhance the text information. The output of the maximum color difference step is fed to a K-means algorithm with K = 2 to separate text and non-text clusters. Text candidates are obtained by intersecting the text cluster with the Sobel output of the original frame. To tackle the fundamental problem of different orientations and skews of handwritten text, a boundary growing method based on a nearest-neighbor concept is employed. We evaluate the proposed method on our own handwritten text database and on publicly available video data (Hua's data). Experimental results obtained with the proposed method are promising.
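The first three steps of this abstract (maximum color difference, K-means with K = 2, intersection with the Sobel output) could look roughly like the sketch below. It is a rough illustration under stated assumptions, not the published method: the 3x3 window for the maximum difference, the mean-valued Sobel threshold, and all function names are hypothetical, and the boundary growing step is omitted.

```python
import numpy as np

def max_color_difference(frame_rgb, k=3):
    """Local max minus local min of the channel-averaged frame (assumed k x k window)."""
    gray = np.asarray(frame_rgb, dtype=float).mean(axis=2)
    pad = k // 2
    p = np.pad(gray, pad, mode="edge")
    h, w = gray.shape
    stack = np.stack([p[dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)])
    return stack.max(axis=0) - stack.min(axis=0)

def two_means_mask(values, iters=20):
    """1-D K-means with K = 2; returns True where a value joins the high cluster."""
    lo, hi = values.min(), values.max()
    for _ in range(iters):
        mask = np.abs(values - hi) < np.abs(values - lo)
        if mask.all() or not mask.any():
            break
        lo, hi = values[~mask].mean(), values[mask].mean()
    return np.abs(values - hi) < np.abs(values - lo)

def sobel_magnitude(gray):
    """Gradient magnitude from the 3x3 Sobel operators."""
    p = np.pad(np.asarray(gray, dtype=float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def text_candidates(frame_rgb):
    """Intersect the high-difference cluster with thresholded Sobel edges."""
    text_cluster = two_means_mask(max_color_difference(frame_rgb))
    edges = sobel_magnitude(np.asarray(frame_rgb, dtype=float).mean(axis=2))
    # Assumption: edges are binarized at their mean magnitude.
    return text_cluster & (edges > edges.mean())
```

On a synthetic frame with one bright stroke on a dark background, the candidates fall on the stroke boundary, while the flat background and the flat stroke interior are rejected.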
Embedding data into vector spaces is a very popular strategy in pattern recognition methods. When distances between embeddings are quantized, performance metrics become ambiguous. In this paper, we present an analysis...
In graphical documents (maps, engineering drawings), artistic documents and similar printed materials, text lines are often not parallel to each other but multi-oriented and curved. For OCR of such documents, individual text lines must first be extracted. Extracting individual text lines from multi-oriented and/or curved text documents is a difficult problem. In this paper, we propose a novel method to extract individual text lines from such document pages, based on the foreground and background information of the characters of the text. The water reservoir concept is used to capture background information. In the proposed scheme, individual components are first detected and grouped into 3-character clusters using their inter-component distance, size and positional information. Using a graph concept, the initial 3-character clusters are merged into larger cluster groups. Using inter-character background information, the orientations of the extreme characters of a larger cluster are determined, and based on these orientations two candidate regions are formed from the cluster. Finally, with the help of these candidate regions, individual lines are extracted. The experiments yielded encouraging results.
Face presentation attack detection, also termed Face Anti-Spoofing (FAS) [item 1), 2) in the Appendix], is a hot and challenging research topic that has received much attention from the computer vision and pattern rec...
ISBN: 9781424421749 (print)
In this paper, we present a scheme for recognition of English characters in multi-scale and multi-oriented environments. Graphical documents such as maps contain text lines that appear in different orientations. Sometimes the characters of a single word follow a curvilinear path to annotate curved graphical lines. For recognition of such multi-scale and multi-oriented characters, a Support Vector Machine (SVM) based scheme is presented. The feature used is invariant to character orientation: circular ring and convex hull information is used together with angular information of the contour pixels of the character to make the feature rotation invariant. We tested the proposed scheme on two different datasets. Combining the circular ring and convex hull features, we obtained 96.73% and 99.56% accuracy on these two datasets.
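To illustrate why a circular ring partition gives a rotation-invariant description, here is a heavily simplified sketch. It only histograms foreground pixels over concentric rings around the centroid; the angular information of contour pixels and the convex hull part described in the abstract are left out, and the function name and the number of rings are assumptions.

```python
import numpy as np

def ring_feature(binary, n_rings=8):
    """Fraction of foreground pixels in each concentric ring around the centroid."""
    ys, xs = np.nonzero(binary)
    if ys.size == 0:
        raise ValueError("image contains no foreground pixels")
    # Distances to the centroid do not change when the character is rotated.
    r = np.hypot(ys - ys.mean(), xs - xs.mean())
    r_max = r.max()
    if r_max == 0:
        idx = np.zeros(r.size, dtype=int)  # single-pixel shape: everything in ring 0
    else:
        # Normalizing by the largest radius makes the ring layout scale-relative.
        idx = np.minimum((r / r_max * n_rings).astype(int), n_rings - 1)
    hist = np.bincount(idx, minlength=n_rings).astype(float)
    return hist / hist.sum()
```

Because the rings depend only on distances from the centroid, an L-shaped glyph and the same glyph rotated by 90 degrees produce the same histogram. A vector of this kind could be fed to an SVM, although on its own it is far less discriminative than the combined feature the abstract reports.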
ISBN: 9781467322164 (print)
This paper explores the use of a product graph for spotting symbols in graphical documents. The product graph is used to find the candidate subgraphs or components in the input graph that contain paths similar to those of the query graph. The acute angle between two edges and their length ratio are used as node labels. In a second step, each candidate subgraph in the input graph is assigned a distance measure computed by a random walk kernel; this is the minimum of the distances from the component to all components of the model graph. This distance measure is then used to eliminate dissimilar components. The remaining neighboring components are grouped, and the grouped zone is considered a retrieval zone for a symbol similar to the queried one. The entire method works online, i.e., it does not need any preprocessing step. The present paper reports the initial results of the method, which are very encouraging.
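As a minimal sketch of the two building blocks named in this abstract, the code below forms the direct product graph of two unlabeled graphs and evaluates a geometric random walk kernel on it. This is a textbook simplification and not the paper's method: node labels (angle, length ratio) are ignored, the conversion from kernel value to distance is not shown, and the function name and the decay parameter `lam` are assumptions.

```python
import numpy as np

def random_walk_kernel(adj1, adj2, lam=0.05):
    """Geometric random walk kernel between two unlabeled, undirected graphs."""
    adj1 = np.asarray(adj1, dtype=float)
    adj2 = np.asarray(adj2, dtype=float)
    # Adjacency matrix of the direct product graph: walks in it correspond
    # to pairs of simultaneous walks in the two input graphs.
    adj_x = np.kron(adj1, adj2)
    n = adj_x.shape[0]
    # The geometric series sum_k lam^k * adj_x^k converges only if
    # lam * spectral_radius(adj_x) < 1.
    if lam * np.max(np.abs(np.linalg.eigvalsh(adj_x))) >= 1:
        raise ValueError("lam too large for the series to converge")
    # Sum of all entries of (I - lam * adj_x)^-1, computed via a linear solve.
    return float(np.linalg.solve(np.eye(n) - lam * adj_x, np.ones(n)).sum())

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
similarity = random_walk_kernel(triangle, path)
```

A larger kernel value indicates more shared walks between the two graphs. A labeled variant would keep only those product-graph nodes and edges whose labels are compatible.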
In this paper, a novel approach for content-based image retrieval (CBIR) in diabetic retinopathy (DR) is proposed. It combines salient point selection with an inter-plane relationship technique. Salient points are selected from the edge image and then, using the inter-plane relationship, Local Binary Patterns (LBPs) are calculated with each salient point as the center pixel. Using color features in combination with LBP features improved the results. Experiments were carried out on the MESSIDOR database of 1200 retinal images: the proposed approach achieves an average precision of 57.82%, compared with 53.70% for the earlier approach.
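The LBP computation at selected points can be sketched as below. The basic 8-neighbor LBP code is standard; reading the "inter-plane relationship" as thresholding neighbors from one color plane against the center value from another plane is an assumption made for illustration, as are the function names and the neighbor ordering.

```python
import numpy as np

# Clockwise 8-neighborhood starting at the top-left pixel (ordering is an assumption).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(neighbor_plane, center_plane, y, x):
    """8-bit LBP code at (y, x); the point must not lie on the image border."""
    center = center_plane[y, x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        # A neighbor contributes a 1 bit when it is at least as bright as the center.
        if neighbor_plane[y + dy, x + dx] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(neighbor_plane, center_plane, points):
    """Normalized 256-bin histogram of LBP codes at the given (y, x) points."""
    hist = np.zeros(256)
    for y, x in points:
        hist[lbp_code(neighbor_plane, center_plane, y, x)] += 1
    total = hist.sum()
    return hist / total if total > 0 else hist
```

Passing the same array for both planes gives the ordinary single-plane LBP; passing, for example, the red plane for neighbors and the green plane for centers gives one possible inter-plane variant. The resulting histograms could be compared across images with any histogram distance to rank retrieval results.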