Search Results - Inner Mongolia University Library

Word-wise script identification from Indian documents

6th IAPR International Workshop on Document Analysis Systems, DAS 2004

Authors: Sinha, Suranjit; Pal, Umapada; Chaudhuri, B.B. Affiliation: Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B.T. Road, Kolkata 700 108, India

ISBN: (print) 3540230602

In a country like India, a single text line of most official documents contains words in two different scripts. Under the two-language formula, Indian documents are written in English and the state official language. For Optical Character Recognition (OCR) of such a document page, it is necessary to separate the words of different scripts before feeding them to the OCRs of the individual scripts. In this paper a robust technique is proposed for word-wise script identification from Indian doublet form documents. Here, at first, the document is segmented into lines and then the lines are segmented into words. Using different topological and structural features (like number of loops, headline feature, water reservoir concept based features, profile features, etc.) individual script words are identified from the documents. The proposed scheme is tested on 24210 words of different doublets and achieved more than 97% accuracy, on average. © Springer-Verlag Berlin Heidelberg 2004.

Keywords: Optical character recognition

Handwriting segmentation of unconstrained Oriya text

International Workshop on Frontiers in Handwriting Recognition

Authors: N. Tripathy; U. Pal. Affiliation: Computer Vision and Pattern Recognition Unit, Kolkata, India

Segmentation of handwritten text into lines, words and characters is one of the important steps in a handwriting recognition system. For the segmentation of unconstrained Oriya handwritten text into individual characters, a water reservoir-concept based scheme is proposed in this paper. Here, at first, the text image is segmented into lines, and then lines are segmented into individual words, and words are segmented into individual characters. For line segmentation the document is divided into vertical stripes. Analyzing the heights of the water reservoirs obtained from different components of the document, the width of a stripe is calculated. Stripe-wise horizontal histograms are then computed and the relationship of the peak-valley points of the histograms is used for line segmentation. Based on the vertical projection profile and structural features of Oriya characters, text lines are segmented into words. For character segmentation, at first, isolated and connected (touching) characters in a word are detected. Using structural, topological and water-reservoir-concept based features, touching characters of the word are then segmented.
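The line-segmentation step described above relies on horizontal histograms and their peak-valley points. The following is a minimal sketch of that general idea only, assuming a clean binary image and a single global profile; it does not implement the stripe-wise analysis or the water reservoir features of the paper, and the names `horizontal_profile`, `segment_lines` and `min_ink` are hypothetical.

```python
from typing import List, Tuple


def horizontal_profile(image: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(image: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) spans, end exclusive, of consecutive rows
    whose ink count is at least min_ink. Rows below min_ink act as valleys."""
    profile = horizontal_profile(image)
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # entering a text line (peak region)
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # leaving it at a valley
            start = None
    if start is not None:                  # text line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy page: two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(segment_lines(page))  # [(1, 3), (5, 6)]
```

Applying the same function to the transpose of a single line image would give a vertical projection profile, which is the usual starting point for word segmentation. Skewed or touching handwritten lines generally need the stripe-wise treatment the abstract describes.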

Keywords: Image segmentation; Histograms; Handwriting recognition; Pattern recognition; Character recognition; Text recognition; Shape; Computer vision; Water resources; Reservoirs

Shape code based word-image matching for retrieval of Indian multi-lingual documents

International Conference on Pattern Recognition

Authors: Tarafdar, Arundhati; Mondal, Ranju; Pal, Srikanta; Pal, Umapada; Kimura, Fumitaka. Affiliations: Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata-700108, India; Graduate School of Engineering, Mie University, Kurimamachiya-cho, Tsu, Japan

ISBN: (print) 9780769541099

In the current scenario, retrieving information from document images is a challenging problem. In this paper we propose a shape code based word-image matching (word-spotting) technique for retrieval of multilingual documents written in Indian languages. Here, each query word image to be searched is represented by a primitive shape code using (i) zonal information of extreme points, (ii) vertical shape based feature, (iii) crossing count (with respect to vertical bar position), (iv) loop shape and position, (v) background information, etc. Each candidate word (a word having similar aspect ratio and topological feature to the query word) of the document is also coded accordingly. Then, an inexact string matching technique is used to measure the similarity between the primitive codes generated from the query word image and each candidate word of the document in which the query image is to be searched. Based on the similarity score, we retrieve the document where the query image is found. Experimental results on Bangla, Devnagari and Gurumukhi script document image databases confirm the feasibility and efficiency of our proposed approach. © 2010 IEEE.
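The abstract does not say which inexact string matching technique is used. As one common choice, the sketch below scores two shape-code strings with the Levenshtein edit distance, normalised to a similarity in [0, 1]. The code alphabet and the example strings are hypothetical placeholders, not the paper's primitive codes.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-symbol insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to a score in [0, 1]; 1.0 means identical codes."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# Hypothetical shape codes: one query and three candidate words.
query = "LVCLB"
candidates = ["LVCLB", "LVXLB", "BBB"]
ranked = sorted(candidates, key=lambda c: similarity(query, c), reverse=True)
print(ranked)  # ['LVCLB', 'LVXLB', 'BBB']
```

A retrieval system along these lines would keep candidates whose score exceeds some threshold; the threshold value would have to be tuned on real data.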

Keywords: Image matching

Text independent writer identification for Bengali script

International Conference on Pattern Recognition

Authors: Chanda, Sukalpa; Franke, Katrin; Pal, Umapada; Wakabayashi, Tetsushi. Affiliations: Department of Computer Science and Media Technology, Gjøvik University College, Norway; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, India; Graduate School of Engineering, Mie University, Japan

ISBN: (print) 9780769541099

Automatic identification of an individual based on his/her handwriting characteristics is an important forensic tool. In a computational forensic scenario, the presence of a large amount of text/information in a questioned document cannot always be ensured. Also, compromising system reliability in such situations is not desirable. We here propose a system to counter such adverse situations in the context of Bengali script. Experiments with discrete directional feature and gradient feature are reported here, along with Support Vector Machine (SVM) as classifier. We obtained promising results of 95.19% writer identification accuracy at the first top choice and 99.03% when considering the first three top choices. © 2010 IEEE.
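The two figures reported above are top-1 and top-3 accuracies. As a small illustration of how such figures are computed from ranked classifier outputs, here is a sketch with made-up writer labels; the data and the function name `top_k_accuracy` are hypothetical and unrelated to the paper's experiments.

```python
from typing import List, Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]], truth: Sequence[str], k: int) -> float:
    """Fraction of samples whose true label appears among the first k
    entries of that sample's ranked candidate list."""
    hits = sum(1 for candidates, label in zip(ranked, truth) if label in candidates[:k])
    return hits / len(truth)


# Hypothetical ranked writer candidates for four questioned documents.
ranked_writers: List[List[str]] = [
    ["w1", "w2", "w3"],
    ["w2", "w1", "w3"],
    ["w3", "w1", "w2"],
    ["w1", "w3", "w2"],
]
true_writers = ["w1", "w1", "w2", "w1"]

print(top_k_accuracy(ranked_writers, true_writers, 1))  # 0.5
print(top_k_accuracy(ranked_writers, true_writers, 3))  # 1.0
```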

Keywords: Support vector machines

STRNet: Triple-stream Spatiotemporal Relation Network for Action Recognition

International Journal of Automation and Computing, 2021, Vol. 18, No. 5, pp. 718-730

Authors: Zhi-Wei Xu; Xiao-Jun Wu; Josef Kittler. Affiliations: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Wuxi 214122, China; Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, UK

Learning comprehensive spatiotemporal features is crucial for human action recognition. Existing methods tend to model the spatiotemporal feature blocks in an integrate-separate-integrate form, such as appearance-and-relation network (ARTNet) and spatiotemporal and motion network (STM). However, with blocks stacking up, the rear part of the network has poor interpretability. To avoid this problem, we propose a novel architecture called spatial temporal relation network (STRNet), which can learn explicit information of appearance, motion and especially the temporal relation information. Specifically, our STRNet is constructed from three branches, which separate the features into 1) an appearance pathway, to obtain spatial semantics, 2) a motion pathway, to reinforce the spatiotemporal feature representation, and 3) a relation pathway, to focus on capturing temporal relation details of successive frames and to explore long-term representation dependency. In addition, our STRNet does not simply merge the multi-branch information; instead, we apply a flexible and effective strategy to fuse the complementary information from multiple pathways. We evaluate our network on four major action recognition benchmarks: Kinetics-400, UCF-101, HMDB-51, and Something-Something v1, demonstrating that STRNet achieves state-of-the-art results on the UCF-101 and HMDB-51 datasets, as well as accuracy comparable with the state-of-the-art method on Something-Something v1 and Kinetics-400.

Keywords: Action recognition; spatiotemporal relation; multi-branch fusion; long-term representation; video classification

Convex hull based approach for multi-oriented character recognition from graphical documents

Authors: Roy, Partha Pratim; Pal, Umapada; Lladós, Josep; Kimura, Fumitaka. Affiliations: Computer Vision Center, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata - 108, India; Graduate School of Engineering, Mie University, 1577 Kurimamachiya, Mie 514-8504, Japan

ISBN: (print) 9781424421756

In this paper, we present a scheme towards recognition of English characters in multi-scale and multi-oriented environments. Graphical documents such as maps consist of text lines which appear in different orientations. Sometimes, characters in a single word may follow a curvilinear path to annotate the graphical curve lines. For recognition of such multi-scale and multi-oriented characters a Support Vector Machine (SVM) based scheme is presented in this paper. The feature used here is invariant to character orientation. Circular ring and convex hull have been used along with angular information of the contour pixels of the character to make the feature rotation invariant. We tested our proposed scheme on two different datasets. Combining circular and convex hull features we have obtained 96.73% and 99.56% accuracy on these two datasets. © 2008 IEEE.
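To illustrate why a circular-ring description can be orientation independent, the sketch below histograms contour points by their distance from the centroid. This is a simplified stand-in written under my own assumptions, not the paper's feature: it omits the convex hull and the angular information of contour pixels, and `ring_histogram` and `n_rings` are hypothetical names.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def ring_histogram(points: List[Point], n_rings: int = 4) -> List[float]:
    """Fraction of contour points falling in each of n_rings concentric
    rings around the centroid. Distances to the centroid do not change
    under rotation, so neither does the histogram."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    max_d = max(dists)
    counts = [0] * n_rings
    for d in dists:
        # Dividing by the largest radius also removes the effect of scale.
        ring = 0 if max_d == 0 else min(int(d / max_d * n_rings), n_rings - 1)
        counts[ring] += 1
    return [c / n for c in counts]


def rotate90(points: List[Point]) -> List[Point]:
    """Rotate points by 90 degrees about the origin (exact on integer grids)."""
    return [(-y, x) for x, y in points]


# A small L-shaped set of contour points.
contour = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (2, 0)]
print(ring_histogram(contour) == ring_histogram(rotate90(contour)))  # True
```

On real pixel grids, rotations by arbitrary angles involve resampling, so invariance holds only approximately there.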

Keywords: Support vector machines

Lexicon reduction technique for Bangla handwritten word recognition

10th IAPR International Workshop on Document Analysis Systems, DAS 2012

Authors: Bhowmik, Tapan Kumar; Roy, Utpal; Parui, Swapan K. Affiliations: Faculty of Mathematics and Natural Sciences, University of Groningen, Netherlands; Department of Computer and System Sciences, Visva Bharati, Santiniketan, India; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India

ISBN: (print) 9780769546612

In this paper we introduce a stroke based lexicon reduction technique in order to reduce the search space for recognition of handwritten words. The principle of this technique involves mainly two aspects of a word image to constitute a feature vector: one is word-length and the other is shape of the word. The length of the word image is represented by the number of specific vertical strokes present in the word image and, on the other hand, the shape of a word image is realized with the combination of both horizontal and vertical strokes. The experiment has been carried out with a database of 35,700 off-line handwritten Bangla word images. Though our proposed lexicon reduction technique is developed for recognition of Bangla handwritten words, its generalization property can easily be exploited for recognition of handwriting in other scripts also. © 2012 IEEE.

Keywords: Character recognition

Multi-gradient-direction based deep learning model for arecanut disease identification

CAAI Transactions on Intelligence Technology, 2022, Vol. 7, No. 2, pp. 156-166

Authors: S.B. Mallikarjuna; Palaiahnakote Shivakumara; Vijeta Khare; M. Basavanna; Umapada Pal; B. Poornima. Affiliations: Department of Computer Science and Engineering, Bapuji Institute of Engineering and Technology, Davanagere (affiliated to Visvesvaraya Technological University, Belagavi), Karnataka, India; Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia; Adani Institute of Infrastructure Engineering, Ahmedabad, India; Department of Computer Science, Davanagere University, Davanagere, Karnataka, India; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, West Bengal, India

Arecanut disease identification is a challenging problem in the field of image *** this work, we present a new combination of multi-gradient-direction and deep convolutional neural networks for arecanut disease identification, namely, rot, split and *** to the effect of the disease, there are chances of losing vital details in the *** enhance the fine details in the images affected by diseases, we explore multi-Sobel directional masks for convolving with the input image, which results in enhanced *** proposed method extracts arecanut as foreground from the enhanced images using Otsu ***, the features are extracted for foreground information for disease identification by exploring the ResNet *** advantage of the proposed approach is that it identifies the diseased images from the healthy arecanut *** results on the dataset of four classes (healthy, rot, split and rot-split) show that the proposed model is superior in terms of classification rate.
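The abstract names Otsu's method for separating the arecanut foreground. A minimal pure-Python sketch of Otsu thresholding on a flat list of grey levels is given below for orientation. It is a generic textbook version, not the authors' pipeline: the Sobel enhancement and ResNet stages are not shown, and treating the brighter class as foreground is an assumption made for this toy example.

```python
from typing import List


def otsu_threshold(pixels: List[int], levels: int = 256) -> int:
    """Return the grey level t that maximises the between-class variance
    when pixels <= t form one class and pixels > t form the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w_low, sum_low = 0, 0                 # pixel count and intensity sum of the low class
    for t in range(levels):
        w_low += hist[t]
        sum_low += t * hist[t]
        w_high = total - w_low
        if w_low == 0:
            continue                      # no pixels at or below t yet
        if w_high == 0:
            break                         # everything is in the low class
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        # Between-class variance up to the constant factor 1 / total**2.
        between = w_low * w_high * (mean_low - mean_high) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


# Toy "image": four dark background pixels and four bright object pixels.
gray = [10, 12, 11, 13, 200, 205, 210, 198]
t = otsu_threshold(gray)
mask = [1 if p > t else 0 for p in gray]  # assumption: bright pixels are foreground
print(mask)  # [0, 0, 0, 0, 1, 1, 1, 1]
```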

Keywords: deep learning; image analysis; pattern recognition

BYOLMED3D: SELF-SUPERVISED REPRESENTATION LEARNING OF MEDICAL VIDEOS USING GRADIENT ACCUMULATION ASSISTED 3D BYOL FRAMEWORK

arXiv, 2022

Authors: Manna, Siladittya; Dey, Rakesh; Chakraborty, Souvik. Affiliations: Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India; Computer Vision Researcher

Applications in Medical Image Analysis suffer from an acute shortage of large volumes of data properly annotated by medical experts. Supervised learning algorithms require large volumes of balanced data to learn robust representations. Often supervised learning algorithms require various techniques to deal with imbalanced data. Self-supervised learning algorithms, on the other hand, are robust to imbalance in the data and are capable of learning robust representations. In this work, we train a 3D BYOL self-supervised model using the gradient accumulation technique to deal with the large number of samples in a batch generally required in a self-supervised algorithm. To the best of our knowledge, this work is one of the first of its kind in this domain. We compare the results obtained through our experiments in the downstream task of ACL Tear Injury detection with the contemporary self-supervised pre-training methods and also with ResNet3D-18 initialized with the Kinetics-400 pre-trained weights. From the downstream task experiments, it is evident that the proposed framework outperforms the existing baselines. © 2022, CC0.
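Gradient accumulation, mentioned above, emulates a large batch by summing the gradients of several small micro-batches before a single parameter update. The toy sketch below checks this equivalence on a one-parameter least-squares model. It is only an illustration of the general technique under my own assumptions, with hypothetical names and data, and has nothing to do with the 3D BYOL architecture itself.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pairs for the model y ≈ w * x


def grad_mse(w: float, batch: List[Sample]) -> float:
    """Gradient with respect to w of the mean squared error over the batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w: float, data: List[Sample], micro_batch_size: int) -> float:
    """Process the data in micro-batches and accumulate their gradients,
    weighting each by its share of the full batch, without updating w."""
    total = 0.0
    for start in range(0, len(data), micro_batch_size):
        micro = data[start:start + micro_batch_size]
        total += grad_mse(w, micro) * len(micro) / len(data)
    return total


data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5
full = grad_mse(w, data)              # gradient of one large batch
accum = accumulated_grad(w, data, 2)  # same gradient from two micro-batches
print(abs(full - accum) < 1e-9)       # True

# One optimiser step using the accumulated gradient (learning rate is arbitrary).
w_next = w - 0.01 * accum
```

The equivalence holds for losses that are plain averages over samples. Components that depend on batch statistics, such as batch normalisation, see only the micro-batch, so in practice accumulation is an approximation of true large-batch training.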

Keywords: Supervised learning

Visually-driven semantic augmentation for zero-shot learning

29th British Machine Vision Conference, BMVC 2018

Authors: Roy, Abhinaba; Cavazza, Jacopo; Murino, Vittorio. Affiliations: Pattern Analysis and Computer Vision, Istituto Italiano di Tecnologia, Genova, Italy; Department of Naval, Electrical, Electronic and Telecommunications Engineering, University of Genova, Italy; Department of Computer Science, University of Verona, Italy

In this paper, we tackle the zero-shot learning (ZSL) classification problem and analyse one of its key ingredients, the semantic embedding. Despite their fundamental role, semantic embeddings are not learnt from the visual data to be classified; instead, they come either from manual annotation (attributes) or from a linguistic text corpus (distributed word embeddings, DWEs). Hence, there is no guarantee that visual and semantic information fit well together and, to bridge this gap, we propose to augment the semantic information of attributes/DWEs with semantic representations directly extracted from visual data by means of soft labels. When combined in a novel ZSL paradigm based on latent attributes, our approach achieves favourable performance on three public benchmark datasets. © 2018. The copyright of this document resides with its authors.

Keywords: Zero-shot learning
