Search Results - Inner Mongolia University Library

International Conference on Document Analysis and Recognition

Authors: S. Thadchanamoorthy N.D. Kodikara H.L. Premaretne Umapada Pal Fumitaka Kimura Eastern University Sri Lanka School of Computing University of Colombo Sri Lanka Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata India Graduate School of Engineering Mie University Japan

ISBN: (print) 9781479901937

Although there are some reports on offline isolated handwritten Tamil character recognition, to our knowledge there are only two reports on offline handwritten Tamil word recognition, and no city name dataset is available for Tamil script. In this paper we present a Tamil offline city name dataset that we developed and propose a scheme for recognition. Because writing styles differ between individuals, some of the characters in a Tamil city name may touch, and accurately segmenting such touching strokes into individual characters is difficult. Rather than attempting exact segmentation, we treat a city name string as a word and the recognition problem as lexicon-driven word recognition. In the proposed method, binarized city names are pre-segmented into primitives (individual characters or their parts). The primitive components of each city name are then merged into possible characters to obtain the best city name using dynamic programming. For merging, the total likelihood of the characters is used as the objective function, and character likelihood is computed with the Modified Quadratic Discriminant Function (MQDF) applied to direction features. A dataset of 265 Tamil city names was developed, and the database will be freely available to researchers. In experiments with the proposed scheme, 96.89% city name accuracy was obtained on this dataset.
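The merging step described in this abstract can be illustrated with a small dynamic program. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `best_word_score`, `recognize`, `char_loglik` and `max_merge` are hypothetical, the MQDF classifier is replaced by a caller-supplied scoring callback, and the limit on how many primitives one character may span is an assumption that the abstract does not state.

```python
from typing import Callable, List, Sequence, Tuple

NEG_INF = float("-inf")


def best_word_score(
    num_primitives: int,
    word: str,
    char_loglik: Callable[[str, int, int], float],
    max_merge: int = 3,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align one lexicon word to primitives 0..num_primitives-1.

    char_loglik(c, i, j) is a hypothetical scorer: the log-likelihood that
    the image formed by merging primitives i..j-1 is character c.
    Returns (total score, [(start, end) for each character]).
    """
    m = len(word)
    # dp[k][j]: best score for the first k characters covering the first j primitives
    dp = [[NEG_INF] * (num_primitives + 1) for _ in range(m + 1)]
    back = [[-1] * (num_primitives + 1) for _ in range(m + 1)]
    dp[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(1, num_primitives + 1):
            # character k spans primitives i..j-1, at most max_merge of them
            for i in range(max(0, j - max_merge), j):
                if dp[k - 1][i] == NEG_INF:
                    continue
                score = dp[k - 1][i] + char_loglik(word[k - 1], i, j)
                if score > dp[k][j]:
                    dp[k][j] = score
                    back[k][j] = i
    if dp[m][num_primitives] == NEG_INF:
        return NEG_INF, []  # the word cannot cover all primitives
    # Trace the chosen character boundaries back from the end.
    segments: List[Tuple[int, int]] = []
    j = num_primitives
    for k in range(m, 0, -1):
        i = back[k][j]
        segments.append((i, j))
        j = i
    segments.reverse()
    return dp[m][num_primitives], segments


def recognize(
    num_primitives: int,
    lexicon: Sequence[str],
    char_loglik: Callable[[str, int, int], float],
    max_merge: int = 3,
) -> str:
    """Return the lexicon entry with the highest total log-likelihood."""
    scored = [
        (best_word_score(num_primitives, w, char_loglik, max_merge)[0], w)
        for w in lexicon
    ]
    return max(scored)[1]
```

Summing log-likelihoods corresponds to using the total character likelihood as the objective. Comparing words of different lengths by a raw sum can be biased, and how the paper handles this is not stated in the abstract.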

Keywords: Cities and towns; Handwriting recognition; Accuracy; Image segmentation; Dynamic programming; Feature extraction; Cavity resonators

A method to generate synthetically warped document image

arXiv, 2019

Authors: Garai, Arpan Biswas, Samit Mandal, Sekhar Chaudhuri, Bidyut B. Department of Computer Science and Technology Indian Institute of Engineering Sciences and Technology Shibpur Howrah West Bengal 711103 India Techno India University Kolkata Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata India

Document images captured with a digital camera are often warped and distorted because of the camera angle or the shape of the document surface. A robust technique is needed to correct this kind of distortion. Research on document dewarping suffers from the limited availability of public benchmark datasets. Recently, deep learning based approaches have been used to solve such problems accurately. Training most deep neural networks requires a large number of document images, and generating such a large volume of images manually is difficult. In this paper, we propose a technique to generate a synthetic warped image from a flatbed-scanned document image. This is done by calculating warping factors for each pixel position using two warping position parameters (WPP) and eight warping control parameters (WCP). These parameters can be set as needed, depending on the desired warping. The results are compared with similar real captured images both qualitatively and quantitatively. Copyright © 2019, The Authors. All rights reserved.

Keywords: Deep neural networks

New texture-spatial features for keyword spotting in video images

Asian Conference on Pattern Recognition (ACPR)

Authors: Palaiahnakote Shivakumara Guozhu Liang Sangheeta Roy Umapada Pal Tong Lu Faculty of Computer Science and Information Technology University of Malaya Kuala Lumpur Malaysia National Key Lab for Novel Software Technology Nanjing University Nanjing China Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata India

ISBN: (print) 9781479961016

Keyword spotting in video document images is challenging because of the low resolution and complex backgrounds of video images. We propose a combination of Texture-Spatial-Features (TSF) for keyword spotting in video images without recognizing the words. First, a segmentation method extracts words from the text lines in each video image. We then propose a set of texture features for identifying text candidates in the word image with the help of k-means clustering. The proposed method measures the proximity between text candidates to study the spatial arrangement of pixels, which yields feature vectors for spotting words in the input frame. The method is evaluated on word images with different fonts, contrasts, backgrounds and font sizes, chosen from standard databases such as ICDAR 2013 video and from our own video data. Experimental results show that the proposed method outperforms the existing method in terms of recall, precision and F-measure.

Keywords: Image segmentation; Semantics; Video signal processing; Indexing; Pattern recognition; Spatial resolution

A New U-Net Based System for Multi-Cultural Wedding Image Classification

SSRN, 2023

Authors: Shivakumara, Palaiahnakote Kumar, C. Pavan Nemade, Jagrut J. Michael, Kshitiz Kumar, Akash Anami, Basavaraj S. Pal, Umapada Faculty of Computer Science and Information Technology University of Malaya Kuala Lumpur Malaysia Indian Institute of Information Technology Dharwad India KLE Institute of Technology India Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata India

The use of social media for communication, sharing, expressing views, broadcasting news, threatening and blackmailing has become an integral part of society. One related task is understanding multi-cultural wedding images uploaded to social media. This paper presents a novel method that combines U-Net, a Convolutional Neural Network (CNN) and Random Forest for the classification of multi-cultural wedding images. In wedding images, the bride and bridegroom draw the attention of viewers. This observation led us to propose U-Net for segmenting the region of the bride and bridegroom in a novel way. Similarly, the costumes of the bride and bridegroom carry vital information for differentiating cultures. This cue motivated us to extract features using a CNN for classification. Since the features extracted by the CNN are capable of discriminating images of different classes, we propose a simple and effective Random Forest for multi-cultural wedding image classification. The efficiency of the proposed model is demonstrated by testing it on our own dataset of six multi-cultural wedding classes and on a standard dataset of wedding and non-wedding image classes. Experimental results on both datasets show that the proposed model outperforms state-of-the-art models in terms of average classification rate. © 2023, The Authors. All rights reserved.

Keywords: Convolutional neural networks

TLDSMI: Genetic Algorithm Based Network for Text Localization in Distorted Social Media Images

SSRN, 2023

Authors: Palaiahnakote, Shivakumara Kumar, C. Pavan Aggarwal, Pranjal Sharma, Shubham Chandana, Pasupuleti Basavanna, M. Pal, Umapada Faculty of Computer Science and Information Technology University of Malaya Kuala Lumpur 50603 Malaysia Indian Institute of Information Technology Dharwad India University of Davanagere Karnataka India Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata India

This paper presents a novel model for understanding social image content through text localization. For text localization, we explore Maximally Stable Extremal Regions (MSER) for detecting components, which works by clustering pixels that have similar properties. The output of component detection includes several non-text components because of the degradations in social media images. To select the best components among many, we explore a Genetic Algorithm that convolves different kernels with the components, producing a feature matrix that is then fed to EfficientNet to choose the actual text components. The proposed model is therefore called the Genetic Algorithm based Network for Text Localization in Distorted Social Media Images (TLDSMI). To evaluate text localization, we take images from a standard natural scene dataset and upload and download them through different social media platforms, namely WhatsApp, Telegram and Instagram. The effectiveness of our method is shown by testing on both the original and the distorted standard datasets. © 2023, The Authors. All rights reserved.

Keywords: Genetic algorithms

ICDAR 2013 Handwriting Segmentation Contest

International Conference on Document Analysis and Recognition

Authors: Nikolaos Stamatopoulos Basilis Gatos Georgios Louloudis Umapada Pal Alireza Alaei Computational Intelligence Laboratory Institute of Informatics and Telecommunications National Center for Scientific Research Demokritos Athens Greece Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata India Computer Science Laboratory Universite Francois Rabelais Tours France

ISBN: (print) 9781479901937

This paper presents the results of the Handwriting Segmentation Contest that was organized in the context of ICDAR2013. The general objective of the contest was to use well-established evaluation practices and procedures to record recent advances in off-line handwriting segmentation. Two benchmarking datasets, one for text line segmentation and one for word segmentation, were created in order to test and compare all submitted algorithms, as well as some state-of-the-art methods for handwritten document image segmentation, in realistic circumstances. The handwritten document images were produced by many writers in two Latin-based languages (English and Greek) and in one Indian language (Bangla, the second most popular language in India). These images were manually annotated to produce the ground truth, which corresponds to the correct text line and word segmentation results. The datasets of previously organized contests (the ICDAR2007, ICDAR2009 and ICFHR2010 Handwriting Segmentation Contests), along with a dataset of Bangla document images, were used as the training dataset. Eleven methods were submitted to this competition. A brief description of the submitted algorithms, the evaluation criteria and the segmentation results obtained from the submitted methods are also provided in this manuscript.

Keywords: Image segmentation; Educational institutions; Handwriting recognition; Benchmark testing; Measurement; Matched filters; Text analysis

ICDAR2019 robust reading challenge on multi-lingual scene text detection and recognition - RRC-MLT-2019

15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019

Authors: Nayef, Nibal Liu, Cheng-Lin Ogier, Jean-Marc Patel, Yash Busta, Michal Chowdhury, Pinaki Nath Karatzas, Dimosthenis Khlif, Wafa Matas, Jiri Pal, Umapada Burie, Jean-Christophe L3i Laboratory University of la Rochelle France Computer Vision Center Universitat Autonoma de Barcelona Spain CVPR Unit Indian Statistical Institute India Robotics Institute Carnegie Mellon University Pittsburgh United States Center for Machine Perception Department of Cybernetics Czech Technical University Prague Czech Republic National Laboratory of Pattern Recognition Institute of Automation Chinese Academy of Sciences China

ISBN: (print) 9781728128610

With the growing cosmopolitan culture of modern cities, the need for robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been greater. With the goal of systematically benchmarking and pushing the state of the art forward, the proposed competition builds on RRC-MLT-2017 with an additional end-to-end task, an additional language in the real images dataset, a large-scale multi-lingual synthetic dataset to assist training, and a baseline end-to-end recognition method. The real dataset consists of 20,000 images containing text from 10 languages. The challenge has 4 tasks covering various aspects of multi-lingual scene text: (a) text detection, (b) cropped word script classification, (c) joint text detection and script classification and (d) end-to-end detection and recognition. In total, the competition received 60 submissions from the research and industrial communities. This paper presents the dataset, the tasks and the findings of the RRC-MLT-2019 challenge. © 2019 IEEE.

Keywords: Competition

ICDAR 2023 Video Text Reading Competition for Dense and Small Text

arXiv, 2023

Authors: Wu, Weijia Zhao, Yuzhong Li, Zhuang Li, Jiahong Shou, Mike Zheng Pal, Umapada Karatzas, Dimosthenis Bai, Xiang Zhejiang University China University of Chinese Academy of Sciences China Kuaishou Technology China National University of Singapore Singapore Computer Vision and Pattern Recognition Unit Indian Statistical Institute India Computer Vision Centre Universitat Autónoma de Barcelona Spain Huazhong University of Science and Technology China

Recently, video text detection, tracking and recognition in natural scenes have become very popular in the computer vision community. However, most existing algorithms and benchmarks focus on common text cases (e.g., normal size and density) and a single scenario, while ignoring extreme video text challenges, i.e., dense and small text in various scenarios. In this competition report, we establish a video text reading benchmark, named DSText, which focuses on the challenge of reading dense and small text in videos across various scenarios. Compared with previous datasets, the proposed dataset mainly includes three new challenges: 1) dense video text, a new challenge for video text spotters; 2) a high proportion of small text; 3) various new scenarios, e.g., 'Game', 'Sports', etc. The proposed DSText includes 100 video clips from 12 open scenarios, supporting two tasks: video text tracking (Task 1) and end-to-end video text spotting (Task 2). During the competition period (opened on 15th February 2023 and closed on 20th March 2023), a total of 24 teams participated in the three proposed tasks with around 30 valid submissions, respectively. In this article, we describe detailed statistical information about the dataset, the tasks, the evaluation protocols and a summary of the results of the ICDAR 2023 DSText competition. Moreover, we hope the benchmark will promote video text research in the community. © 2023, CC BY.

Keywords: Character recognition

Automatic Handwritten Indian Scripts Identification

International Workshop on Frontiers in Handwriting Recognition

Authors: Rajmohan Pardeshi B.B. Chaudhuri Mallikarjun Hangarge K.C. Santosh Department of Computer Science Karnatak Arts Science & Commerce College Bidar INDIA Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata INDIA US National Library of Medicine (NLM) National Institutes of Health (NIH) Bethesda MD USA

Since OCR engines are usually script-dependent, automatic text recognition in multi-script documents requires a pre-processor module that identifies the scripts. With this motivation, we present a word-level handwritten Indian script identification technique. Words are first segmented using morphological dilation followed by connected component labelling. We then employ the Radon transform, discrete wavelet transform, statistical filters and discrete cosine transform to extract directional multi-resolution spatial features. We tested the features using linear discriminant analysis, support vector machine and K-nearest neighbour classifiers over 11 major Indian scripts (including Roman) in bi-script and tri-script scenarios. In our tests, we achieved maximum accuracies of 98% and 96% for the bi-script and tri-script scenarios respectively.
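The word segmentation step mentioned in this abstract (morphological dilation followed by connected component labelling) can be sketched in plain Python as follows. This is an illustration only: the function names are hypothetical, and the horizontal structuring element, its `radius` and the 4-connectivity are assumptions, since the abstract does not specify them.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def dilate_horizontal(img: Image, radius: int) -> Image:
    """Dilate with a horizontal line of width 2*radius+1 (assumed element).

    Smearing ink sideways joins the characters of one word while leaving
    the wider gaps between words open.
    """
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if img[y][x]:
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    out[y][xx] = 1
    return out


def label_components(img: Image) -> Tuple[List[List[int]], int]:
    """Label 4-connected foreground regions with 1..count via breadth-first search."""
    height, width = len(img), len(img[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if img[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Each label in the dilated image then corresponds to one candidate word region, whose bounding box can be cropped from the original (undilated) image for feature extraction.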

Keywords: Discrete cosine transforms; Discrete wavelet transforms; Support vector machines; Accuracy; Feature extraction; Kernel

Sclera Segmentation Benchmarking Competition in Cross-resolution Environment

IAPR International Conference on Biometrics (ICB)

Authors: Abhijit Das Umapada Pal Michael Blumenstein Caiyong Wang Yong He Yuhao Zhu Zhenan Sun Inria Sophia Antipolis – Méditerranée France Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata India School of Software University of Technology Sydney Australia Institute of Automation Chinese Academy of Sciences (CASIA)

This paper summarizes the results of the Sclera Segmentation Benchmarking Competition (SSBC 2019), which was organized in the context of the 12th IAPR International Conference on Biometrics (ICB 2019). The aim of the competition was to record developments in sclera segmentation in a cross-resolution environment (sclera traits captured using multiple acquisition sensors with different image resolutions). The competition also aimed to draw researchers' attention to this subject. For benchmarking, we employed two datasets of sclera images captured using different sensors. The first, collected using a DSLR camera, is the Multi-Angle Sclera Dataset (MASD version 1). The second, collected using the 8-megapixel rear camera of a mobile phone, is the Mobile Sclera Dataset (MSD). Baseline manual segmentation masks of the sclera images from both datasets were developed. Precision and recall-based measures were employed to evaluate the effectiveness of the submitted segmentation techniques and to rank them. Four algorithms were submitted to address the segmentation task. In this paper we analyze the results produced by these algorithms/systems and define a way forward for this problem. Both datasets, along with some of the accompanying ground truth/baseline masks, will be freely available for research purposes.
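Pixel-level precision and recall for a predicted segmentation mask against a manual baseline mask can be computed as in the sketch below. The exact measures and ranking rule used in SSBC 2019 are not given in the abstract, so this shows only the standard definitions; the function names are hypothetical.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask: 1 = sclera, 0 = background


def precision_recall(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Pixel-wise precision and recall of `pred` against `truth`.

    Both masks are assumed to have the same height and width.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # sclera pixel correctly predicted
            elif p and not t:
                fp += 1          # background predicted as sclera
            elif t and not p:
                fn += 1          # sclera pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0
```

Averaging these per-image values over a dataset gives one score per submitted algorithm; whether the competition averaged per image or pooled pixels across images is not stated here.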
