Search results - Inner Mongolia University Library

International Conference on Document Analysis and Recognition

Authors: U. Pal, B.B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

In general, a document page may contain several script forms. For optical character recognition (OCR) of such a page, the scripts must be separated before they are fed to their individual OCR systems. An automatic technique for identifying printed Roman, Chinese, Arabic, Devnagari and Bangla text lines in a single document is proposed. Shape-based features, statistical features and some features obtained from the concept of a water reservoir are used for script identification. The proposed scheme has an accuracy of about 97.33%.

Keywords: Water resources; Reservoirs; Optical character recognition software; Shape; Water storage; Probability; Computer vision; Pattern recognition; Optical devices; Fractals


An approach for stemming in symbolically compressed Indian language imaged documents


International Conference on Document Analysis and Recognition

Authors: U. Garain, A.K. Datta (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India)

Stemming is used in many information retrieval (IR) systems to reduce variant word forms to common roots, thereby improving overall retrieval efficiency. This paper presents an algorithm for stemming in the context of a document image retrieval system. The algorithm assumes that the documents are symbolically compressed, and stemming is attempted in the compressed domain itself. Experiments have been conducted on Indian language imaged documents, for which efficient OCR still remains a challenging task. Results obtained from a set of 150 document images (in Bangla script, the second most popular script in the Indian subcontinent) consisting of about 12K words show a promising performance of the proposed approach.

Keywords: Image coding; Image retrieval; Optical character recognition software; Information retrieval; Internet; Image storage; Computer vision; Pattern recognition; Search engines; Character recognition


An MLP-based texture segmentation technique which does not require a feature set


International Conference on Pattern Recognition

Authors: U. Bhattacharya, B.B. Chaudhuri, S.K. Parui (Computer Vision & Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

In this paper we describe a texture segmentation approach, based on a multilayer perceptron (MLP) network, that requires no feature computation. Users therefore need not select and then compute a feature set, so real-time segmentation may be possible. The basic motivation of the work is the fact that human vision does not consciously compute features for distinguishing different textures in a scene. A single-hidden-layer MLP network has been found to be most suitable, with heuristically chosen input and hidden layer sizes. A method has been used to speed up the learning of the MLP network. Segmentation by a trained network usually produces misclassifications in the form of speckles. For the removal of such noise, an edge-preserving noise-smoothing technique is proposed. The final segmentation accuracy is comparable with that of other existing techniques.

Keywords: Computer vision; Image segmentation; Humans; Layout; Pattern recognition; Electronic mail; Computer networks; Surface texture; User-generated content; Artificial neural networks


Composite Script Identification and Orientation Detection for Indian Text Images


International Conference on Document Analysis and Recognition

Authors: Shamita Ghosh, Bidyut B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India)

A major preprocessing step in a multi-script OCR system is to identify the script type of the test document image. Published papers on script identification usually assume that the test image is in the correct, i.e. 0°, orientation. But a document may be fed to the system by mistake in the wrong orientation, say at an angle of nearly 180° or ±90°. Here we propose a script identification method that works under unknown orientation for all 11 official Indian scripts. We first find the skew and counter-rotate the document by the skew angle. This leads to a correct (0°) or upside-down (180°) orientation. Script identification is then done by a multi-stage tree classifier using features invariant to 0°/180° orientation. Next, we find the orientation of the image using a two-class classifier for each script. The performance of the proposed method has been tested on a variety of documents, and promising results have been obtained.

Keywords: Reservoirs; Support vector machines; Accuracy; Feature extraction; Pattern recognition; Kernel


Writer Identification from offline isolated Bangla characters and numerals


International Conference on Document Analysis and Recognition

Authors: Chandranath Adak, Bidyut B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India)

Writer identification is an essential component of computational forensics. In this paper, we attempt to identify writers based only on isolated characters and numerals. First, some points of interest (keypoints) on the image are detected by structural analysis and a SIFT-based detector. Then we calculate a set of features within a certain neighborhood of each keypoint and apply a fusion rule to the outputs of multiple probabilistic SVM classifiers for writer identification. For experimental analysis, a database containing 212,300 isolated Bangla orthosyllabic characters and numerals was generated with the help of 100 writers. We obtain fairly good results in identifying a writer. We also try to find a small set of highly discriminative characters that carry extra information about the writing style of an individual.
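The abstract does not say which fusion rule is applied to the probabilistic SVM outputs. As a minimal sketch, assuming a simple sum (average) rule and made-up probabilities for three writers, the combination step could look like the following. The function name `fuse_sum_rule` and the numbers are hypothetical and are not taken from the paper.

```python
def fuse_sum_rule(prob_outputs):
    """Average per-writer probabilities over several classifiers.

    prob_outputs: list of probability vectors, one per classifier,
    each of length n_writers. Returns (index of best writer, fused vector).
    The sum rule is an assumption; the paper's actual rule is not stated here.
    """
    n_classifiers = len(prob_outputs)
    n_writers = len(prob_outputs[0])
    fused = [
        sum(p[w] for p in prob_outputs) / n_classifiers
        for w in range(n_writers)
    ]
    best = max(range(n_writers), key=fused.__getitem__)
    return best, fused


# Illustrative outputs of three classifiers over three candidate writers.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
best_writer, fused_scores = fuse_sum_rule(outputs)
```

In this toy input the second classifier alone would pick writer 1, but the averaged scores favor writer 0, which is the kind of disagreement a fusion rule is meant to resolve.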

Keywords: Chlorine


Multi-gradient-direction based deep learning model for arecanut disease identification


CAAI Transactions on Intelligence Technology, 2022, Vol. 7, No. 2, pp. 156-166

Authors: S.B. Mallikarjuna, Palaiahnakote Shivakumara, Vijeta Khare, M. Basavanna, Umapada Pal, B. Poornima (Department of Computer Science and Engineering, Bapuji Institute of Engineering and Technology, Davanagere, Affiliated to Visvesvaraya Theological University, Belagavi, Karnataka, India; Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia; Adani Institute of Infrastructure Engineering, Ahmedabad, India; Department of Computer Science, Davanagere University, Davanagere, Karnataka, India; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, West Bengal, India)

Arecanut disease identification is a challenging problem in the field of image *** this work, we present a new combination of multi-gradient-direction and deep convolutional neural networks for arecanut disease identification, namely, rot, split and *** to the effect of the disease, there are chances of losing vital details in the *** enhance the fine details in the images affected by diseases, we explore multi-Sobel directional masks for convolving with the input image, which results in enhanced *** proposed method extracts arecanut as foreground from the enhanced images using Otsu ***, the features are extracted for foreground information for disease identification by exploring the ResNet *** advantage of the proposed approach is that it identifies the diseased images from the healthy arecanut *** results on the dataset of four classes (healthy, rot, split and rot-split) show that the proposed model is superior in terms of classification rate.
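The enhancement and foreground-extraction steps mentioned in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the four 3x3 masks, the choice to combine them by taking the maximum absolute response, the clipping to 255, and the names `MASKS`, `enhance` and `otsu_threshold` are all assumptions. Images are plain lists of lists of integer grey levels in 0..255.

```python
# Four Sobel-style 3x3 masks (two axis-aligned, two diagonal).
# The exact mask set used in the paper is not given in the abstract.
MASKS = [
    [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
    [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],
    [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],
]


def enhance(image):
    """Convolve with every mask and keep the strongest absolute response.

    Border pixels are left at 0; responses are clipped to 255.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            strongest = 0
            for mask in MASKS:
                total = 0
                for dy in range(3):
                    for dx in range(3):
                        total += mask[dy][dx] * image[y + dy - 1][x + dx - 1]
                strongest = max(strongest, abs(total))
            out[y][x] = min(255, strongest)
    return out


def otsu_threshold(image):
    """Return the grey level t maximizing between-class variance.

    Pixels <= t form one class and pixels > t the other.
    """
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Toy image with a vertical step edge between columns 1 and 2.
toy = [[0, 0, 100, 100, 100] for _ in range(5)]
enhanced = enhance(toy)
threshold = otsu_threshold(enhanced)
foreground = [[1 if v > threshold else 0 for v in row] for row in enhanced]
```

On the toy image only the pixels next to the step edge survive thresholding. A real pipeline would then pass the extracted foreground to a feature extractor, which is outside the scope of this sketch.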

Keywords: deep learning; image analysis; pattern recognition


Indian language multimedia and information retrieval


International Conference on Computational Intelligence and Multimedia Applications

Authors: B.B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

Over the last decade or so, remarkable developments in computer technology have given a major impetus to research in the field of multimedia. With the proliferation of the Internet and the increasingly widespread use of sophisticated computers, the multimedia revolution has arrived in India as well. It is therefore time to take stock of the situation: to evaluate how existing techniques can be used in the Indian context and to determine what new methods have to be developed. This paper summarizes the current state of multimedia technology in India and points to directions for further work. As more and more people in India begin to use computers and the Internet, multimedia capabilities will start playing a vital role in solving problems in many different areas. Education is probably one of the most important areas where multimedia technology can have a major impact. Already, multimedia educational systems are being developed in Indian languages. Several interactive encyclopaedia-like environments are also being marketed on CD-ROMs, and cover topics ranging from Indian classical music to Indian history, using text, images and sound. Some of the other possible applications of multimedia technology are: the development of digital libraries, news and information dissemination services, medicine, business and commerce, and the entertainment industry. Multimedia information technology is thus poised to become an exciting area for research and development activities in India.

Keywords: Information retrieval; Internet; Computer science education; Educational technology; Multimedia systems; Music; History; Software libraries; Biomedical imaging; Business


Towards Indian language spell-checker design


Language Engineering Conference

Authors: B.B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India)

This paper deals with the development of a spell-checker for Indian languages, using as an example Bangla, the second most popular language on the Indian subcontinent. The problems and the current state of Indian language spell-checkers are briefly reviewed. The approach for the Bangla spell-checker is then elaborated. The technique works in two stages. The first stage takes care of phonetic similarity errors. For that, phonetically similar characters are mapped into single units of character code, and a new dictionary $D_c$ is constructed with this reduced alphabet. A phonetically similar but wrongly spelt word can be easily corrected using this dictionary. The second stage takes care of errors other than phonetic similarity. A wrongly spelt word $S$ of $n$ characters is searched in the dictionary $D_c$. If $S$ is a nonword, its first $k_1 \le n$ characters will match with a valid word in $D_c$ (if $k_1 = n$, then the word in $D_c$ must be longer than $n$). A reversed word dictionary $D_r$ is also generated, where the characters of each word are kept in reversed order. If the last $k_2$ characters of $S$ match with a word in $D_r$ then, for a single error, it is located within the intersection of the first $k_1 + 1$ and last $k_2 + 1$ characters of $S$. We observed that this region is very small compared to the word length in most cases, and the number of suggested correct words can be drastically reduced using this information. We have used our approach in correcting Bangla text, where the problem of inflection is tackled by a simplified version of a morphological analyser. Another problem encountered in Indian languages is the existence of a large number of compound words formed by euphony and assimilation. The problem of compound words is also carefully tackled.
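The two stages described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses Latin letters instead of Bangla, the phonetic classes in `PHONETIC_MAP` are invented for the example, "matching with a valid word" is interpreted as matching a prefix of some dictionary word, and the dictionary is scanned linearly rather than stored in an efficient structure. All function names are hypothetical.

```python
# Hypothetical phonetic classes for illustration only (not Bangla).
PHONETIC_MAP = {"k": "c", "z": "s"}


def phonetic_code(word):
    """Stage 1: map phonetically similar characters to one code unit."""
    return "".join(PHONETIC_MAP.get(ch, ch) for ch in word)


def build_coded_dictionary(words):
    """Build the reduced-alphabet dictionary: code -> original spellings."""
    coded = {}
    for w in words:
        coded.setdefault(phonetic_code(w), []).append(w)
    return coded


def longest_prefix_match(word, entries):
    """Largest k such that word[:k] is a prefix of some entry."""
    best = 0
    for entry in entries:
        k = 0
        for a, b in zip(word, entry):
            if a != b:
                break
            k += 1
        best = max(best, k)
    return best


def error_region(word, coded_words):
    """Stage 2: locate a single error as an inclusive 0-based index range.

    k1 comes from the forward dictionary and k2 from the reversed one.
    The error lies in the overlap of the first k1+1 and last k2+1 characters.
    """
    n = len(word)
    k1 = longest_prefix_match(word, coded_words)
    reversed_words = [w[::-1] for w in coded_words]
    k2 = longest_prefix_match(word[::-1], reversed_words)
    start = max(0, n - k2 - 1)
    end = min(n - 1, k1)
    return start, end


words = ["receive", "recent", "deceive"]
coded = build_coded_dictionary(words)

# Stage 1: a phonetic misspelling maps onto an existing code.
print(coded.get(phonetic_code("rekent")))      # ['recent']

# Stage 2: a single wrong or extra character is confined to a short span.
print(error_region("recxive", list(coded)))    # (3, 3)
print(error_region("receiive", list(coded)))   # (4, 5)
```

In both stage 2 examples the candidate region is one or two characters wide, which matches the abstract's observation that the region is small compared with the word length.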

Keywords: Computer errors; Dictionaries; Optical character recognition software; Error correction; Optical computing; Pattern recognition; Computer interfaces; Information retrieval; Speech recognition; Computer vision


Automatic separation of machine-printed and hand-written text lines


International Conference on Document Analysis and Recognition

Authors: U. Pal, B.B. Chaudhuri (Indian Statistical Institute, Computer Vision and Pattern Recognition Unit, Calcutta, India)

There are many types of documents where machine-printed and hand-written texts appear intermixed. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, it is necessary to separate these two types of text before feeding them to the respective OCR systems. In this paper, we present such a scheme for both Bangla and Devnagari characters. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of about 98.3%.

Keywords: Natural languages; Optical character recognition software; Neural networks; Computer vision; Pattern recognition; Statistics; Handwriting recognition; Image segmentation; Data mining; Histograms


Bag-of-features HMMs for segmentation-free Bangla word spotting


4th International Workshop on Multilingual OCR, MOCR 2013

Authors: Rothacker, L.; Fink, G.A.; Banerjee, P.; Bhattacharya, U.; Chaudhuri, B.B. (Department of Computer Science, TU Dortmund University, 44221 Dortmund, Germany; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata-700 0108, India)

ISBN: (print) 9781450321143

In this paper we show how Bag-of-Features Hidden Markov Models can be applied to printed Bangla word spotting. These statistical models allow for easy adaptation to different problem domains. This is possible due to the integration of automatically estimated visual appearance features and Hidden Markov Models for spatial sequential modeling. In our evaluation we are able to report high retrieval scores on a new printed Bangla dataset. Furthermore, we outperform state-of-the-art results on the well-known George Washington word spotting benchmark. Both results have been achieved using an almost identical parametric method configuration. © 2013 ACM.

Keywords: Hidden Markov models

