Search results - Inner Mongolia University Library

A novel multiseed nonhierarchical data clustering technique

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 1997, vol. 27, no. 5, pp. 871-877

Authors: Chaudhuri, D; Chaudhuri, BB (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

Clustering techniques such as K-means and Forgy, as well as their improved version ISODATA, group data around one seed point for each cluster. It is well known that these methods do not work well if the shape of the cluster is elongated or nonconvex. We argue that for an elongated or nonconvex shaped cluster, more than one seed is needed. In this paper a multiseed clustering algorithm is proposed. A density based representative point selection algorithm is used to choose the initial seed points. To assign several seed points to one cluster, a minimal spanning tree guided novel technique is proposed. Also, a border point detection algorithm is proposed for the detection of the shape of the cluster. This border in turn signifies whether the cluster is elongated or not. Experimental results show the efficiency of this clustering technique.

Keywords: classification; minimal spanning tree; multiseed clustering; pattern recognition
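
The abstract above does not state the rule by which the minimal spanning tree assigns several seeds to one cluster. As a rough illustration of the general idea only, the sketch below builds a minimal spanning tree over a set of seed points and treats seeds joined by short tree edges as belonging to the same cluster. The cut rule, the `max_edge` parameter, and the function names are assumptions made for this example and are not taken from the paper.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the tree edges as (length, i, j) index triples.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [(math.inf, -1)] * n  # (distance to tree, tree vertex) per point
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        i = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k][0])
        in_tree[i] = True
        if best[i][1] >= 0:
            edges.append((best[i][0], best[i][1], i))
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[i], points[j])
                if d < best[j][0]:
                    best[j] = (d, i)
    return edges


def group_seeds(seeds, max_edge):
    """Group seed indices connected by MST edges no longer than max_edge.

    max_edge is a hypothetical threshold chosen for this illustration.
    """
    parent = list(range(len(seeds)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for length, i, j in mst_edges(seeds):
        if length <= max_edge:
            parent[find(i)] = find(j)
    groups = {}
    for idx in range(len(seeds)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())


# Four seeds along an elongated cluster and two seeds of a distant compact one.
seeds = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10), (11, 10)]
groups = group_seeds(seeds, max_edge=1.5)
```

With these toy seeds, the four collinear seeds end up in one group and the two distant seeds in another, which is the situation the abstract describes for an elongated cluster that needs more than one seed.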


Texture synthesis by a neural network model

NEURAL COMPUTING & APPLICATIONS, 1997, vol. 6, no. 1, pp. 2-11

Authors: Chaudhuri, BB; Kundu, P (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

In this paper we propose a neural network model to synthesise texture images. The model is based on a continuous Hopfield-like network where each pixel of the image is occupied by a neuron that is eight-connected to its neighbours. A state of the neuron denotes a certain grey level of the corresponding pixel. The firing of the neuron changes its state, and hence the grey level of the corresponding pixel. Different two-tone and grey-tone texture images can be synthesised by manipulating the connection weights and by varying the algorithm iteration number. For grey-tone texture synthesis, a Markov chain principle has been employed to decide on the multiple state transition of a neuron. The model can be employed for texture propagation with the advantage that it allows propagation without showing any blocky effect.

Keywords: computer graphics; image processing; texture synthesis


Morphological processing of Indian languages for lexical interaction with application to spelling error correction

SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 1996, vol. 21, no. 3, pp. 363-380

Authors: Sengupta, P; Chaudhuri, BB (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

An NLP system for Indian languages should have a lexical subsystem that is driven by a morphological analyzer. Such an analyzer should be able to parse a word into its constituent morphemes and obtain the lexical projection of the word as a unification of the projections of the constituent morphemes. The lexical projections considered here are f-structures of the Lexical Functional Grammar (LFG). A formalism has been proposed by which the lexicon writer may specify the lexicon in four levels. The specifications are compiled into a stored lexical knowledge base on the one hand and a formulation of derivational morphology called Augmented Finite State Automata (AFSA) on the other, to achieve a compact lexical representation. The aspects of AFSA, especially its power of morphological parsing of words in a computationally attractive manner, have been discussed. An additional utility of the AFSA, in the form of a spelling error corrector, has also been discussed. Bangla, or Bengali, is considered as a case study. Implementation notes based on object-oriented programming principles have been provided.

Keywords: natural language processing; morphological sub-system; lexical representation; augmented finite state automata; spelling corrector; object-oriented implementation


A new definition of neighborhood of a point in multi-dimensional space

PATTERN RECOGNITION LETTERS, 1996, vol. 17, no. 1, pp. 11-17

Author: Chaudhuri, BB (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B.T. Road, Calcutta 700 035, India)

Given a set of points in multi-dimensional space, we propose a new definition for the neighbors of an arbitrary point P. The definition tries to capture the idea that the neighbors should be as near to P and as symmetrically placed around P as possible. In contrast, the conventional nearest neighborhood considers only nearness as the criterion for neighborhood. We propose an iterative procedure to compute the neighbors where the first neighbor is the nearest neighbor. The second and other neighbors are chosen so that at any stage the distance between the centroid of the neighbors and P is as small as possible. The centroid criterion takes care of symmetrical placement of the neighbors. One can use median instead of centroid to define the neighbors. The new definition is free from any user-specified parameter and can be used for pattern classification, clustering and low-level description of dot patterns.

Keywords: neighborhood; classification; clustering; pattern recognition; image processing
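
The iterative procedure described in the abstract can be sketched as follows: the first neighbor is the nearest point to P, and each later neighbor is the remaining point that keeps the centroid of the chosen set closest to P. The abstract states that the definition needs no user-specified parameter but does not give the stopping rule, so the count `k` below is an assumption added only to make the sketch terminate. The function name is also hypothetical.

```python
import math


def centroid_neighbors(points, p, k):
    """Greedily choose k neighbors of p.

    At each step the candidate that brings the centroid of the chosen
    neighbors closest to p is added. In the first step the centroid is
    the candidate itself, so the first neighbor is the nearest neighbor.
    k is an assumed stopping count, not part of the published definition.
    """
    remaining = [q for q in points if q != p]
    chosen = []
    sums = [0.0] * len(p)  # running coordinate sums of the chosen neighbors
    for _ in range(min(k, len(remaining))):
        n = len(chosen) + 1
        best, best_dist = None, math.inf
        for q in remaining:
            centroid = [(s + x) / n for s, x in zip(sums, q)]
            d = math.dist(centroid, p)
            if d < best_dist:
                best, best_dist = q, d
        chosen.append(best)
        remaining.remove(best)
        sums = [s + x for s, x in zip(sums, best)]
    return chosen


# Three points crowd one side of P; one lies farther away on the other side.
points = [(1, 0), (1.1, 0), (1.2, 0), (-2, 0), (0, 3)]
neighbors = centroid_neighbors(points, (0, 0), k=3)
```

In this toy configuration the two nearest points to P are (1, 0) and (1.1, 0), both on the same side. The centroid criterion instead selects (-2, 0) as the second neighbor, which illustrates the symmetric placement the abstract refers to.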


COMPUTER RECOGNITION OF PRINTED BANGLA SCRIPT

INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 1995, vol. 26, no. 11, pp. 2107-2123

Authors: PAL, U; CHAUDHURI, BB (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B. T. Road, Calcutta 700 035, India)

This paper considers optical character recognition (OCR) of Bangla, the second most popular script in the Indian subcontinent. A complete OCR system is described for documents in a single Bangla font, where more than three hundred character shapes are recognized by a combination of template-matching and feature-matching approaches. Here the document image captured by a flatbed scanner is subjected to tilt correction; line, word and character segmentation; simple and compound character separation; feature extraction; and finally character recognition. Some character occurrence statistics have been computed to aid the recognition process. Simple character recognition is done by a feature-based tree classifier, and compound character recognition involves a template matching approach preceded by a feature-based grouping. At present, a recognition accuracy of about 96% is obtained by the system.

Keywords: Optical character recognition


Extraction of type style-based meta-information from imaged documents

International Journal on Document Analysis and Recognition, 2001, vol. 3, no. 3, pp. 138-149

Authors: Chaudhuri, B.B.; Garain, U. (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B.T. Road, Calcutta 700 035, India)

Extraction of some meta-information from printed documents without carrying out optical character recognition (OCR) is considered. It can be statistically verified that important terms in technical articles are mainly printed in italic, bold, and all-capital style. A quick approach to detecting them is proposed here. This approach is based on the global shape heuristics of these styles of any font. Important words in a document are sometimes printed in a larger size as well. A smart approach for the determination of font size is also presented. Detection of type styles helps in improving OCR performance, especially for reading italicized text. Another advantage of identifying word type styles and font size is discussed in the context of extracting (i) different logical labels and (ii) important terms from the document. Experimental results on the performance of the approach on a large number of good quality, as well as degraded, document images are presented. © 2001 Springer-Verlag Berlin Heidelberg.

Keywords: Optical character recognition


A comparative study between ISITRA and wavelet filters

IEEE INDICON 2004 - 1st India Annual Conference

Authors: Singh, Yumnam Kirani; Parui, Swapan Kumar; Banerjee, Shuvransu (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata-700108)

ISBN: (print) 0780389093

ISITRA is a new scheme of signal decomposition and reconstruction. In ISITRA, the space of PRF sets is much larger and better behaved than that in existing schemes such as filter banks or wavelets. Since such a space is constrained, it is mapped to an unconstrained space in which an optimization technique can be applied to find optimal PRF sets in terms of some criterion. Our criterion here is based on Mean Square Error, and the optimization technique used is genetic algorithms. The optimal PRF sets thus found perform better than the popular Daubechies' filters for a compression task. © 2004 IEEE.

Keywords: Optical filters


Issues in searching for Indian language web content

2nd ACM Workshop on Improving Non English Web Searching, iNEWS'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM 2008

Authors: Pal, Dipasree; Majumder, Prasenjit; Mitra, Mandar; Mitra, Sukanya; Sen, Aparajita (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India)

ISBN: (print) 9781605584164

This paper looks at the problem of searching for Indian language (IL) content on the Web. Even though the amount of IL content available on the Web is growing rapidly, searching through this content using the most popular web search engines poses certain problems. Since the popular search engines do not use any stemming or orthographic normalization for Indian languages, recall levels for IL searches can be low. We provide some examples to indicate the extent of this problem, and suggest a simple and efficient solution to it. Copyright 2008 ACM.

Keywords: Search engines


A simple real-word error detection and correction using local word bigram and trigram

25th Conference on Computational Linguistics and Speech Processing, ROCLING 2013

Authors: Samanta, Pratip; Chaudhuri, Bidyut B. (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India)

ISBN: (print) 9789573079262

Spelling errors are broadly classified into two categories, namely non-word errors and real-word errors. In this paper a localized real-word error detection and correction method is proposed, in which the scores of the bigrams formed by the candidate word with its immediate left and right neighbours and the score of the trigram of these three words are combined. A single character position error model is assumed, so that if a word W is erroneous then the correct word belongs to the set S of real words generated by a single character edit operation on W. The combined score is also calculated for all members of S. These words are ranked in decreasing order of score. By observing the rank and using a rule-based approach, the error decision and correction candidates are selected simultaneously. The approach gives accuracy comparable with other existing approaches but is computationally attractive. Since only the left and right neighbours are involved, multiple errors in a sentence can also be detected (if the errors occur in every alternate word). © ROCLING *** rights reserved.

Keywords: Error detection
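
The candidate generation and ranking steps described in the abstract can be illustrated with a small sketch. The abstract does not say how the bigram and trigram scores are combined or which single-character edits are allowed, so the sketch assumes a plain sum of raw counts and the three edits insertion, deletion and substitution. The toy corpus and all function names are invented for this example, and the rule-based decision step that the paper applies to the ranking is not reproduced.

```python
import string
from collections import Counter


def build_ngrams(sentences):
    """Collect the vocabulary and word bigram/trigram counts from a corpus."""
    vocab, bigrams, trigrams = set(), Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.split()
        vocab.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
        trigrams.update(zip(tokens, tokens[1:], tokens[2:]))
    return vocab, bigrams, trigrams


def single_edit_candidates(word, vocab):
    """Real words reachable from `word` by one character edit (assumed edit set)."""
    letters = string.ascii_lowercase
    edits = set()
    for i in range(len(word) + 1):
        left, right = word[:i], word[i:]
        if right:
            edits.add(left + right[1:])                          # deletion
            edits.update(left + c + right[1:] for c in letters)  # substitution
        edits.update(left + c + right for c in letters)          # insertion
    edits.discard(word)
    return edits & vocab


def context_score(left, word, right, bigrams, trigrams):
    """Assumed combination: sum of the two bigram counts and the trigram count."""
    return (bigrams[(left, word)] + bigrams[(word, right)]
            + trigrams[(left, word, right)])


def rank_candidates(left, word, right, vocab, bigrams, trigrams):
    """Rank the word and its single-edit candidates by decreasing context score."""
    candidates = {word} | single_edit_candidates(word, vocab)
    return sorted(
        candidates,
        key=lambda w: (-context_score(left, w, right, bigrams, trigrams), w),
    )


corpus = [
    "better than before",
    "more than enough",
    "better than ever",
    "and then we left",
]
vocab, bigrams, trigrams = build_ngrams(corpus)
ranking = rank_candidates("better", "then", "before", vocab, bigrams, trigrams)
```

Here "then" is a valid word but does not fit between "better" and "before" in the toy corpus, so the candidate "than" is ranked above it. Raw counts keep the sketch short; a practical system would more likely use smoothed probabilities.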


An end-to-end system for Bangla online handwriting recognition

15th International Conference on Frontiers in Handwriting Recognition, ICFHR 2016

Authors: Bhattacharya, Soumik; Maitra, Durjoy Sen; Bhattacharya, Ujjwal; Parui, Swapan K. (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India)

ISBN: (print) 9781509009817

A few studies of online Bangla handwriting recognition, such as isolated character recognition or limited vocabulary cursive word recognition, are found in the literature. However, development of an end-to-end recognition system for unconstrained online Bangla handwritten text has not been duly attempted so far. In the present report, we describe such a system, which takes a piece of continuous online handwritten Bangla text as the input. It first segments the input text into individual lines, each line into its constituent words and each word into sub-strokes. In the present study, 152 different symbols, which include basic characters, character modifiers, frequently used conjunct characters, a few special characters and numerals, have been considered. The entire set of sub-strokes obtained from the training sample set has been exhaustively studied by 3 experts, and 76 different shapes of sub-strokes have been identified based on consensus among these experts. Also, it has been observed that a character may produce at most 3 sub-strokes. Since a piece of Bangla text often contains either Bangla or English numerals, the present character set includes both numeral sets; 3 numeral shapes are common to the two scripts. The proposed recognition system uses two classifiers, one for characters and the other for sub-strokes. Sub-strokes are fed to the character classifier in their temporal order. A single sub-stroke, then two consecutive sub-strokes and finally three successive sub-strokes are passed to the character classifier, and the top two responses of the character classifier among the three cases are compared. If the difference is less than a threshold, the response of the sub-stroke classifier is used to reach a final decision. The proposed system provided 94.3% character level accuracy on a test set consisting of 33,453 word samples written by 31 writers. © 2016 IEEE.

Keywords: Character recognition
