Search results - Inner Mongolia University Library

International Conference on Document Analysis and Recognition

Authors: U. Pal, B.B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

In a multi-lingual country like India, a document page may contain more than one script form. Under the three-language formula, the document may be printed in English, Devnagari and one of the other official Indian languages. For OCR of such a document page, it is necessary to separate these three script forms before feeding them to the OCRs of the individual scripts. In this paper, an automatic technique for separating the text lines using script characteristics and shape-based features is presented. At present, the system has an overall accuracy of about 98.5%.

Keywords: Natural languages; Optical character recognition software; Shape; Optical filters; Computer vision; Pattern recognition; Writing; Read only memory; Character generation


Indian language multimedia and information retrieval


International Conference on Computational Intelligence and Multimedia Applications

Author: B.B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

Over the last decade or so, remarkable developments in computer technology have given a major impetus to research in the field of multimedia. With the proliferation of the Internet and the increasingly widespread use of sophisticated computers, the multimedia revolution has arrived in India as well. It is therefore time to take stock of the situation: to evaluate how existing techniques can be used in the Indian context and to determine what new methods have to be developed. This paper summarizes the current state of multimedia technology in India and points to directions for further work. As more and more people in India begin to use computers and the Internet, multimedia capabilities will start playing a vital role in solving problems in many different areas. Education is probably one of the most important areas where multimedia technology can have a major impact. Already, multimedia educational systems are being developed in Indian languages. Several interactive encyclopaedia-like environments are also being marketed on CD-ROMs, and cover topics ranging from Indian classical music to Indian history, using text, images and sound. Some of the other possible applications of multimedia technology are the development of digital libraries, news and information dissemination services, medicine, business and commerce, and the entertainment industry. Multimedia information technology is thus poised to become an exciting area for research and development activities in India.

Keywords: Information retrieval; Internet; Computer science education; Educational technology; Multimedia systems; Music; History; Software libraries; Biomedical imaging; Business


Automatic separation of machine-printed and hand-written text lines


International Conference on Document Analysis and Recognition

Authors: U. Pal, B.B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

There are many types of documents where machine-printed and hand-written texts appear intermixed. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, it is necessary to separate these two types of text before feeding them to the respective OCR systems. In this paper, we present such a scheme for both Bangla and Devnagari characters. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of about 98.3%.

Keywords: Natural languages; Optical character recognition software; Neural networks; Computer vision; Pattern recognition; Statistics; Handwriting recognition; Image segmentation; Data mining; Histograms


Automatic detection of italic, bold and all-capital words in document images


International Conference on Pattern Recognition

Authors: B.B. Chaudhuri, U. Garain (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

We propose simple and fast algorithms for the detection of italic, bold and all-capital words without doing actual character recognition. We present a statistical study which reveals that the detection of such words may play a key role in automatic information retrieval from documents. Moreover, detection of italic words can be used to improve the recognition accuracy of a text recognition system. A considerable number of document images have been tested, and our algorithms give accurate results on all of them. The algorithms are also very easy to implement.

Keywords: Optical character recognition software; Information retrieval; Character recognition; Text recognition; Testing; Books; Computer vision; Pattern recognition; Software systems; Degradation


Skew angle detection of digitized Indian script documents


IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, vol. 19, no. 2, pp. 182-186

Authors: Chaudhuri, BB; Pal, U (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, India)

Skew angle detection of scanned documents containing the most popular Indian scripts (Devnagari and Bangla) is considered. Most characters in these scripts have horizontal lines at the top, called headlines. The character headlines mostly join one another in a word, so the word appears as a single component. In the proposed method the components are first labeled. The upper envelope of a component is found by columnwise scanning from an imaginary line above the component. Portions of the upper envelope satisfying the properties of a digital straight line are detected and clustered as belonging to single text lines. Estimates from individual clusters are combined to get the skew angle. Apart from accuracy and efficiency, an advantage of the method is that character segmentation and zone detection can be readily done from the headline information, which is useful in optical character recognition approaches for these scripts.

Keywords: document processing; skew detection; optical character recognition (OCR); document structure analysis; digital library
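
The upper-envelope step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract detects digital straight line segments and clusters them per text line, whereas this sketch simply fits one least-squares line to the whole envelope of a single component. The function names `upper_envelope` and `skew_angle_degrees`, and the representation of a component as `(row, col)` pixel pairs, are assumptions made for illustration.

```python
import math


def upper_envelope(component):
    """Map each column to the topmost (smallest) row among the
    (row, col) foreground pixels of one connected component."""
    envelope = {}
    for row, col in component:
        if col not in envelope or row < envelope[col]:
            envelope[col] = row
    return envelope


def skew_angle_degrees(component):
    """Estimate skew as the angle of a least-squares line through the
    upper envelope. Rows grow downward, so a positive angle means the
    envelope descends from left to right."""
    env = upper_envelope(component)
    cols = sorted(env)
    n = len(cols)
    if n < 2:
        raise ValueError("need at least two columns to fit a line")
    mean_x = sum(cols) / n
    mean_y = sum(env[c] for c in cols) / n
    sxx = sum((c - mean_x) ** 2 for c in cols)
    sxy = sum((c - mean_x) * (env[c] - mean_y) for c in cols)
    return math.degrees(math.atan2(sxy, sxx))


# Hypothetical component: a flat headline on row 0 with strokes below it.
flat_word = [(0, c) for c in range(10)] + [(r, 3) for r in range(1, 5)]
print(skew_angle_degrees(flat_word))
```

Pixels below the headline do not affect the estimate, because only the topmost pixel of each column enters the envelope.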


A novel multiseed nonhierarchical data clustering technique


IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 1997, vol. 27, no. 5, pp. 871-877

Authors: Chaudhuri, D; Chaudhuri, BB (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

Clustering techniques such as K-means and Forgy, as well as their improved version ISODATA, group data around one seed point for each cluster. It is well known that these methods do not work well if the shape of the cluster is elongated or nonconvex. We argue that for an elongated or nonconvex cluster, more than one seed is needed. In this paper a multiseed clustering algorithm is proposed. A density-based representative point selection algorithm is used to choose the initial seed points. To assign several seed points to one cluster, a novel technique guided by a minimal spanning tree is proposed. Also, a border point detection algorithm is proposed for detecting the shape of the cluster. This border in turn signifies whether the cluster is elongated or not. Experimental results show the efficiency of this clustering technique.

Keywords: classification; minimal spanning tree; multiseed clustering; pattern recognition
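
The idea of letting several seeds represent one cluster can be sketched as follows. The abstract does not spell out how the minimal spanning tree guides the grouping, so this sketch assumes one simple rule: seeds joined by a tree edge no longer than a threshold share a cluster. The seeds are taken as given rather than chosen by the density-based selection step, and the names `mst_edges`, `group_seeds`, `assign_points` and the parameter `max_edge` are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.
    Returns the tree edges as (distance, i, j) tuples."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    if n:
        best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def group_seeds(seeds, max_edge):
    """Give seeds a common label when an MST edge of length at most
    max_edge connects them (assumed grouping rule)."""
    root = list(range(len(seeds)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for d, i, j in mst_edges(seeds):
        if d <= max_edge:
            root[find(i)] = find(j)
    return [find(i) for i in range(len(seeds))]


def assign_points(points, seeds, seed_labels):
    """Each point takes the cluster label of its nearest seed."""
    return [
        seed_labels[min(range(len(seeds)), key=lambda s: math.dist(p, seeds[s]))]
        for p in points
    ]


# Three seeds along an elongated cluster plus one seed far away.
seeds = [(0, 0), (1, 0), (2, 0), (10, 10)]
labels = group_seeds(seeds, max_edge=1.5)
print(assign_points([(1.9, 0.1), (9, 9)], seeds, labels))
```

With this rule the three collinear seeds cover one elongated cluster, which a single seed per cluster could not represent well.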


Texture synthesis by a neural network model


Neural Computing & Applications, 1997, vol. 6, no. 1, pp. 2-11

Authors: Chaudhuri, BB; Kundu, P (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

In this paper we propose a neural network model to synthesise texture images. The model is based on a continuous Hopfield-like network where each pixel of the image is occupied by a neuron that is eight-connected to its neighbours. A state of the neuron denotes a certain grey level of the corresponding pixel. The firing of the neuron changes its state, and hence the grey level of the corresponding pixel. Different two-tone and grey-tone texture images can be synthesised by manipulating the connection weights and by varying the algorithm iteration number. For grey-tone texture synthesis, a Markov chain principle has been employed to decide on the multiple state transition of a neuron. The model can be employed for texture propagation with the advantage that it allows propagation without showing any blocky effect.

Keywords: computer graphics; image processing; texture synthesis


An improved backpropagation neural network for detection of road-like features in satellite imagery


International Journal of Remote Sensing, 1997, vol. 18, no. 16, pp. 3379-3394

Authors: Bhattacharya, U; Parui, SK (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B. T. Road, Calcutta 700035, India)

This paper presents an application of a backpropagation neural network to the detection of linear structures in remote-sensing images. The purpose of the approach is two-fold: first, to exploit the advantages of a neural network classifier over the traditional ones; second, to avoid the strategic phases of enhancement and thresholding. Once the network is trained, the classification scheme is real-time. Two critical issues in the present approach are the selection of the network architecture and the rate of convergence of learning. Solutions to these two problems are proposed. Experimental results on IRS and SPOT images are presented. Satisfactory classification results have been obtained using the network.

Keywords: Remote sensing


Automatic separation of words in multi-lingual multi-script Indian documents


International Conference on Document Analysis and Recognition

Authors: U. Pal, B.B. Chaudhuri (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

In a multi-lingual country like India, a document may contain more than one script form. For such a document it is necessary to separate the different script forms before feeding them to the OCRs of the individual scripts. In this paper an automatic word segmentation approach is described which can separate Roman, Bangla and Devnagari scripts present in a single document. The approach has a tree structure in which Roman script words are first separated using the 'headline' feature. The headline is common in Bangla and Devnagari but absent in Roman. Next, Bangla and Devnagari words are separated using some finer characteristics of the character set, although recognition of individual characters is avoided. At present, the system has an overall accuracy of 96.09%.

Keywords: Optical character recognition software; Natural languages; Computer vision; Shape; Pattern recognition; Tree data structures; Character recognition; Continents; Cleaning; Character generation
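
The first branch of the tree described in the abstract above, separating Roman words by the absence of a headline, can be sketched in Python. The abstract does not say how the headline is measured, so this sketch assumes a simple test: a word has a headline if some row in its upper half contains a run of foreground pixels covering most of the word width. The function names and the `min_fraction=0.7` threshold are hypothetical.

```python
def longest_run(row):
    """Length of the longest run of consecutive foreground (truthy) pixels."""
    best = current = 0
    for pixel in row:
        current = current + 1 if pixel else 0
        best = max(best, current)
    return best


def has_headline(word, min_fraction=0.7):
    """word is a list of equal-length rows of 0/1 pixels, top row first.
    Returns True if a long horizontal run appears in the upper half."""
    if not word or not word[0]:
        return False
    width = len(word[0])
    upper_half = word[: max(1, len(word) // 2)]
    return max(longest_run(row) for row in upper_half) >= min_fraction * width


# Toy word with a full-width top stroke, as in Bangla or Devnagari.
headline_word = [
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
]
# Toy word made of separate vertical strokes, as in Roman script.
roman_word = [
    [1, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 0, 1, 0, 1],
]
print(has_headline(headline_word), has_headline(roman_word))
```

Words that fail this test would be routed to the Roman OCR, while the rest would go on to the finer Bangla versus Devnagari test, which this sketch does not attempt.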


An OCR system to read two Indian language scripts: Bangla and Devnagari (Hindi)


International Conference on Document Analysis and Recognition

Authors: B.B. Chaudhuri, U. Pal (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Calcutta, India)

An OCR system is proposed that can read two Indian language scripts: Bangla and Devnagari (Hindi), the most popular ones in the Indian subcontinent. These scripts, having the same origin in the ancient Brahmi script, have many features in common, and hence a single system can be modeled to recognize them. In the proposed model, document digitization, skew detection, text line segmentation and zone separation, word and character segmentation, and character grouping into basic, modifier and compound character categories are done for both scripts by the same set of algorithms. The feature sets and classification tree, as well as the knowledge base required for error correction (such as the lexicon), differ for Bangla and Devnagari. The system shows good performance for single-font scripts printed on clear documents.

Keywords: Optical character recognition software; Natural languages; Error correction; Character recognition; Switches; Computer vision; Pattern recognition; Cleaning; Text recognition; Writing
