Search results - Inner Mongolia University Library

International Conference on Digital Image Computing Techniques and Applications (DICTA)

Authors: Lu, Haiyun; Kowalkiewicz, Marek (SAP Asia Pte Ltd SAP Res Singapore 117440 Singapore)

ISBN: (print) 9781467321808; 9781467321792

In this paper we present a robust method to detect handwritten text in unconstrained drawings on normal whiteboards. Unlike printed text in documents, free-form handwritten text has no pattern in terms of size, orientation and font, and it is often mixed with other drawings such as lines and shapes. Unlike handwriting on paper, handwriting on a normal whiteboard cannot be scanned, so the detection has to be based on photos. Our work traces straight edges on photos of the whiteboard and builds a graph representation of connected components. We use geometric properties such as edge density, graph density, aspect ratio and neighborhood similarity to differentiate handwritten text from other drawings. The experimental results show that our method achieves satisfactory precision and recall. Furthermore, the method is robust and efficient enough to be deployed on a mobile device. This is an important enabler of business applications that support whiteboard-centric visual meetings in enterprise scenarios.

Keywords: Whiteboards STRAIGHT-EDGE Handwriting text segmentation drawings drawing FONT Graph representations connected component texts

Boosting text segmentation via progressive classification

KNOWLEDGE AND INFORMATION SYSTEMS, 2008, vol. 15, no. 3, pp. 285-320

Authors: Cesario, Eugenio; Folino, Francesco; Locane, Antonio; Manco, Giuseppe; Ortale, Riccardo (CNR ICAR CNR Inst High Performance Comp & Networks I-87036 Arcavacata Di Rende Italy)

A novel approach for reconciling tuples stored as free text into an existing attribute schema is proposed. The basic idea is to subject the available text to progressive classification, i.e., a multi-stage classification scheme where, at each intermediate stage, a classifier is learnt that analyzes the textual fragments not reconciled at the end of the previous steps. Classification is accomplished by an ad hoc exploitation of traditional association mining algorithms, and is supported by a data transformation scheme which takes advantage of domain-specific dictionaries/ontologies. A key feature is the capability of progressively enriching the available ontology with the results of the previous stages of classification, thus significantly improving the overall classification accuracy. An extensive experimental evaluation shows the effectiveness of our approach.

Keywords: schema reconciliation text segmentation classification

New approach for text segmentation using a stroke filter

SIGNAL PROCESSING, 2008, vol. 88, no. 7, pp. 1907-1916

Authors: Jung, Cheolkon; Liu, Qifeng; Kim, Joongkyu (Sungkyunkwan Univ Sch Informat Commun & Engn Suwon 440746 Kyunggido South Korea; Samsung Adv Inst Technol Yongin 446712 Kyunggido South Korea)

We propose a new method for achieving robust text segmentation in images by using a stroke filter. It is known that to segment text accurately and robustly from a complex background is a very difficult task. Most of the existing methods are sensitive to text color, size, font, and background clutter, because they use simple segmentation methods or require prior knowledge about text shape. In this paper, we attempt to consider the intrinsic characteristics of the text by using the stroke filter and design a new and robust algorithm for text segmentation. First, we describe the stroke filter briefly based on local region analysis. Second, the determination of text color polarity and local region growing procedures are performed successively based on the response of the stroke filter. Finally, the feedback procedure by the recognition score from an optical character recognition (OCR) module is used to improve the performance of text segmentation. By means of experiments on a large database, we demonstrate that the performance of our method is quite impressive from the viewpoints of the accuracy and robustness. (c) 2008 Elsevier B.V. All rights reserved.

Keywords: text segmentation a stroke filter color polarity determination text information extraction

A dynamic programming algorithm for linear text segmentation

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2004, vol. 23, no. 2, pp. 179-197

Authors: Fragkou, P; Petridis, V; Kehagias, A

In this paper we introduce a dynamic programming algorithm which performs linear text segmentation by global minimization of a segmentation cost function which incorporates two factors: (a) within-segment word similarity and (b) prior information about segment length. We evaluate the segmentation accuracy of the algorithm by precision, recall and Beeferman's segmentation metric. On a segmentation task which involves Choi's text collection, the algorithm achieves the best segmentation accuracy so far reported in the literature. The algorithm also achieves high accuracy on a second task which involves previously unused texts.

Keywords: text segmentation information retrieval document retrieval machine learning
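
The abstract above describes choosing segment boundaries by globally minimizing a cost that mixes within-segment word similarity with a prior on segment length. The following is only a minimal sketch of that idea, not the authors' cost function: the names `segment`, `mean_len`, `std_len` and `weight` are hypothetical, similarity is reduced to "two sentences share at least one word", and the length prior is a squared deviation from an assumed mean length.

```python
def segment(sentences, mean_len=3.0, std_len=1.0, weight=0.5):
    """Return the end index of each segment, chosen by dynamic programming.

    Assumed cost of a segment covering sentences i..j-1 (illustrative only):
      weight * ((length - mean_len) / std_len) ** 2      # length prior
      + (1 - weight) * (sentence pairs sharing no word)  # lack of cohesion
    """
    n = len(sentences)
    bags = [set(s.lower().split()) for s in sentences]

    def cost(i, j):
        length = j - i
        length_term = ((length - mean_len) / std_len) ** 2
        unrelated = sum(
            1
            for a in range(i, j)
            for b in range(a + 1, j)
            if not bags[a] & bags[b]
        )
        return weight * length_term + (1.0 - weight) * unrelated

    # best[j] is the minimum total cost of segmenting the first j sentences;
    # back[j] is the start index of the last segment in that optimum.
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            total = best[i] + cost(i, j)
            if total < best[j]:
                best[j], back[j] = total, i

    # Walk the back pointers from the end to recover the segment ends.
    ends = []
    j = n
    while j > 0:
        ends.append(j)
        j = back[j]
    return ends[::-1]


sentences = [
    "cat sleeps sofa", "cat chases mouse", "cat drinks milk",
    "stock prices fall", "stock market opens", "stock traders sell",
]
print(segment(sentences))  # [3, 6]
```

Because every candidate segment is scored independently and the total cost is a sum over segments, the table `best` yields a global minimum in quadratic time in the number of sentences (ignoring the cost of scoring each segment).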

Statistical models for text segmentation

MACHINE LEARNING, 1999, vol. 34, no. 1-3, pp. 177-210

Authors: Beeferman, D; Berger, A; Lafferty, J (Carnegie Mellon Univ Sch Comp Sci Pittsburgh PA 15213 USA)

This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based on a technique that incrementally builds an exponential model to extract features that are correlated with the presence of boundaries in labeled training text. The models use two classes of features: topicality features that use adaptive language models in a novel way to detect broad changes of topic, and cue-word features that detect occurrences of specific words, which may be domain-specific, that tend to be used near segment boundaries. Assessment of our approach on quantitative and qualitative grounds demonstrates its effectiveness in two very different domains: Wall Street Journal news articles and television broadcast news story transcripts. Quantitative results on these domains are presented using a new probabilistically motivated error metric, which combines precision and recall in a natural and flexible way. This metric is used to make a quantitative assessment of the relative contributions of the different feature types, as well as a comparison with decision trees and previously proposed text segmentation algorithms.

Keywords: exponential models text segmentation maximum entropy inductive learning natural language processing decision trees language modeling
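
The abstract does not give the formula for its error metric. A windowed form commonly used in later segmentation work, usually written P_k, slides a window of width k over the text and counts how often the reference and the hypothesis disagree on whether the two ends of the window fall in the same segment. The sketch below implements that common form under stated assumptions: the function names are hypothetical, each segmentation is a list with one segment label per unit, labels are never reused for a later segment, and the default k is half the mean reference segment length rounded down.

```python
def count_segments(labels):
    """Number of segments in a per-unit label sequence such as [0, 0, 1, 1]."""
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def pk(reference, hypothesis, k=None):
    """Fraction of width-k windows on which the two segmentations disagree.

    Lower is better; 0.0 means the hypothesis matches the reference on
    every window. Labels must be unique per segment (assumption).
    """
    n = len(reference)
    if len(hypothesis) != n:
        raise ValueError("segmentations must cover the same units")
    if k is None:
        # Assumed default: half the mean reference segment length, at least 1.
        k = max(1, n // (2 * count_segments(reference)))
    if n <= k:
        raise ValueError("text is too short for window size k")

    errors = 0
    for i in range(n - k):
        same_in_reference = reference[i] == reference[i + k]
        same_in_hypothesis = hypothesis[i] == hypothesis[i + k]
        errors += same_in_reference != same_in_hypothesis
    return errors / (n - k)


reference = [0, 0, 0, 1, 1, 1]
shifted = [0, 0, 1, 1, 1, 1]       # boundary placed one unit too early
print(pk(reference, reference))     # 0.0
print(pk(reference, shifted, k=2))  # 0.5
```

A near miss is penalized only on the windows that straddle the misplaced boundary, which is why a metric of this shape is gentler than exact boundary matching.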

Text Segmentation of Consumer Magazines in PDF Format

11th International Conference on Document Analysis and Recognition (ICDAR)

Authors: Fan, Jian (Hewlett Packard Labs Palo Alto CA 94304 USA)

ISBN: (print) 9780769545202

Text segmentation is usually the first step taken towards the reuse and repurposing of PDF documents. Through experimental evaluation, we found that the leading text segmentation algorithms have limitations for contemporary consumer magazines. We propose a new local homogeneity measure based on line space, and incorporate this new feature into a region growing algorithm. Using a fixed set of parameters, our algorithm achieved robust performance on PDF magazines with wide-ranging layouts and styles.

Keywords: page segmentation text segmentation PDF analysis
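
The abstract names a line-space homogeneity measure inside a region growing algorithm but does not define either. As a rough one-dimensional illustration only, the sketch below grows a block of text lines down a single column and admits the next line while its gap to the previous line stays close to the block's mean line spacing. The function `group_lines`, the `tolerance` parameter and the acceptance rule are all assumptions made for this example, not the paper's measure.

```python
def group_lines(baselines, tolerance=0.2):
    """Group baseline y-coordinates into blocks with uniform line spacing.

    A block starts from a seed line; the second line always joins and sets
    the spacing. A later line joins only if its gap differs from the block's
    mean gap by at most `tolerance` times that mean (assumed rule).
    """
    blocks = []
    current = []
    for y in sorted(baselines):
        if len(current) >= 2:
            mean_gap = (current[-1] - current[0]) / (len(current) - 1)
            gap = y - current[-1]
            if abs(gap - mean_gap) > tolerance * mean_gap:
                # Spacing is no longer homogeneous: close the block and
                # use this line as the seed of a new one.
                blocks.append(current)
                current = []
        current.append(y)
    if current:
        blocks.append(current)
    return blocks


# Four lines set 12 units apart, a larger gap, then three lines 14 apart.
print(group_lines([0, 12, 24, 36, 60, 74, 88]))
# [[0, 12, 24, 36], [60, 74, 88]]
```

A real page layout is two-dimensional, so a fuller version would also compare horizontal overlap and font size before merging neighbours; this sketch isolates the spacing test alone.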

Using LSA and text segmentation to improve automatic Chinese dialogue text summarization

Journal of Zhejiang University-Science A (Applied Physics & Engineering), 2007, vol. 8, no. 1, pp. 79-87

Authors: LIU Chuan-han; WANG Yong-cheng; ZHENG Fei; LIU De-rong (Department of Computer Science and Engineering Shanghai Jiao Tong University Shanghai 200030 China; Center for Biomimetic Sensing and Control Research Institute of Intelligent Machines Chinese Academy of Sciences Hefei 230031 China)

Automatic Chinese text summarization for dialogue style is a relatively new research area. In this paper, Latent Semantic Analysis (LSA) is first used to extract semantic knowledge from a given document. All question paragraphs are then identified, and an automatic text segmentation approach analogous to TextTiling is exploited to improve the precision of correlating question paragraphs and answer paragraphs. Finally, some "important" sentences are extracted from the generic content and the question-answer pairs to generate a complete summary. Experimental results showed that our approach is highly efficient and significantly improves the coherence of the summary without compromising informativeness.

Keywords: Automatic text summarization Latent semantic analysis (LSA) text segmentation Dialogue style Coherence Question-answer pairs

Text segmentation in complex background based on color and scale information of character strokes

8th Pacific Rim Conference on Multimedia

Authors: Wang, Weiqiang; Fu, Libo; Gao, Wen (Chinese Acad Sci Inst Comp Technol Beijing 100080 Peoples R China)

ISBN: (print) 9783540772545

This paper presents a robust approach to segmenting text embedded in a complex background. Our approach consists of four steps: smart sampling, unsupervised clustering, Bayesian decision, and post-processing. The experimental results show that it works effectively, and is more efficient in removing complex background residues than the popular K-means method.

Keywords: text segmentation character stroke complex background

Text Segmentation with Topic Modeling and Entity Coherence

16th International Conference on Hybrid Intelligent Systems (HIS) / 8th World Congress on Nature and Biologically Inspired Computing (NaBIC)

Authors: John, Adebayo Kolawole; Di Caro, Luigi; Boella, Guido (Univ Turin Dipartimento Informat Corso Svizzera 185 I-10149 Turin Italy)

ISBN: (digital) 9783319529417

ISBN: (print) 9783319529417; 9783319529400

This paper describes a system which uses entity and topic coherence for improved text segmentation (TS) accuracy. First, the Latent Dirichlet Allocation (LDA) algorithm was used to obtain topics for sentences in the document. We then performed entity mapping across a window in order to discover the transition of entities within sentences. We used the information obtained to support our LDA-based boundary detection for proper boundary adjustment. We report the significance of the entity coherence approach as well as the superiority of our algorithm over existing works.

Keywords: text segmentation Entity coherence Latent Dirichlet allocation Topic modeling

Conditional random field for text segmentation from images with complex background

PATTERN RECOGNITION LETTERS, 2010, vol. 31, no. 14, pp. 2295-2308

Authors: Li, Minhua; Bai, Meng; Wang, Chunheng; Xiao, Baihua (Shandong Univ Sci & Technol Elect & Informat Engn Dept Jinan 250031 Shandong Peoples R China; Chinese Acad Sci Inst Automat Key Lab Complex Syst & Intelligence Sci Beijing 100190 Peoples R China)

Text contained in images and video frames provides important clues for information indexing and retrieval. But it is difficult to segment text from images, especially images with complex backgrounds. This paper presents a new conditional random field approach, in which contextual features are introduced into text segmentation. Local visual information and contextual label information are integrated into a conditional random field by several components. Some components focus on visual image information to predict the category within the image sites, while others focus on contextual label information to determine the patterns within the label field. Integrating contextual label information in a conditional random field can effectively resolve local ambiguities and improve text segmentation performance in complex backgrounds. The comparative results demonstrate that the proposed method outperforms other methods for text segmentation from complex backgrounds. (C) 2010 Elsevier B.V. All rights reserved.

Keywords: Conditional random field (CRF) text segmentation Complex background Local information Contextual label information
