Search results - Inner Mongolia University Library

Document image decoding by heuristic search

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1996, vol. 18, no. 9, pp. 945-950

Authors: Kam, AC; Kopec, GE (MIT, CAMBRIDGE, MA 02139; XEROX CORP, PALO ALTO RES CTR, PALO ALTO, CA 94304)

This correspondence describes an approach to reducing the computational cost of document image decoding by viewing it as a heuristic search problem. The kernel of the approach is a modified dynamic programming (DP) algorithm, called the iterated complete path (ICP) algorithm, that is intended for use with separable source models. A set of heuristic functions is presented for decoding formatted text with ICP. Speedups by factors of 3 to 25 over DP have been observed when decoding text columns and telephone yellow pages using ICP and the proposed heuristics.

Keywords: document image decoding Markov models heuristic search dynamic programming

Document image decoding using iterated complete path search

8th Annual Document Recognition and Retrieval Conference

Authors: Minka, TP; Bloomberg, DS; Popat, K (MIT, Cambridge, MA 02139, USA)

ISBN: (print) 0819439851

The computation time of document image decoding can be significantly reduced by employing heuristics in the search for the best decoding of a text line. By using a cheap upper bound on template match scores, up to 99.9% of the potential template matches can be avoided. In the Iterated Complete Path method, template matches are performed only along the best path found by dynamic programming on each iteration. When the best path stabilizes, the decoding is optimal and no more template matches need be performed. Computation can be further reduced in this scheme by exploiting the incremental nature of the Viterbi iterations. Because only a few trellis edge weights have changed since the last iteration, most of the backpointers do not need to be updated. We describe how to quickly identify these backpointers, without forfeiting optimality of the path. Together these improvements provide a 30x speedup over previous implementations of document image decoding.

Keywords: document image decoding dynamic programming Viterbi heuristic search optical character recognition iterated complete path MAP template matching
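
The Iterated Complete Path loop summarized in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the trellis is reduced to a small directed acyclic graph, and the names `best_path`, `icp_decode`, `UPPER`, and `TRUE` are hypothetical. The sketch assumes every edge has a cheap upper bound on its exact (expensive) score, as the abstract describes for template matches, and it omits the incremental Viterbi update.

```python
from typing import Callable, Dict, List, Tuple

Edge = Tuple[int, int]  # (from_node, to_node); assumes from_node < to_node


def best_path(n_nodes: int, weights: Dict[Edge, float]) -> Tuple[List[Edge], float]:
    """Dynamic programming for the maximum-score path from node 0 to node n_nodes - 1."""
    score = [float("-inf")] * n_nodes
    back = [-1] * n_nodes
    score[0] = 0.0
    # Sorting by (from, to) means score[u] is final before edge (u, v) is relaxed.
    for u, v in sorted(weights):
        cand = score[u] + weights[(u, v)]
        if cand > score[v]:
            score[v], back[v] = cand, u
    if score[-1] == float("-inf"):
        raise ValueError("last node is unreachable")
    path: List[Edge] = []
    node = n_nodes - 1
    while node != 0:
        path.append((back[node], node))
        node = back[node]
    path.reverse()
    return path, score[-1]


def icp_decode(
    n_nodes: int,
    upper: Dict[Edge, float],
    exact_score: Callable[[Edge], float],
) -> Tuple[List[Edge], float, int]:
    """Rescore only the edges on the current best path until the path is fully exact.

    Assumes upper[e] >= exact_score(e) for every edge e.
    """
    weights = dict(upper)  # start from the cheap upper bounds
    exact = set()          # edges whose exact score has been computed
    evaluations = 0
    while True:
        path, total = best_path(n_nodes, weights)
        pending = [e for e in path if e not in exact]
        if not pending:
            # Every edge on the best path is exact and all other edges are
            # upper bounds, so no other path can score higher.
            return path, total, evaluations
        for e in pending:
            weights[e] = exact_score(e)  # the expensive step (a template match)
            exact.add(e)
            evaluations += 1


# Toy data (hypothetical): exact scores and looser upper bounds for 6 edges.
TRUE = {(0, 1): 2.0, (1, 2): 2.0, (2, 3): 2.0, (0, 2): 3.0, (1, 3): 5.0, (0, 3): 4.0}
UPPER = {(0, 1): 3.0, (1, 2): 2.2, (2, 3): 2.2, (0, 2): 4.0, (1, 3): 6.0, (0, 3): 4.5}

path, total, evaluations = icp_decode(4, UPPER, TRUE.__getitem__)
print(path, total, evaluations)
```

On this toy graph the loop stops after exactly scoring 2 of the 6 edges and returns the same path as dynamic programming over all exact scores. The savings in a real decoder depend on how tight the upper bounds are.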

Supervised template estimation for document image decoding

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1997, vol. 19, no. 12, pp. 1313-1324

Authors: Kopec, GE; Lomelin, M (MICROSOFT CORP, SEATTLE, WA)

An approach to supervised training of character templates from page images and unaligned transcriptions is proposed. The template training problem is formulated as one of constrained maximum likelihood parameter estimation within the document image decoding framework. This leads to a three-phase iterative training algorithm consisting of transcription alignment, aligned template estimation (ATE), and channel estimation steps. The maximum likelihood ATE problem is shown to be NP-complete and, thus, an approximate solution approach is developed. An evaluation of the training procedure in a document-specific decoding task, using the University of Washington UW-II database of scanned technical journal articles, is described.

Keywords: document image decoding Markov models template estimation character recognition document recognition maximum likelihood

Adding linguistic constraints to document image decoding: Comparing the iterated complete path and stack algorithms

8th Annual Document Recognition and Retrieval Conference

Authors: Popat, K; Greene, DH; Romberg, JK; Bloomberg, DS (Xerox Corp, Palo Alto Res Ctr, Palo Alto, CA 94304, USA; Rice Univ, ECE Dept, Houston, TX 77251, USA)

ISBN: (print) 0819439851

Beginning with an observed document image and a model of how the image has been degraded, document image decoding recognizes printed text by attempting to find a most probable path through a hypothesized Markov source. The incorporation of linguistic constraints, which are expressed by a sequential predictive probabilistic language model, can improve recognition accuracy significantly in the case of moderately to severely corrupted documents. Two methods of incorporating linguistic constraints in the best-path search are described, analyzed, and compared. The first, called the iterated complete path algorithm, involves iteratively rescoring complete paths using conditional language model probability distributions of increasing order, expanding state only as necessary with each iteration. A property of this approach is that it results in a solution that is exactly optimal with respect to the specified source, degradation, and language models; no approximation is necessary. The second approach considered is the Stack algorithm, which is often used in speech recognition and in the decoding of convolutional codes. Experimental results are presented in which text line images that have been corrupted in a known way are recognized using both the ICP and Stack algorithms. This controlled experimental setting preserves many of the essential features and challenges of real text line decoding, while highlighting the important algorithmic issues.

Keywords: document image decoding optical character recognition lexical language modeling hidden Markov models dynamic programming Viterbi algorithm stack algorithm list decoding convolutional decoding
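
As a rough illustration of the second approach named in this abstract, the sketch below shows a generic stack (best-first) decoder in Python. It is a simplification under stated assumptions, not the method evaluated in the paper: the line is assumed to be already segmented into character positions, the language model is a toy bigram table, and the names `stack_decode`, `OBS`, `BIGRAM`, and `lm_logp` are hypothetical. Partial hypotheses are ranked by plain accumulated log probability; the abstract does not say which metric the paper's Stack algorithm uses.

```python
import heapq
import math
from typing import Callable, Dict, List, Tuple


def stack_decode(
    obs_logp: List[Dict[str, float]],
    lm_logp: Callable[[str, str], float],
    start: str = "^",
) -> Tuple[str, float, int]:
    """Best-first search over partial transcriptions.

    obs_logp[i][ch] is the log-likelihood of position i given character ch;
    lm_logp(prev, ch) is the log-probability of ch following prev.
    """
    # heapq is a min-heap, so the stored key is the negated log score.
    heap: List[Tuple[float, str]] = [(0.0, "")]
    expansions = 0
    while heap:
        neg, text = heapq.heappop(heap)
        if len(text) == len(obs_logp):
            # Log probabilities are <= 0, so extending a hypothesis never raises
            # its score; the first complete hypothesis popped is the best one.
            return text, -neg, expansions
        expansions += 1
        prev = text[-1] if text else start
        for ch, lp in obs_logp[len(text)].items():
            heapq.heappush(heap, (neg - lp - lm_logp(prev, ch), text + ch))
    raise ValueError("no complete hypothesis found")


# Toy inputs (hypothetical): three character positions with two candidates each.
OBS = [
    {"c": math.log(0.6), "e": math.log(0.4)},
    {"a": math.log(0.5), "o": math.log(0.5)},
    {"t": math.log(0.7), "l": math.log(0.3)},
]
BIGRAM = {
    ("^", "c"): 0.5, ("^", "e"): 0.5,
    ("c", "a"): 0.8, ("c", "o"): 0.2,
    ("e", "a"): 0.3, ("e", "o"): 0.7,
    ("a", "t"): 0.6, ("a", "l"): 0.4,
    ("o", "t"): 0.5, ("o", "l"): 0.5,
}


def lm_logp(prev: str, ch: str) -> float:
    return math.log(BIGRAM[(prev, ch)])


text, score, expansions = stack_decode(OBS, lm_logp)
print(text, score, expansions)
```

With these toy numbers the decoder returns "cat". Unlike the ICP approach, which rescores complete paths, this search grows partial hypotheses one character at a time and stops as soon as a complete one reaches the top of the stack.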

Turbo recognition: a statistical approach to layout analysis

8th Annual Document Recognition and Retrieval Conference

Authors: Tokuyasu, TA; Chou, PA (Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720, USA)

ISBN: (print) 0819439851

Turbo recognition (TR) is a communication theory approach to the analysis of rectangular layouts, in the spirit of document image decoding. The TR algorithm, inspired by turbo decoding, is based on a generative model of image production, in which two grammars are used simultaneously to describe structure in orthogonal (horizontal and vertical) directions. This enables TR to strictly embody non-local constraints that cannot be taken into account by local statistical methods. This basis in finite state grammars also allows TR to be quickly retargetable to new domains. We illustrate some of the capabilities of TR with two examples involving realistic images. While TR, like turbo decoding, is not guaranteed to recover the statistically optimal solution, we present an experiment that demonstrates its ability to produce optimal or near-optimal results on a simple yet nontrivial example, the recovery of a filled rectangle in the midst of noise. Unlike methods such as stochastic context free grammars and exhaustive search, which are often intractable beyond small images, turbo recognition scales linearly with image size, suggesting TR as an efficient yet near-optimal approach to statistical layout analysis.

Keywords: document image decoding layout analysis graphical grammar turbo ML MAP likelihood posteriori stochastic
