The lower the resolution of a given text, the more difficult it becomes to segment it into single characters. The resolution of screen-rendered text can be very low. This paper focuses on smoothed screen-rendered text of very low resolution, with typical x-heights of 4 to 7 pixels, which is much lower than in other low-resolution OCR situations. We propose a recognition-based segmentation algorithm that uses over-segmentation by dynamic programming, candidate rating by single-character classifiers, and a graph-based search algorithm for an optimal cut sequence. The algorithm is described in detail, and experimental results show its performance on example screenshot images taken from the public Screen-Word database.
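The search for an optimal cut sequence can be illustrated with a minimal sketch. It assumes that over-segmentation has already produced a sorted list of candidate cut columns and that a classifier rating is available for each segment between two cuts. The names `best_cut_sequence`, `score` and `max_span` are hypothetical, and the additive scoring is an assumption rather than the rating used in the paper.

```python
from typing import Callable, List, Tuple


def best_cut_sequence(
    cuts: List[int],
    score: Callable[[int, int], float],
    max_span: int = 4,
) -> Tuple[float, List[int]]:
    """Choose the subsequence of candidate cuts with the highest total score.

    cuts:     sorted candidate cut columns, including both word boundaries.
    score:    hypothetical rating of the segment between two cut columns
              (higher is better, e.g. a classifier log-probability).
    max_span: assumed limit on how many over-segments one character may span.
    """
    if len(cuts) < 2:
        raise ValueError("need at least a left and a right boundary")

    n = len(cuts)
    best = [float("-inf")] * n  # best[j]: best total score of a path ending at cut j
    prev = [-1] * n             # prev[j]: predecessor of cut j on that path
    best[0] = 0.0

    # The cuts form a directed acyclic graph (edges only go left to right),
    # so a single left-to-right pass finds the best path.
    for j in range(1, n):
        for i in range(max(0, j - max_span), j):
            candidate = best[i] + score(cuts[i], cuts[j])
            if candidate > best[j]:
                best[j] = candidate
                prev[j] = i

    # Walk back from the right boundary to recover the chosen cuts.
    path = []
    j = n - 1
    while j != -1:
        path.append(cuts[j])
        j = prev[j]
    path.reverse()
    return best[-1], path


def toy_score(left: int, right: int) -> float:
    """Toy stand-in for a classifier: prefers segments 4 pixels wide."""
    return -abs((right - left) - 4)


total, chosen = best_cut_sequence([0, 4, 6, 8, 12], toy_score)
```

In this toy run the cut at column 6 is discarded, because every path that uses it contains a segment whose width differs from the preferred 4 pixels.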
ISBN: 1901725294 (print)
Autonomous model building is a crucial trend in model-based methods such as AAMs. This paper introduces an approach that deals with non-linearities by detecting distinct sub-parts in the data. Sub-models, each representing an individual sub-part, are derived from a minimum description length criterion. The resulting clique of models is thereby more compact and generalizes better than a single model. The proposed AAM clique generation deals with non-linearities in the data in a generic information-theoretic manner, reducing the need for user interaction during training.
ISBN: 9781595937025 (print)
Image clustering based solely on visual features, without any knowledge or background information, suffers from the semantic gap problem. In this paper, we propose SS-NMF: a semi-supervised non-negative matrix factorization framework for image clustering. Accumulated relevance feedback in a CBIR system is treated as user-provided supervision for guiding the image clustering. We consider the set of positive images in the feedback as constraints on the clustering, specifying that the images "must" be clustered together. Similarly, negative images provide constraints specifying that they "cannot" be clustered along with the positive images. Through an iterative algorithm, we perform symmetric tri-factorization of the image-image similarity matrix to infer the clustering. Theoretically, we prove the correctness of SS-NMF by showing that the algorithm is guaranteed to converge. Through experiments conducted on general-purpose image datasets, we demonstrate the superior performance of SS-NMF for clustering images effectively. Copyright 2007 ACM.
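The way relevance feedback turns into "must" and "cannot" constraints can be sketched as follows. This is only an illustration of the constraint-building step, not the SS-NMF factorization itself. The function names, the `reward` and `penalty` weights, and the clipping at zero are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations, product
from typing import Iterable, List, Tuple

Pair = Tuple[int, int]
Matrix = List[List[float]]


def feedback_to_constraints(
    positives: Iterable[int], negatives: Iterable[int]
) -> Tuple[List[Pair], List[Pair]]:
    """Derive pairwise constraints from one round of relevance feedback.

    Every pair of positive images becomes a must-link constraint; every
    (positive, negative) pair becomes a cannot-link constraint.
    """
    pos = sorted(set(positives))
    neg = sorted(set(negatives))
    must_link = list(combinations(pos, 2))
    cannot_link = list(product(pos, neg))
    return must_link, cannot_link


def apply_constraints(
    similarity: Matrix,
    must_link: Iterable[Pair],
    cannot_link: Iterable[Pair],
    reward: float = 1.0,    # assumed weight, not from the paper
    penalty: float = 1.0,   # assumed weight, not from the paper
) -> Matrix:
    """Return a copy of the similarity matrix adjusted by the constraints.

    Must-link pairs gain similarity, cannot-link pairs lose it. Entries are
    clipped at zero so the matrix stays non-negative (an assumption here).
    """
    adjusted = [row[:] for row in similarity]
    for i, j in must_link:
        adjusted[i][j] += reward
        adjusted[j][i] = adjusted[i][j]
    for i, j in cannot_link:
        adjusted[i][j] = max(0.0, adjusted[i][j] - penalty)
        adjusted[j][i] = adjusted[i][j]
    return adjusted


similarity = [
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0],
]
must, cannot = feedback_to_constraints(positives=[0, 1], negatives=[2])
adjusted = apply_constraints(similarity, must, cannot)
```

A matrix adjusted in this way could then be passed to a non-negative factorization routine. How the paper actually incorporates the constraints into the factorization is not stated in the abstract.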
Learning-based methods have attracted a lot of research attention and led to significant improvements in low-light image enhancement. However, most of them still suffer from two main problems: expensive computational ...
Embedding data into vector spaces is a very popular strategy in pattern recognition methods. When distances between embeddings are quantized, performance metrics become ambiguous. In this paper, we present an analysis...
The recognition of screen-rendered text is a novel task. It is performed, e.g., by translation tools that let users click on any text on the screen and receive a translation. Some commercial OCR programs have also started to address the problem of reading screenshots. Optical character recognition on screenshot images can be very challenging because the fonts are very small and smoothed. To build and compare recognition approaches for screen-rendered text, the availability of standard databases is a fundamental prerequisite. In this paper, two freely available databases are presented: one consisting of annotated screenshot images of 28,080 single characters, and another holding 400 words extracted from documents plus 2,400 generated isolated words. Both databases include meta-information such as x-height, font type, style and rendering conditions. Using a developed recognition system as an example, it is shown how these databases can serve for training, testing and optimization.
Automatic vehicle make and model recognition (MMR) offers a competent approach to vehicle classification and recognition. This paper proposes a real-time yet robust vehicle make and model recognition system that extracts the vehicle sub-image from the background. It studies several sparse feature coding methods, such as Orthogonal Matching Pursuit (OMP) and some variations of Sparse Coding (SC), and compares them to choose the best one. Our method applies the sparse feature coding methods to dense Scale-Invariant Feature Transform (SIFT) features and uses a Support Vector Machine (SVM) for classification. The proposed system is evaluated on a dataset of Iranian on-road vehicles whose samples cover different viewpoints, weather conditions and illuminations.
Human action recognition is the process of labeling videos that contain human motion with action classes. Run-time complexity is one of the most important challenges in action recognition. In this paper, we address this problem using video abstraction techniques, including key-frame extraction and video skimming. We first extract key-frames and then skim the video clip by concatenating excerpts around the selected key-frames. This shorter sequence is used as input to the classifier. Our proposed approach reduces not only the space complexity but also the run time in both the training and test steps. The experimental results on the KTH action dataset show that the proposed method achieves good performance without a considerable loss of classification accuracy.
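The two abstraction steps, key-frame extraction followed by skimming, can be sketched as below. The abstract does not say how key-frames are chosen, so the frame-difference criterion, the `threshold` and `radius` parameters, and the function names are all assumptions made for illustration. Frames are represented as flat lists of pixel values to keep the example self-contained.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames: List[Sequence[float]], threshold: float) -> List[int]:
    """Assumed criterion: a frame becomes a key-frame when it differs from
    the most recent key-frame by more than `threshold` on average."""
    if not frames:
        return []
    keys = [0]  # the first frame always starts the summary
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[keys[-1]]) > threshold:
            keys.append(i)
    return keys


def skim(num_frames: int, key_frames: List[int], radius: int) -> List[int]:
    """Indices of the excerpts of `radius` frames on each side of every
    key-frame, merged and sorted, i.e. the shortened sequence."""
    keep = set()
    for k in key_frames:
        first = max(0, k - radius)
        last = min(num_frames - 1, k + radius)
        keep.update(range(first, last + 1))
    return sorted(keep)


# Toy clip: six tiny "frames" with two abrupt changes.
clip = [[0, 0], [0, 0], [5, 5], [5, 5], [5, 5], [0, 0]]
keys = select_key_frames(clip, threshold=1.0)
short_clip = skim(num_frames=10, key_frames=[2, 3, 8], radius=1)
```

Only the frames listed by `skim` would be passed to the classifier, which is where the reduction in space and run time comes from.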
Car plate detection is a key component of automatic license plate recognition systems. This paper adopts an enhanced cascaded tree-style learner framework for car plate detection using the hybrid object features includ...
In this paper we present an algorithm for the recognition of 1D barcodes using camera phones, which is highly robust to the typical image distortions. We have created a database of barcode images that covers typical distortions, such as inhomogeneous illumination, reflections, or blurriness due to camera movement. We present results from experiments with over 1,000 images from this database using a MATLAB implementation of our algorithm, as well as experiments on the go, where a Symbian C++ implementation running on a camera phone is used to recognize barcodes in daily life situations. The proposed algorithm shows close to 100% accuracy in real-life situations and yields very good resolution-dependent performance on our database, ranging from 90.5% (640 × 480) up to 99.2% (2592 × 1944). The database is freely available to other researchers.