Human action recognition is the process of labeling videos that contain human motion with action classes. Run-time complexity is one of the most important challenges in action recognition. In this paper, we address this problem using video abstraction techniques, including key-frame extraction and video skimming. We first extract key-frames and then skim the video clip by concatenating excerpts around the selected key-frames. This shorter sequence is used as input to the classifier. The proposed approach reduces not only the space complexity but also the run time in both the training and test steps. Experimental results on the KTH action dataset show that the proposed method achieves good performance without a considerable loss in classification accuracy.
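The skimming step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how key-frames are chosen, so the criterion below (largest mean absolute difference from the previous frame) is an assumption, and the names `select_key_frames`, `skim_indices` and `radius` are hypothetical.

```python
def frame_difference(a, b):
    """Mean absolute difference between two equally sized, flattened frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames, k):
    """Pick the k frames that differ most from their predecessor.

    This criterion is an assumption for illustration, not the paper's method.
    """
    scores = [(frame_difference(frames[i - 1], frames[i]), i)
              for i in range(1, len(frames))]
    top = sorted(scores, reverse=True)[:k]
    return sorted(i for _, i in top)


def skim_indices(num_frames, key_frames, radius):
    """Indices of the skimmed clip: excerpts of +/- radius frames around each key-frame."""
    keep = set()
    for k in key_frames:
        keep.update(range(max(0, k - radius), min(num_frames, k + radius + 1)))
    return sorted(keep)


# Toy clip: each "frame" is a flattened list of two pixel intensities.
frames = [[0, 0], [0, 0], [5, 5], [5, 5], [5, 5], [5, 5], [9, 9], [9, 9]]
keys = select_key_frames(frames, k=2)
skimmed = skim_indices(len(frames), keys, radius=1)
print(keys, skimmed)
```

Only the frames listed in `skimmed` would be passed to the classifier, which is where the reduction in space and run time comes from.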
Motion estimation is a basic issue for many computer vision tasks, such as human-computer interaction, moving object detection and intelligent robots. In many practical scenes, object movement is accompanied by camera motion. In such cases, motion descriptors based directly on optical flow are generally inaccurate and have low discriminative power. To address this, this study proposes a novel motion correction method and a novel motion feature descriptor, the motion difference histogram (MDH), for recognising human actions. Motion estimation results are corrected by background motion estimation, and MDH encodes the motion difference between the background and the objects. Experimental results on video shot with camera motion show that the proposed motion correction method is effective and that the recognition accuracy of MDH is better than that of the state-of-the-art motion descriptor.
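The idea of encoding motion relative to the background can be sketched as follows. The abstract does not define MDH precisely, so this is an illustration under stated assumptions rather than the authors' descriptor: the background motion is taken to be the per-component median of the flow field, and the histogram bins the orientation of the residual flow, weighted by its magnitude. The function name and parameters are hypothetical.

```python
import math
from statistics import median


def motion_difference_histogram(flow, bins=8, min_mag=1e-6):
    """Orientation histogram of flow after subtracting an estimated background motion.

    flow: list of (u, v) optical-flow vectors.
    Background motion is approximated by the median flow (an assumption).
    """
    bg_u = median(u for u, _ in flow)
    bg_v = median(v for _, v in flow)
    hist = [0.0] * bins
    for u, v in flow:
        du, dv = u - bg_u, v - bg_v          # motion difference to the background
        mag = math.hypot(du, dv)
        if mag < min_mag:                    # pixel moves with the background
            continue
        angle = math.atan2(dv, du) % (2 * math.pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[idx] += mag                     # magnitude-weighted vote
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Toy field: the camera pans right (1, 0); two vectors also carry upward object motion.
flow = [(1.0, 0.0)] * 6 + [(1.0, 2.0), (1.0, 3.0)]
hist = motion_difference_histogram(flow, bins=4)
print(hist)
```

In this toy field the camera pan is cancelled out, so all of the histogram mass falls in the bin for the objects' own upward motion.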
ISBN: 9781450306164 (print)
Video stylization transfers a source video into an artistic version while maintaining temporal coherence between adjacent frames. In this paper, we formulate unsupervised example-based video stylization with a Markov random field model. Our algorithm uses an improved optical flow algorithm to maintain temporal coherence while improving the accuracy of estimation along motion boundaries. We also extend the algorithm to video personalization, in which human faces remain clear and distinguishable. Video personalization combines a series of techniques, including face detection and alignment, motion flow, skin detection, and illumination blending. Given a source video and a style template image, our algorithm produces the stylized and/or personalized video(s) automatically. Experimental results demonstrate that our algorithm performs excellently in both video stylization and personalization. Copyright 2011 ACM.
In this paper, we present a scheme for recognizing English characters in multi-scale and multi-oriented environments. Graphical documents such as maps consist of text lines that appear in different orientations. ...
Differential methods are frequently used techniques for optic flow computations. They can be classified into local methods such as the Lucas-Kanade technique or Bigün's structure tensor method, and into globa...
This paper presents a pronominal anaphora resolution (PAR) approach that uses global discourse knowledge along with other traditional features. So far, the features used to find the referent of an anaphoric pronoun have been computed locally. Normally, the sentence containing the anaphor and a few sentences immediately before it form the local context. In this process, the knowledge base is updated as more and more of the discourse is processed. Keeping this approach as the core, the present paper explores the use of prior knowledge obtained by examining the entire discourse (the whole article). Adding this processing step improves the PAR's efficiency. The improvement is demonstrated on the ICON 2011 Bangla dataset.
This paper presents a new idea for improving text detection and recognition performance by detecting defects in the text detection results. Despite the rapid development of powerful deep learning based models for sce...
Extraction of some meta-information from printed documents without an OCR approach is considered. It can be statistically verified that important terms in articles are printed in italic, bold and all-capital styles. Detecting these type styles helps in automatically extracting the lines containing titles, authors' names, subtitles and references, as well as sentences in which important terms occur. It also helps to improve OCR performance when reading italic text. Experimental results on the performance of the approach on good-quality as well as degraded document images are presented.
Optical character recognition (OCR) has been deployed in the past in different application areas such as automatic transcription and indexing of document images, reading aids for visually impaired persons, postal a...
The lower the resolution of a given text, the more difficult it becomes to segment it into single characters. The resolution of screen-rendered text can be very low. This paper focuses on smoothed screen-rendered text of very low resolution, with typical x-heights of 4 to 7 pixels, which is much lower than in other low-resolution OCR situations. We propose a recognition-based segmentation algorithm that uses over-segmentation by dynamic programming, candidate rating by single-character classifiers and a graph-based search algorithm for an optimal cut sequence. The algorithm is described in detail, and experimental results are presented that show its performance on example screenshot images taken from the public Screen-Word database.
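The search for an optimal cut sequence can be sketched as a shortest-path problem over candidate cut positions. This is a simplified illustration, not the paper's algorithm: the candidate cuts are given rather than computed, the search is a plain dynamic program over a left-to-right graph, and `toy_rate` is a hypothetical stand-in for the single-character classifiers (lower cost means a more plausible character).

```python
def best_cut_sequence(cuts, rate, max_span=3):
    """Choose the cheapest subset of candidate cuts.

    cuts: sorted candidate cut columns, including the left and right word borders.
    rate: callable (start, end) -> (label, cost) for the segment between two cuts.
    max_span: maximum number of over-segmentation pieces merged into one character.
    """
    n = len(cuts)
    inf = float("inf")
    best = [inf] * n          # best[j] = minimal cost to reach cut j
    back = [None] * n         # back[j] = (previous cut index, label of last segment)
    best[0] = 0.0
    for j in range(1, n):
        for i in range(max(0, j - max_span), j):
            if best[i] == inf:
                continue
            label, cost = rate(cuts[i], cuts[j])
            if best[i] + cost < best[j]:
                best[j] = best[i] + cost
                back[j] = (i, label)
    # Backtrack from the right border to recover the chosen cuts and labels.
    labels, path, j = [], [cuts[-1]], n - 1
    while j != 0:
        i, label = back[j]
        labels.append(label)
        path.append(cuts[i])
        j = i
    return path[::-1], "".join(reversed(labels))


# Hypothetical classifier output: known segments are cheap, everything else is costly.
TOY_SCORES = {(0, 4): ("n", 0.1), (4, 6): ("i", 0.1), (6, 10): ("m", 0.2)}


def toy_rate(start, end):
    return TOY_SCORES.get((start, end), ("?", 1.0))


path, text = best_cut_sequence([0, 2, 4, 6, 10], toy_rate)
print(path, text)
```

The spurious candidate cut at column 2 is discarded because merging the two pieces on either side of it yields a segment the classifier rates as more plausible.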