In this paper, the off-line Chinese character image is transformed into ellipse shapes of basic Chinese character strokes in different positions. The stroke "turning" and the joint or crossover of strokes is comb...
ISBN (digital): 9783642193187
ISBN (print): 9783642193170
This paper proposes a novel method to deal with the representation issue in texture classification. A learning framework for image descriptors is designed based on the Fisher separation criterion (FSC) to learn the most reliable and robust dominant pattern types, considering intra-class similarity and inter-class distance. Image structures are thus described by a new FSC-based learning (FBL) encoding method. Unlike previous hand-crafted encoding methods, such as LBP and SIFT, a supervised learning approach is used to learn an encoder from training samples. We find that such a learning technique can largely improve the discriminative ability and automatically achieve a good tradeoff between discriminative power and efficiency. The commonly used texture descriptor local binary pattern (LBP) is taken as an example in the paper, which yields the proposed FBL-LBP descriptor. We benchmark its performance by classifying textures in the Outex_TC_0012 database for rotation-invariant texture classification, the KTH-TIPS2 database for material categorization, and the Columbia-Utrecht (CUReT) database for classification under different views and illuminations. The promising results verify its robustness to image rotation, illumination changes, and noise. Furthermore, to validate the generalization to other problems, we extend the application to face recognition and evaluate the proposed FBL descriptor on the FERET face database. The results show that this descriptor is highly discriminative.
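For reference, the hand-crafted LBP baseline that the abstract contrasts with can be sketched as follows. This is a minimal sketch of the classic 8-neighbor LBP code and its histogram, not the learned FBL-LBP encoder of the paper. The function names `lbp_code` and `lbp_histogram`, the clockwise bit ordering, and the plain nested-list image format are assumptions made for illustration.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 patch given as a list of 3 rows."""
    center = patch[1][1]
    # Neighbors are visited clockwise, starting at the top-left pixel
    # (this ordering is an illustrative assumption).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbor at least as bright as the center sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a 2D list image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The 256-bin histogram is the usual texture feature built from these codes; a learned encoder would replace the fixed thresholding rule in `lbp_code`.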
ISBN (print): 9783642193170
Background subtraction plays an important role in many computer vision systems, yet in complex scenes it is still a challenging task, especially under illumination variations. In this work, we develop an efficient texture-based method to tackle this problem. First, we propose a novel adaptive epsilon LBP operator, in which the threshold is adaptively calculated as a compromise between two criteria, i.e., description stability and discriminative ability. Then, a naive Bayesian technique is adopted to model the probability distribution of local patterns at the pixel level, using only a single epsilon LBP pattern instead of an epsilon LBP histogram of a local region. Our approach is evaluated on several video sequences against traditional methods. Experiments show that our method is suitable for various scenes and, in particular, handles illumination variations robustly.
ISBN (digital): 9783642193187
ISBN (print): 9783642193170
How to fuse static and dynamic information is a key issue in event analysis. In this paper, we present a novel approach that combines appearance and motion information in a top-down manner for event recognition in real videos. Unlike the conventional bottom-up way, attention can be focused volitionally on top-down signals derived from task demands. A video is represented by a collection of spatio-temporal features, called video words, obtained by quantizing the spatio-temporal interest points (STIPs) extracted from the video. We propose two approaches to build class-specific visual or motion histograms for the corresponding features. The first uses the probability of a class given a visual or motion word: a high probability means more attention should be paid to this word. The second, in order to incorporate the negative information for each word, uses the mutual information between each word and the event label: high mutual information means high relevance between this word and the class label. Both methods not only characterize two aspects of an event, but also select the relevant words, which are all discriminative for the corresponding event. Experimental results on the TRECVID 2005 and the HOHA video corpora demonstrate that the mean average precision is improved by the proposed method.
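The two word-weighting signals described above can be illustrated with a small sketch computed from a 2x2 co-occurrence table of word presence against event label. This is only an illustration under stated assumptions: the abstract does not give the exact estimators, so the count-based estimates, the function names `class_posterior` and `mutual_information`, and the argument names are hypothetical.

```python
import math


def class_posterior(n11, n10):
    """Estimate P(class | word) from counts.

    n11: videos that contain the word and carry the event label.
    n10: videos that contain the word but not the label.
    """
    return n11 / (n11 + n10)


def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between word presence and event label.

    n01: videos with the label but without the word.
    n00: videos with neither the word nor the label.
    """
    total = n11 + n10 + n01 + n00
    # Each cell: (joint count, word-marginal count, label-marginal count).
    cells = [
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ]
    mi = 0.0
    for joint, word_marginal, label_marginal in cells:
        if joint > 0:  # empty cells contribute nothing
            ratio = joint * total / (word_marginal * label_marginal)
            mi += (joint / total) * math.log2(ratio)
    return mi
```

Unlike the posterior, the mutual information also uses the cells where the word is absent, which is one way to read the "negative information" mentioned in the abstract.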
Developments in image processing techniques have made digital images easy to retrieve. However, applications requiring content-based querying and searching of videos still remain challenging due to their huge amo...
ISBN (print): 9781450306164
Video stylization transfers a source video into an artistic version while maintaining temporal coherence between adjacent frames. In this paper, we formulate unsupervised example-based video stylization with a Markov random field model. In our algorithm, we implement an improved optical flow algorithm to maintain temporal coherence while improving the accuracy of estimation along motion boundaries. We also extend our algorithm to video personalization, in which human faces remain clear and distinguishable. A series of techniques are fused in video personalization, including face detection and alignment, motion flow, skin detection, and illumination blending. Given a source video and a style template image, our algorithm produces the stylized and/or personalized video(s) automatically. Experimental results demonstrate that our algorithm performs excellently in both video stylization and personalization. Copyright 2011 ACM.
ISBN (print): 9783642193088
In video surveillance, it is still difficult to segment moving objects accurately in complex scenes, since the most widely used algorithms are based on background subtraction. We propose an online and unsupervised technique to find the optimal segmentation in a Markov Random Field (MRF) framework. To improve the accuracy, color, locality, temporal coherence, and spatial consistency are fused together in the framework. The models of color, locality, and temporal coherence are learned online from complex scenes. A novel mixture of a nonparametric regional model and a parametric pixel-wise model is proposed to approximate the background color distribution. The foreground color distribution for every pixel is learned from neighboring pixels of the previous frame. The locality distributions of background and foreground are approximated with the nonparametric model. The temporal coherence is modeled with a Markov chain. Experiments on challenging videos demonstrate the effectiveness of our algorithm.
ISBN (print): 9781450306164
Humans are capable of describing objects using attributes, such as "the object looks circular and is man-made". Motivated by these high-level descriptions, we build a user-friendly 3D object retrieval system, where the user can browse the database and search for targeted objects using semantic attributes. The main advantage of our system is that it does not require the user to find or sketch a 3D object as the query for 3D object retrieval. Besides, to the best of our knowledge, our system has obtained the best retrieval performance on three popular benchmarks. Copyright 2011 ACM.
In document image analysis, and especially in handwritten document image recognition, standard datasets play a vital role in evaluating the performance of algorithms and comparing results obtained by different groups of researchers. In this paper, an unconstrained Persian handwritten text dataset (PHTD) is introduced. The PHTD contains 140 handwritten documents of three different categories written by 40 individuals. The total numbers of text-lines and words/subwords in the dataset are 1787 and 27073, respectively. In most of the PHTD documents, either overlapping or touching text-lines are present. The average number of text-lines per document in the PHTD is 13. Two types of ground truth, based on pixel information and content information, are generated for the dataset. With these two types of ground truth, the PHTD can be utilized in many areas of document image processing, such as sentence recognition/understanding, text-line segmentation, word segmentation, word recognition, and character segmentation. To provide a framework for other research, recent text-line segmentation results on this dataset are also reported.
ISBN (print): 9781450306164
This paper proposes a novel approach to single image super-resolution. First, an image up-sampling scheme is proposed which takes advantage of both bilateral filtering and mean shift image segmentation. Then we use a shock filter to enhance strong edges in the initial up-sampling result and obtain an intermediate high-resolution image. Finally, we enforce a reconstruction constraint on the high-resolution image so that fine details can be inferred by back projection. Since strong edges in the intermediate result are enhanced, ringing artifacts can be suppressed in the back projection step. We compare our algorithm with several state-of-the-art image super-resolution algorithms. Qualitative and quantitative experimental results demonstrate that our approach performs the best. Copyright 2011 ACM.
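The reconstruction constraint in the final step says that the high-resolution estimate, once downsampled, should reproduce the low-resolution input. A minimal 1D sketch of back projection under that constraint is given below. It is not the paper's pipeline: the box-average downsampling, the pixel-replication upsampling, the factor of 2, and the names `downsample`, `upsample`, and `back_project` are all assumptions chosen to keep the example short.

```python
def downsample(signal):
    """Halve the resolution by averaging adjacent pairs (length must be even)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample."""
    out = []
    for value in signal:
        out.extend([value, value])
    return out


def back_project(high_res, low_res, iterations=1):
    """Refine a high-resolution estimate so it agrees with the low-resolution input."""
    estimate = list(high_res)
    for _ in range(iterations):
        # Error between the observed low-res signal and the simulated one.
        residual = [y - s for y, s in zip(low_res, downsample(estimate))]
        # Project the error back onto the high-resolution grid.
        correction = upsample(residual)
        estimate = [x + c for x, c in zip(estimate, correction)]
    return estimate
```

With this particular pair of operators a single iteration already satisfies the constraint exactly; with smoother blur kernels, as used in practice, several iterations are typically needed.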