The statistical appearance of the face can vary due to factors such as pose, occlusion, expression and background, which makes building an efficient Face Recognition (FR) system a challenging task. This paper proposes four novel techniques, viz., Entropy based Face Segregation (EFS) and Double Wavelet Noise Removal (DWNR) as pre-processing techniques, the 1D Stationary Wavelet Transform (SWT) as feature extractor, and Conservative Binary Particle Swarm Optimization (CBPSO) as feature selector, to enhance the performance of the system. EFS is used to segregate the facial region, thus removing the cluttered background. DWNR uses a unique combination of the 2D Discrete Wavelet Transform (DWT), Wiener filtering and the 2D SWT for image denoising and contrast enhancement. The pre-processed image is then fed to a combination of the 1D DWT, 1D SWT and 1D Discrete Cosine Transform (DCT) to extract essential features. CBPSO is used to select a near-optimal feature subset and significantly reduce the computation time. The proposed algorithm is evaluated on four benchmark databases, viz., Color FERET, CMU PIE, Pointing Head Pose and Georgia Tech. Copyright 2014 ACM.
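A minimal sketch of a wavelet–Wiener–wavelet denoising chain in the spirit of DWNR, built with PyWavelets and SciPy, is given below. The wavelet family (`db2`), the universal soft threshold and the omission of the contrast-enhancement step are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch of a DWNR-like pre-processing chain (assumptions: db2 wavelet,
# universal soft threshold, level-1 transforms). Not the paper's exact method.
import numpy as np
import pywt
from scipy.signal import wiener

def soft_threshold_details(details, sigma, n):
    """Soft-threshold the (cH, cV, cD) detail subbands with a universal threshold."""
    thr = sigma * np.sqrt(2.0 * np.log(n))
    return tuple(pywt.threshold(d, thr, mode="soft") for d in details)

def dwnr_like(img, wavelet="db2"):
    img = img.astype(np.float64)
    # Rough noise estimate from the finest diagonal DWT subband (MAD rule).
    cA, details = pywt.dwt2(img, wavelet)
    sigma = np.median(np.abs(details[2])) / 0.6745
    # Stage 1: 2D DWT denoising.
    stage1 = pywt.idwt2((cA, soft_threshold_details(details, sigma, img.size)), wavelet)
    # Stage 2: adaptive Wiener filtering in the spatial domain.
    stage2 = wiener(stage1, mysize=3)
    # Stage 3: level-1 2D SWT (undecimated) denoising; image sides must be even.
    (swtA, swt_details), = pywt.swt2(stage2, wavelet, level=1)
    return pywt.iswt2([(swtA, soft_threshold_details(swt_details, sigma, img.size))], wavelet)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.tile(np.linspace(0, 255, 256), (256, 1))   # synthetic test image
    noisy = clean + rng.normal(0, 15, clean.shape)
    denoised = dwnr_like(noisy)
    print("input MSE :", np.mean((noisy - clean) ** 2))
    print("output MSE:", np.mean((denoised - clean) ** 2))
```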
This paper proposes knowledge sharing and cooperation based Adaptive Boosting (KSC-AdaBoost) for supervised collaborative learning in the presence of two different feature spaces (views) representing a training example. In such a binary learner space, two learner agents are trained on the two feature spaces. The difficulty of a training example is ascertained not only by the classification performance of an individual learner but also by the overall group performance on that example. Group learning is enhanced by a novel algorithm for assigning weights to the training set data. Three different models of KSC-AdaBoost are proposed for agglomerating the decisions of the two learners. KSC-AdaBoost outperforms traditional AdaBoost and some recent variants of AdaBoost in terms of convergence rate of the training set error and generalization accuracy. The paper then presents a KSC-AdaBoost based hierarchical model for accurate eye region localization, followed by a fuzzy rule based system for robust eye center detection. Exhaustive experiments on five publicly available popular datasets reveal the viability of the learning models and superior eye detection accuracy over recent state-of-the-art algorithms. Copyright 2014 ACM.
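The abstract does not give the KSC-AdaBoost update rules; the sketch below only illustrates the general idea of boosting two view-specific weak learners with a shared, group-aware example weighting, using scikit-learn decision stumps. The specific choices (averaging the two per-example errors, summing alpha-weighted votes) are assumptions.

```python
# Illustrative two-view boosting loop with a shared, group-aware sample weight.
# This is NOT the KSC-AdaBoost update; the joint weighting rule is an assumption.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def two_view_boost(X1, X2, y, rounds=20):
    n = len(y)
    w = np.full(n, 1.0 / n)                      # shared example weights
    ensemble = []                                # (alpha, stump_view1, stump_view2)
    for _ in range(rounds):
        h1 = DecisionTreeClassifier(max_depth=1).fit(X1, y, sample_weight=w)
        h2 = DecisionTreeClassifier(max_depth=1).fit(X2, y, sample_weight=w)
        m1, m2 = h1.predict(X1) != y, h2.predict(X2) != y
        # Group error: an example counts as hard in proportion to how many
        # of the two learners misclassify it (assumed rule).
        group_err = (m1.astype(float) + m2.astype(float)) / 2.0
        eps = np.clip(np.dot(w, group_err), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)
        w *= np.exp(alpha * group_err)           # up-weight jointly hard examples
        w /= w.sum()
        ensemble.append((alpha, h1, h2))
    return ensemble

def predict(ensemble, X1, X2):
    # Agglomerate the two views by summing their alpha-weighted votes.
    score = sum(a * (h1.predict(X1) + h2.predict(X2)) for a, h1, h2 in ensemble)
    return np.sign(score)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = np.where(rng.random(300) > 0.5, 1, -1)
    X1 = y[:, None] * 0.8 + rng.normal(size=(300, 5))    # view 1
    X2 = y[:, None] * 0.5 + rng.normal(size=(300, 3))    # view 2
    ens = two_view_boost(X1, X2, y)
    print("training accuracy:", np.mean(predict(ens, X1, X2) == y))
```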
Head pose classification from images acquired by far-field cameras is a challenging problem because of the low resolution, blur and noise due to subject movements. Further, there exists a domain shift in head pose classification between training (source) and testing (target) images. Moreover, the head poses in the target set often do not exist in the source set, and acquiring sufficient samples for training is quite expensive. In this paper, we propose a novel framework to address multi-view unseen head pose classification, where the target set belongs to a different domain and has more classes than the source. A correlated subspace is first derived using Canonical Correlation Analysis (CCA) between corresponding head poses in the source (stationary subjects) and target (moving subjects) sets. A distance based domain adaptation technique is then used in the correlation subspace for classification of unseen head poses in the target set. Experimental results confirm the effectiveness of our approach in improving classification performance over the state of the art. Copyright is held by the authors.
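A minimal sketch of the project-and-match idea: fit CCA on paired source/target examples of the shared poses, project both sets into the correlated subspace, and label target projections by distance to class centroids there. The synthetic data and the nearest-centroid rule are assumptions; the paper's exact distance-based domain adaptation scheme is not reproduced.

```python
# Sketch: CCA-derived correlated subspace + nearest-centroid matching.
# The paired-pose setup and nearest-centroid rule are illustrative assumptions.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(2)

# Synthetic stand-ins: paired source/target features for 4 shared head poses.
n_per_pose, n_poses, d_src, d_tgt = 30, 4, 20, 15
poses = np.repeat(np.arange(n_poses), n_per_pose)
latent = rng.normal(size=(len(poses), 5)) + poses[:, None]          # pose-dependent latent
X_src = latent @ rng.normal(size=(5, d_src)) + 0.1 * rng.normal(size=(len(poses), d_src))
X_tgt = latent @ rng.normal(size=(5, d_tgt)) + 0.3 * rng.normal(size=(len(poses), d_tgt))

# 1. Learn a correlated subspace from corresponding source/target head poses.
cca = CCA(n_components=3).fit(X_src, X_tgt)
Z_src, Z_tgt = cca.transform(X_src, X_tgt)

# 2. Class centroids from the labeled source projections.
centroids = np.stack([Z_src[poses == k].mean(axis=0) for k in range(n_poses)])

# 3. Distance-based labeling of target projections in the correlated subspace.
dists = np.linalg.norm(Z_tgt[:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)
print("nearest-centroid accuracy on target projections:", np.mean(pred == poses))
```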
The paper presents a novel learning-based framework to identify tables from scanned document images. The approach is designed as a structured labeling problem, which learns the layout of the document and labels its various entities as table header, table trailer, table cell and non-table region. We develop features which encode the foreground block characteristics and the contextual information. These features are provided to a fixed point model which learns the inter-relationship between the blocks. The fixed point model attains a contraction mapping and provides a unique label to each block. We compare the results with Conditional Random Fields (CRFs). Unlike CRFs, the fixed point model captures the context information, in terms of the neighbourhood layout, more efficiently. Experiments on images from the UW-III (University of Washington) dataset, the UNLV dataset and our own dataset of document images with multi-column page layouts show the applicability of our algorithm in layout analysis and table detection. Copyright 2014 ACM.
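The contraction-mapping intuition behind a fixed point model can be illustrated with a toy labeler: each block's soft label is repeatedly re-estimated from its own features plus its neighbours' current labels until the labels stop changing. The features, chain neighbourhood and logistic classifier below are assumptions, not the paper's block features or model.

```python
# Toy fixed-point labeling loop: each block's label is re-estimated from its own
# features plus the current soft labels of its neighbours, iterated to a fixed point.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fixed_point_labels(block_feats, neighbours, clf, n_iter=20, tol=1e-4):
    """block_feats: (N, d) per-block features; neighbours: list of index lists."""
    soft = np.full(len(block_feats), 0.5)             # initial label beliefs
    for _ in range(n_iter):
        ctx = np.array([soft[nb].mean() if nb else 0.5 for nb in neighbours])
        X = np.column_stack([block_feats, ctx])       # own features + context
        new_soft = clf.predict_proba(X)[:, 1]
        if np.max(np.abs(new_soft - soft)) < tol:     # contraction: beliefs converge
            break
        soft = new_soft
    return (soft > 0.5).astype(int), soft

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    # Synthetic "blocks": one feature correlated with a table/non-table label,
    # plus a chain neighbourhood (each block sees its left/right neighbour).
    y = (rng.random(200) > 0.5).astype(int)
    feats = y[:, None] + 0.8 * rng.normal(size=(200, 1))
    neighbours = [[j for j in (i - 1, i + 1) if 0 <= j < 200] for i in range(200)]
    ctx_true = np.array([y[nb].mean() for nb in neighbours])
    clf = LogisticRegression().fit(np.column_stack([feats, ctx_true]), y)
    pred, _ = fixed_point_labels(feats, neighbours, clf)
    print("block labeling accuracy:", np.mean(pred == y))
```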
Content Based Image Retrieval (CBIR) techniques retrieve similar digital images from a large database. As the user often does not provide any clue (indication) of the region of interest in a query image, most CBIR methods rely on a representation of the global content of the image. The desired content in an image is often localized (e.g. a car appearing salient in a street) rather than holistic, demanding an object-centric CBIR. We propose a biologically inspired framework, WOW ("What" Object is "Where"), for this purpose. The design of the WOW framework is motivated by the cognitive model of human visual perception and feature integration theory (FIT). The key contributions of the proposed approach are: (i) a feedback mechanism between the recognition ("What") and localization ("Where") modules (both supervised), for a cohesive decision based on mutual consensus; (ii) a hierarchy of visual features (based on FIT) for an efficient recognition task. Integration of information from the two channels ("What" and "Where") in an iterative feedback mechanism helps to filter erroneous contents in the outputs of the individual modules. Finally, using a similarity criterion based on HOG features (spatially localized by WOW) for matching, our system effectively retrieves a set of rank-ordered samples from the gallery. Experimentation on various real-life datasets (including PASCAL) exhibits the superior performance of the proposed method. Copyright 2014 ACM.
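The final matching step, ranking gallery images by similarity of HOG descriptors computed on the spatially localized region, can be sketched with scikit-image as below. The HOG parameters, the fixed 128x128 resize and cosine similarity are assumptions; the WOW what/where feedback loop itself is not reproduced.

```python
# Sketch of the HOG-based matching/ranking step (not the WOW what/where loop itself).
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

def hog_descriptor(region, size=(128, 128)):
    """HOG descriptor of a localized (grayscale) object region."""
    patch = resize(region, size, anti_aliasing=True)
    return hog(patch, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def rank_gallery(query_region, gallery_regions):
    """Return gallery indices ordered by descending cosine similarity to the query."""
    q = hog_descriptor(query_region)
    sims = []
    for g in gallery_regions:
        d = hog_descriptor(g)
        sims.append(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d) + 1e-12))
    return np.argsort(sims)[::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    query = rng.random((100, 120))                    # stand-in localized regions
    gallery = [rng.random((90, 110)) for _ in range(5)] + [query.copy()]
    print("ranked gallery indices:", rank_gallery(query, gallery))  # index 5 should rank first
```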
Super pixels, which are the result of over-segmentation, provide a reasonable compromise between working at the pixel level and working with a few optimally segmented regions. One fundamental challenge is that of defining the search space for merging. A naive approach of performing iterative clustering on the local neighborhood would be prone to under-segmentation. In this paper, we develop a framework for generating non-compact super pixels by performing clustering on compact super pixels. We define the optimal search space by generating both over-segmented and under-segmented clusterings of compact super pixels. Using the spatial information of the under-segmented scale, we look to improve the over-segmented scale. Our work is based on performing Kernel Density Estimation in 1D and further refining it using angular quantization. In all, we propose three angular quantization formulations to generate the three scales of segmentation. Our results and comparison with state-of-the-art super pixel algorithms show that merging a large number of super pixels with our algorithm provides better results than using the underlying super pixel algorithm to obtain a smaller number of super pixels. Copyright 2014 ACM.
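One way to read the 1D KDE step is: describe each compact super pixel by a scalar statistic, estimate that statistic's density, and merge super pixels that fall into the same density mode. The sketch below does exactly that with SLIC super pixels and SciPy's Gaussian KDE; the choice of mean intensity as the 1D variable and the valley-based mode splitting are assumptions, and the angular quantization refinement is not shown.

```python
# Sketch: merge compact SLIC super pixels by clustering a 1D statistic with KDE.
# Mean intensity and valley-based mode splitting are assumptions; the paper's
# angular quantization refinement is not reproduced here.
import numpy as np
from scipy.stats import gaussian_kde
from skimage.data import astronaut
from skimage.color import rgb2gray
from skimage.segmentation import slic

img = astronaut()
gray = rgb2gray(img)
labels = slic(img, n_segments=400, compactness=10, start_label=0)  # compact super pixels

# 1D statistic per super pixel: mean intensity.
n_sp = labels.max() + 1
means = np.array([gray[labels == i].mean() for i in range(n_sp)])

# Kernel density estimate of the 1D statistic and its local minima (mode boundaries).
grid = np.linspace(0.0, 1.0, 512)
density = gaussian_kde(means)(grid)
valleys = grid[1:-1][(density[1:-1] < density[:-2]) & (density[1:-1] < density[2:])]

# Assign each super pixel to the density mode its mean falls into -> merged regions.
merged_id = np.searchsorted(valleys, means)
merged_labels = merged_id[labels]
print(f"{n_sp} compact super pixels merged into {len(np.unique(merged_id))} non-compact regions")
```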
In this paper, we study methods for learning classifiers for the case when there is variation introduced by an underlying continuous parameter θ representing transformations like blur, pose, time, etc. First, we consider the task of learning dictionary-based representations for such cases. Sparse representations driven by data-derived dictionaries have produced state-of-the-art results in various image restoration and classification tasks. While significant advances have been made in this direction, most techniques have focused on learning a single dictionary to represent all variations in the data. In this paper, we show that dictionary learning can be significantly improved by explicitly parameterizing the dictionaries by θ. We develop an optimization framework to learn parametric dictionaries that vary smoothly with θ. We propose two optimization approaches: (a) a least squares approach, and (b) a regularized K-SVD approach. Furthermore, we analyze the variations in data induced by θ from a different yet related perspective of feature augmentation. Specifically, we extend the feature augmentation technique proposed for adaptation of discretely separable domains to continuously varying domains, and propose a Mercer kernel to account for such changes. We present experimental validation of the proposed techniques using both synthetic and real datasets. Copyright 2014 ACM.
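A minimal reading of the least squares variant: learn one dictionary per observed θ, then fit each atom entry's trajectory across θ with a low-order polynomial by least squares, so that D(θ) can be evaluated smoothly at unseen θ. The per-θ dictionary learner (scikit-learn's MiniBatchDictionaryLearning, warm-started to keep atoms roughly in correspondence) and the quadratic model in θ are assumptions, not the paper's exact formulation.

```python
# Sketch of a least-squares parametric dictionary D(theta) fitted across observed thetas.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(5)
thetas = np.linspace(0.0, 1.0, 5)                       # observed parameter values
n_atoms, dim = 16, 32

# Synthetic data whose structure drifts smoothly with theta.
base = rng.normal(size=(n_atoms, dim))
def sample_data(theta, n=300):
    codes = rng.laplace(scale=0.3, size=(n, n_atoms))
    return codes @ (base + theta * np.roll(base, 1, axis=1))

# 1. Learn one dictionary per theta, warm-starting to keep atoms roughly aligned.
dicts, init = [], None
for t in thetas:
    dl = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=0.5,
                                     dict_init=init, random_state=0)
    dl.fit(sample_data(t))
    init = dl.components_
    dicts.append(dl.components_)                        # shape (n_atoms, dim)
D = np.stack(dicts)                                     # (n_theta, n_atoms, dim)

# 2. Least-squares fit of every atom entry as a quadratic function of theta.
V = np.vander(thetas, 3)                                # design matrix [theta^2, theta, 1]
coeffs, *_ = np.linalg.lstsq(V, D.reshape(len(thetas), -1), rcond=None)

def dictionary_at(theta):
    """Evaluate the smoothly parameterized dictionary D(theta)."""
    return (np.vander([theta], 3) @ coeffs).reshape(n_atoms, dim)

D_mid = dictionary_at(0.5)                              # dictionary at theta = 0.5
print("relative gap to the learned D(0.5):",
      np.linalg.norm(D_mid - D[2]) / np.linalg.norm(D[2]))
```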
Abnormality detection in crowded scenes plays a very important role in automatic monitoring of surveillance feeds. Here we present a novel framework for abnormality detection in crowd videos. The key idea of the approach is that rarely or sparsely occurring events correspond to abnormal activities, while commonly occurring events correspond to normal activities. Given an input video, multiple feature matrices are computed and decomposed into their low-rank and sparse components, of which the sparse components correspond to the abnormal activities. The approach does not require any explicit modeling of crowd behavior or training. Localization of the anomalies is obtained as a by-product of the proposed approach by performing an inverse mapping between the entries of the matrix and the pixels in the video frames. The method is very general: it can be applied to both sparsely and densely crowded scenes, and it can be used to detect both global and local abnormalities. Experimental evaluation on two widely used datasets, as well as some dense crowd videos downloaded from the web, shows the effectiveness of the proposed approach. Comparison with several state-of-the-art crowd abnormality detection approaches shows that the proposed method performs competitively. Copyright is held by the authors.
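The low-rank plus sparse decomposition at the core of such methods is commonly solved by principal component pursuit; a compact inexact augmented Lagrange multiplier (IALM) solver is sketched below on a synthetic feature matrix. How the feature matrices are built from the video, and how the sparse part is thresholded and mapped back to pixels, are not shown here and the solver parameters are standard defaults rather than the paper's.

```python
# Compact Robust PCA (principal component pursuit) via inexact ALM: M ~= L + S,
# with L low-rank ("normal" activity) and S sparse ("abnormal" entries).
import numpy as np

def shrink(X, tau):
    """Soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_shrink(X, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca(M, max_iter=500, tol=1e-7):
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))
    mu = 0.25 * m * n / np.abs(M).sum()
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        resid = M - L - S
        Y += mu * resid
        if np.linalg.norm(resid) / np.linalg.norm(M) < tol:
            break
    return L, S

if __name__ == "__main__":
    rng = np.random.default_rng(6)
    low_rank = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 80))   # common activity
    sparse = np.zeros((100, 80))
    idx = rng.choice(100 * 80, 40, replace=False)
    sparse.flat[idx] = rng.normal(scale=10, size=40)                  # rare entries
    L, S = rpca(low_rank + sparse)
    print("recovered sparse support overlap:", np.mean((np.abs(S) > 1.0).flat[idx]))
```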
A high-level abstraction of the behavior of a moving object can be obtained by analyzing its trajectory. However, traditional trajectories or tracklets are bound by the limitations of the underlying tracking algorithm used. In this paper, we propose a novel idea of detecting anomalous objects amid other moving objects in a video based on their short history. This history is defined as a short local trajectory (SLT). The unique approach of generating SLTs from super-pixels belonging to a foreground object, incorporating both spatial and temporal information, is the key to the anomaly detection. Additionally, the proposed trajectory extraction is robust across videos having different crowd densities, occlusions, etc. Generally, the trajectories of persons/objects moving in a particular region under usual conditions have certain fixed characteristics; we therefore use a Hidden Markov Model (HMM) to capture the usual trajectory patterns during training. During detection, the proposed algorithm takes SLTs as observations for each super-pixel and measures their likelihood of being anomalous using the learned HMMs. Furthermore, we compute a spatial consistency measure for each SLT based on the neighboring trajectories. Thus, the anomalies detected by the proposed approach are highly localized, as demonstrated by experiments conducted on two widely used anomaly datasets, namely UCSD Ped1 and UCSD Ped2. Copyright 2014 ACM.
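A minimal sketch of the train/score split with hmmlearn: fit a GaussianHMM on per-frame displacement features of normal short trajectories, then flag test trajectories whose per-observation log-likelihood falls below a threshold. The feature choice, the number of hidden states and the percentile threshold are assumptions; the super-pixel based SLT extraction and the spatial consistency measure are not reproduced.

```python
# Sketch: score short local trajectories against an HMM learned on normal motion.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(7)

def make_tracks(n, mean_step, noise, length=9):
    """Synthetic SLT stand-ins: sequences of per-frame displacement vectors."""
    return [mean_step + noise * rng.normal(size=(length, 2)) for _ in range(n)]

normal_tracks = make_tracks(200, mean_step=np.array([1.0, 0.0]), noise=0.2)  # usual motion
fast_tracks = make_tracks(10, mean_step=np.array([4.0, 1.5]), noise=0.2)     # unusual motion

# Train one HMM on the usual trajectory patterns.
X_train = np.vstack(normal_tracks)
lengths = [len(t) for t in normal_tracks]
hmm = GaussianHMM(n_components=3, covariance_type="diag", n_iter=50, random_state=0)
hmm.fit(X_train, lengths)

# Score test trajectories; low average log-likelihood per observation -> anomaly.
def avg_loglik(track):
    return hmm.score(track) / len(track)

threshold = np.percentile([avg_loglik(t) for t in normal_tracks], 1)   # assumed threshold
for name, tracks in [("normal", normal_tracks[:5]), ("anomalous", fast_tracks[:5])]:
    flags = [avg_loglik(t) < threshold for t in tracks]
    print(name, "flagged as anomaly:", flags)
```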
We describe a system for active stabilization of cameras mounted on highly dynamic robots. To focus on careful performance evaluation of the stabilization algorithm, we use a camera mounted on a robotic test platform that can have unknown perturbations in the horizontal plane, a commonly occurring scenario in mobile robotics. We show that the camera can be effectively stabilized using an inertial sensor and a single additional motor, without a joint position sensor. The algorithm uses an adaptive controller based on a model of the vertebrate cerebellum for velocity stabilization, with additional drift correction. We have also developed a resolution-adaptive retinal slip algorithm that is robust to motion blur. We evaluated the performance quantitatively using another high-speed robot to generate repeatable sequences of large and fast movements that a gaze stabilization system can attempt to counteract. Thanks to the high-accuracy repeatability, we can make a fair comparison of algorithms for gaze stabilization. We show that the resulting system can reduce camera image motion to about one pixel per frame on average even when the platform is rotated at 200 degrees per second. As a practical application, we also demonstrate how the common task of face detection benefits from active gaze stabilization. Copyright 2014 ACM.
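A toy version of the control idea: an adaptive feedforward gain, updated online from the measured retinal slip in the spirit of cerebellar adaptive-filter models, maps the gyro-measured platform velocity to a counter-rotation command, with a slow drift-correction term re-centering the camera. The plant model, gains and learning rate below are simulation assumptions, not the paper's controller.

```python
# Toy gaze-stabilization loop: adaptive feedforward gain (slip-driven LMS-style update)
# from gyro velocity to counter-rotation, plus slow drift correction. All gains assumed.
import numpy as np

rng = np.random.default_rng(8)
dt, steps = 0.01, 2000
lr, drift_gain = 0.002, 0.5          # adaptation rate and drift-correction gain (assumed)

w = 0.0                              # adaptive feedforward gain (learned online)
cam_angle = 0.0                      # camera angle relative to the world (deg)
slips = []

for k in range(steps):
    platform_vel = 200.0 * np.sin(2 * np.pi * 0.5 * k * dt)     # deg/s perturbation
    gyro = platform_vel + rng.normal(scale=2.0)                  # noisy inertial reading

    # Feedforward counter-rotation plus slow re-centering of the camera.
    motor_vel = -w * gyro - drift_gain * cam_angle
    cam_angle += (platform_vel + motor_vel) * dt                 # simple additive plant

    # Retinal slip = residual image motion per frame; used as the teaching signal.
    retinal_slip = (platform_vel + motor_vel) * dt
    w += lr * retinal_slip * gyro * dt                           # LMS-style adaptation
    slips.append(abs(retinal_slip))

print(f"learned gain w = {w:.3f} (ideal ~1.0)")
print(f"mean |retinal slip| (deg/frame), first 10% vs last 10%: "
      f"{np.mean(slips[:steps//10]):.3f} vs {np.mean(slips[-steps//10:]):.3f}")
```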