ISBN: (print) 9781467385640
Human authentication is now a crucial social problem. This paper presents a highly reliable multimodal authentication system that fuses iris, finger-knuckle-print and palmprint image matching scores. Segmented regions of interest (ROIs) are preprocessed using DCP (Differential Code Pattern) to obtain robust corner features, which are then matched using a GOF (Global Optical Flow) based dissimilarity measure. The proposed system has been tested on public databases: CASIA Interval and Lamp (iris), PolyU (finger-knuckle-print), and PolyU and CASIA (palmprint). It performs well on all unimodal databases, while on multimodal databases (fusion of all three traits) it achieves perfect performance (i.e., CRR = 100% with EER = 0%).
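To make the reported EER and the idea of score-level fusion concrete, the sketch below estimates an equal error rate from dissimilarity scores and fuses three per-trait scores. It is only an illustration: the abstract does not state the fusion rule, so the plain average used here, the assumption that scores are already normalized to a common range, and all function names are hypothetical.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) for dissimilarity scores, accepting when score <= threshold."""
    far = sum(s <= threshold for s in impostor) / len(impostor)
    frr = sum(s > threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed score as a threshold.

    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def fuse_scores(iris, knuckle, palm):
    """Assumed fusion rule: unweighted mean of three normalized dissimilarity scores."""
    return (iris + knuckle + palm) / 3


# Perfectly separated genuine and impostor scores give EER = 0.
print(equal_error_rate([0.1, 0.2, 0.3], [0.7, 0.8, 0.9]))
# Overlapping score distributions give a non-zero EER.
print(equal_error_rate([0.1, 0.2, 0.3, 0.6], [0.5, 0.7, 0.8, 0.9]))
```

An EER of 0% means some threshold separates every genuine comparison from every impostor comparison, which is what the first call illustrates.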
Analyzing a very long video and semantically describing its contents is a challenging task in computer vision. Present approaches such as video shot detection and summarization address this problem only partially, while maintaining temporal coherency. To reduce the effort a user spends watching the whole video, we introduce a new technique that combines similar content irrespective of the time instants at which it occurs. In this approach, we automatically identify only the representative frames corresponding to similar scenes captured at different instants of time. We also provide labels for the objects present in the representative frames, along with a compact representation of the video. We perform semantic labelling of frames in a unified deep learning framework that uses pre-trained features from a convolutional neural network. We show that the proposed approach addresses semantic labelling effectively, as justified by results obtained for videos of different scenes captured through different modalities.
In this paper, a fractional order total variation (TV) model is presented for estimating optical flow in image sequences. The proposed fractional order model generalizes a variational flow model formed with a quadratic term and a total variation term. However, this generalized model is difficult to solve because the total variation regularization term is non-differentiable. The Grünwald-Letnikov derivative is used to discretize the fractional order derivative, and the resulting formulation is solved using an efficient numerical algorithm. The experimental results verify that the proposed model yields a dense flow and preserves discontinuities in the flow field. Moreover, it is significantly robust against outliers.
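As an illustration of the discretization step only, the following sketch computes the standard Grünwald-Letnikov weights and applies them to a 1-D signal. It is not the paper's solver: the function names, the 1-D setting, the step size `h`, and the zero padding before the first sample are assumptions made for the example.

```python
def gl_weights(alpha, n):
    """Return the first n Grunwald-Letnikov weights w_k = (-1)^k * C(alpha, k)."""
    weights = [1.0]
    for k in range(1, n):
        # Standard recurrence: w_k = w_{k-1} * (1 - (alpha + 1) / k)
        weights.append(weights[-1] * (1.0 - (alpha + 1.0) / k))
    return weights


def gl_derivative(samples, alpha, h=1.0):
    """Approximate the order-alpha derivative at every sample of a 1-D signal.

    Uses the backward sum h^(-alpha) * sum_k w_k * f[i - k]; samples before
    index 0 are treated as zero (an assumption of this sketch).
    """
    w = gl_weights(alpha, len(samples))
    result = []
    for i in range(len(samples)):
        acc = sum(w[k] * samples[i - k] for k in range(i + 1))
        result.append(acc / (h ** alpha))
    return result


# For alpha = 1 the weights reduce to the ordinary backward difference.
print(gl_weights(1.0, 4))
# A fractional order such as alpha = 0.5 keeps a long tail of non-zero weights.
print(gl_weights(0.5, 4))
```

With `alpha = 1` only the first two weights are non-zero, so the scheme reduces to the usual first-order backward difference. For non-integer orders every past sample contributes, which is what makes the fractional derivative non-local.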
In this paper we present a new technique for interacting with the computer in a non-tangible way. Specifically, we have designed a Media Player system controlled by Facial Expressions and Gestures (MP-FEG). We detect and track one landmark point on the finger and 18 landmark points on the lips to capture the movement of the user's finger and lips. The movement patterns are classified into hand gestures and facial expressions using a support vector machine (SVM). We have achieved recognition accuracies of approximately 98.65% and approximately 100% for hand gestures and facial expressions, respectively. The occurrence of each of these actions (5 hand gestures and 3 facial expressions) is associated with a command to control the video player (e.g., to select, play, or pause the video). A perceptual quality analysis by user survey rates the experience of the non-tangible human-computer interaction facilitated by the proposed technique as 'good'.
We use the RGB-D technology of Kinect to control an application with hand gestures, with PowerPoint as the test application. The system can start or end a PowerPoint presentation, navigate between slides, capture or release control of the cursor, and move the cursor through natural gestures. Such a system is useful and hygienic in kitchens, lavatories, hospital ICUs for touch-less surgery, and the like. The challenge is to extract meaningful gestures from continuous hand motions. We propose a system that recognizes isolated gestures from continuous hand motions for multiple gestures in real time. Experimental results show that the system has 96.48% precision (at 96.00% recall) and performs better than the Microsoft Gesture Recognition library for swipe gestures.
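For readers less familiar with the two reported metrics, the sketch below shows how precision and recall are computed from detection counts. The counts in the usage line are hypothetical and are not taken from the paper's experiments.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw detection counts."""
    detected = true_positives + false_positives  # everything the system reported
    actual = true_positives + false_negatives    # every gesture actually performed
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 96 correct detections, 4 spurious ones, 4 missed gestures.
print(precision_recall(96, 4, 4))
```

Precision penalizes spurious detections during continuous hand motion, while recall penalizes gestures the system failed to spot.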
Instance retrieval (IR) is the problem of retrieving specific instances of a particular object, such as a monument, from a collection of images. Currently, the most popular methods for IR use bag-of-words (BoW) features for retrieval. However, a prominent problem for IR remains the tendency of BoW-based methods to retrieve near-identical images as the most relevant results. In this paper, we define diversity in IR as variation of physical properties among the most relevant retrieved results for a query image. To achieve this diversity, we propose both an ITML algorithm that reshapes the BoW feature space into one that better reflects diversity, and a measure to evaluate diversity in retrieval results for IR applications. Additionally, we generate 200 hand-labeled images from the Paris dataset for use in further research in this area. Experiments on the popular Paris dataset show that our method outperforms the standard BoW model in many cases.
This paper proposes a novel face recognition method for challenging profile and frontal faces. The proposed face recognition system consists of pre-processing, feature extraction and classification components. For pre-processing, the face region is extracted using facial landmark points obtained by the tree-structured part model. During feature extraction, SIFT descriptors are computed from the detected face region, and a Spatial Pyramid Matching approach based on Locality-constrained Linear Coding is employed for feature representation. Finally, a multi-class linear SVM classifier performs the classification. Extensive experiments show that the proposed algorithm performs satisfactorily compared to existing methods on the IITK, CASIA-FACE-V5, LIBOR, ORL and Extended YALE-B face databases.
Lung tumor estimation on imaging modalities is required to assess the extent of the tumor for diagnosis. Segmentation of tumors in Cone-Beam Computed Tomography (CBCT) images is non-trivial due to imaging artifacts. Here we propose a novel technique for registering 18-fluorodeoxyglucose Positron Emission Tomography (PET) and Computed Tomography (CT) images with CBCT images. The computation is performed in two stages. In the first stage, mutual information based rigid image registration is performed to obtain a rough global alignment of the CBCT image with the corresponding PET and CT images. This result is fed to the second stage, which performs deformable image registration between a pair of corresponding CBCT volumes of the same patient captured at different time instants, using a viscous fluid model. The technique is adapted to both 2D (for slice-wise computation) and 3D space (for computing with volumes), and a comparative performance evaluation is presented with a simulated deformation model.
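The first stage relies on mutual information as the similarity measure between images from different modalities. The sketch below computes that measure for two flattened images with integer intensities. It is an illustration under stated assumptions, not the paper's registration code: the function name, the flat-list image representation, and the absence of intensity binning are choices made for the example.

```python
import math
from collections import Counter


def mutual_information(img_a, img_b):
    """Mutual information (in bits) between two equally sized intensity lists.

    Each list holds the integer intensities of one image, flattened in the
    same pixel order, so position i in both lists refers to the same location.
    """
    if len(img_a) != len(img_b) or not img_a:
        raise ValueError("images must be non-empty and of equal size")
    n = len(img_a)
    joint = Counter(zip(img_a, img_b))  # joint intensity histogram
    hist_a = Counter(img_a)             # marginal histogram of image A
    hist_b = Counter(img_b)             # marginal histogram of image B
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = hist_a[a] / n
        p_b = hist_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Identical images share all their information (1 bit for two equiprobable levels).
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
# Statistically independent intensity patterns share none.
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

A rigid registration loop would evaluate this measure for candidate translations and rotations and keep the transform that maximizes it. Because it depends only on the statistical relationship between intensities, it remains usable when PET, CT and CBCT assign different values to the same tissue.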
One of the important requirements for a good object detector is a set of robust visual features. These features, extracted from reference images containing the desired object instance, are used to identify objects in test images. In this paper, we propose a new feature set for object detection, called the Histogram of Radon Projections (HRP). To compute this feature descriptor, the image is first divided into smaller cells. For each cell, the Radon transform values are calculated for different orientations, and weighted votes for each transform coefficient are accumulated into bins. These bin values are block-normalized and collected together to form the final descriptor. We use this descriptor for car detection in gray-scale images and pedestrian detection in RGB images. Compared with HOG, the new descriptor performs better for both gray-scale and RGB images.
Attribute-based facial image retrieval has a wide range of applications, such as law enforcement and online social networks. The problem becomes more challenging if the images are from different modalities. For example, the input may be a sketch or a composite image, and the task is to retrieve photo images that have the same facial attributes as the input. In this work, we propose a learning-based approach in which two transformations are learnt to transform the training images from the two modalities, with associated attribute annotations, such that images with similar attributes move closer to each other and images with very different attributes move farther apart in the transformed space. A query image is first transformed to the learnt space, in which images with similar attributes are then retrieved. The same framework works seamlessly whether the images to be retrieved are of the same modality as the query or a different one. The attributes of the query image are also obtained automatically as a byproduct of the algorithm. Extensive experimental evaluation on three datasets shows the effectiveness of the proposed approach.