ISBN: 9781467385640 (print)
In this paper we present a new technique for interacting with the computer in a non-tangible way. Specifically, we have designed a Media Player system controlled by Facial Expressions and Gestures (MP-FEG). We detect and track one landmark point on the finger and 18 landmark points on the lips to capture the movement of the user's finger and lips. The movement patterns are classified into hand gestures and facial expressions using a support vector machine (SVM). We have achieved recognition accuracies of approximately 98.65% and approximately 100% for hand gestures and facial expressions, respectively. Each of these actions (5 hand gestures and 3 facial expressions) is associated with a command to control the video player (e.g., to select, play, or pause the video). A perceptual quality analysis by user survey rates the experience of the non-tangible human-computer interaction facilitated by the proposed technique as 'good'.
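The final step of the abstract above, associating each recognized action with a player command, can be illustrated with a small dispatch table. This is only a sketch under stated assumptions: the abstract does not name the individual gestures or expressions, so the labels `swipe_right`, `smile` and `open_mouth`, the `MediaPlayer` class and both functions are hypothetical, and the classifier output is taken as a plain string label.

```python
from typing import Callable, Dict


class MediaPlayer:
    """Hypothetical stand-in for a real video player (not from the paper)."""

    def __init__(self) -> None:
        self.state = "stopped"
        self.selected = 0

    def select(self) -> None:
        # Move the selection to the next video in a playlist.
        self.selected += 1

    def play(self) -> None:
        self.state = "playing"

    def pause(self) -> None:
        self.state = "paused"


def build_dispatch(player: MediaPlayer) -> Dict[str, Callable[[], None]]:
    """Map classifier labels to player commands.

    The labels are assumptions chosen for illustration; the abstract only
    states that 5 hand gestures and 3 facial expressions are used.
    """
    return {
        "swipe_right": player.select,
        "smile": player.play,
        "open_mouth": player.pause,
    }


def handle_action(label: str, dispatch: Dict[str, Callable[[], None]]) -> bool:
    """Run the command bound to a recognized label; ignore unknown labels."""
    command = dispatch.get(label)
    if command is None:
        return False
    command()
    return True
```

Keeping the mapping in a dictionary means that a new gesture can be bound to a command without touching the recognition code, and unrecognized labels are dropped rather than raising an error.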
Instance retrieval (IR) is the problem of retrieving specific instances of a particular object, like a monument, from a collection of images. Currently, the most popular methods for IR use Bag of Words (BoW) features for retrieval. However, a prominent problem for IR remains the tendency of BoW-based methods to retrieve near-identical images as the most relevant results. In this paper, we define diversity in IR as variation of physical properties among the most relevant retrieved results for a query image. To achieve this, we propose both an ITML algorithm that re-fashions the BoW feature space into one that appreciates diversity better, and a measure to evaluate diversity in retrieval results for IR applications. Additionally, we generate 200 hand-labeled images from the Paris dataset, for use in further research in this area. Experiments on the popular Paris dataset show that our method outperforms the standard BoW model in many cases.
One of the important requirements for a good object detector is a set of robust visual features. These features, extracted from the reference images containing the desired object instance, are used to identify the objects in the test images. In this paper, we propose a new feature set for object detection, called the Histogram of Radon Projections (HRP). To compute this feature descriptor, the image is first divided into smaller cells; for each cell, the Radon transform values are calculated for different orientations, and weighted votes for each transform coefficient are accumulated into bins. These bin values are block-normalized and collected together to get the final descriptor. We use this descriptor for car detection using gray-scale images and pedestrian detection using RGB images. The performance of this descriptor is compared with that of HOG, and it is found that the new descriptor performs better for both gray-scale and RGB images.
A novel method for face recognition using challenging profile and frontal faces is proposed in this paper. The proposed face recognition system consists of pre-processing, feature extraction and classification components. For pre-processing, the face region is extracted using facial landmark points obtained by the tree-structured part model. During feature extraction, SIFT descriptors are computed from the detected face region, and a Spatial Pyramid Matching approach based on the Locality-constrained Linear Coding technique is employed for feature representation. Finally, a multi-class linear SVM classifier is employed for classification. Extensive experiments have been performed to show that the proposed algorithm performs satisfactorily compared to existing methods on the IITK, CASIA-FACE-V5, LIBOR, ORL and Extended YALE-B face databases.
Lung tumor estimation on imaging modalities is required to assess the extent of the tumor for diagnosis. Segmentation of a tumor in Cone-Beam Computed Tomography (CBCT) images is non-trivial due to imaging artifacts. Here we propose a novel technique for image registration of 18-Fluoro deoxyglucose Positron Emission Tomography (PET) and Computed Tomography (CT) images with CBCT images. The computation is performed in two stages. In the first stage, mutual-information-based rigid image registration is performed to obtain a rough global alignment of the CBCT image with the corresponding PET and CT images. This result is fed to the second stage, which performs deformable image registration between a pair of corresponding CBCT volumes of the same patient, captured at different time instances, using a viscous fluid model. The technique is adapted to both 2D (for slice-wise computation) and 3D space (for computing with volumes), and a comparative performance evaluation is presented with a simulated deformation model.
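The similarity measure behind the first registration stage described above, mutual information, can be computed from the joint histogram of two images. The sketch below is a generic textbook formulation and not the authors' implementation; it assumes both images are already flattened to equal-length sequences of discretized (binned) intensity values, and the function name `mutual_information` is illustrative.

```python
import math
from collections import Counter
from typing import Sequence


def mutual_information(a: Sequence[int], b: Sequence[int]) -> float:
    """Mutual information (in nats) between two flattened, binned images.

    I(A; B) = sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) * p(y))),
    where the probabilities are estimated from intensity co-occurrences.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("images must be non-empty and of equal size")
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram of intensity pairs
    hist_a = Counter(a)          # marginal histogram of image a
    hist_b = Counter(b)          # marginal histogram of image b
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (hist_a[x] * hist_b[y])
        mi += p_xy * math.log(count * n / (hist_a[x] * hist_b[y]))
    return mi
```

A rigid registration loop would evaluate this measure for candidate rotations and translations of one image and keep the transform that maximizes it; the higher the value, the more predictable one image's intensities are from the other's.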
Optic disc (OD) detection is an important step in developing computer-aided screening systems suitable for glaucoma analysis. In this paper, we present a new method for automatic optic disc detection in retinal (fundus) images. The method is based upon the distribution of major blood vessels. The blood vessels originate from the OD, and their random distribution pattern can be approximately divided into two halves by a global symmetry axis passing through the centroid and near the optic disc. We detect this symmetry axis by using the partial Hausdorff distance (PHD) measure. Then, the OD center is detected by applying the brightness property of the optic disc region. The proposed method is evaluated and compared on the DRIVE, STARE and HRF databases. The average performance of the proposed method is 97.5% on DRIVE, 97.5% on STARE and 100% on HRF.
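The partial Hausdorff distance named above replaces the maximum in the classical directed Hausdorff distance with a ranked value, which makes it tolerant of outlier points. The following is a minimal sketch of the standard definition for 2D point sets; the function name and the default `fraction=0.75` are assumptions for illustration, not values taken from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def directed_partial_hausdorff(
    a: Sequence[Point], b: Sequence[Point], fraction: float = 0.75
) -> float:
    """K-th ranked nearest-neighbour distance from point set a to point set b.

    With fraction = 1.0 this is the classical directed Hausdorff distance
    (the largest nearest-neighbour distance); smaller fractions ignore the
    worst-matching points of a.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Distance from every point of a to its closest point in b, ascending.
    nearest = sorted(min(math.dist(p, q) for q in b) for p in a)
    k = max(1, math.ceil(fraction * len(nearest)))
    return nearest[k - 1]
```

One plausible way to use it for symmetry detection, offered here as an assumption rather than the paper's stated procedure, is to mirror the vessel points about a candidate axis and score the axis by the distance between the mirrored and original sets.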
Attribute-based facial image retrieval has a wide range of applications, such as in law enforcement and online social networks. The problem becomes more challenging if the images are from different modalities. For example, the input is a sketch or a composite image, and the task is to retrieve photo images which have the same facial attributes as the input data. In this work, we propose a learning-based approach in which two transformations are learnt to transform the training images from the two modalities, with associated attribute annotations, such that images which have similar attributes move closer to each other and images with very different attributes move farther from each other in the transformed space. A query image is first transformed to the learnt space, in which the images with similar attributes are retrieved. The same framework works seamlessly whether the images to be retrieved are of the same modality as the query or of a different one. The attributes of the query image are also automatically obtained as a byproduct of the algorithm. Extensive experimental evaluation on three datasets shows the effectiveness of the proposed approach.
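The retrieval step described above, projecting the query and the gallery with their respective transformations and ranking by proximity in the shared space, can be sketched as follows. This is only an illustration under assumptions: the transformations are taken to be already-learnt linear maps given as matrices, Euclidean distance is assumed as the ranking criterion, and the names `apply_transform` and `retrieve` are hypothetical. How the transformations are learnt is not shown.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def apply_transform(matrix: Matrix, vector: Vector) -> List[float]:
    """Project a feature vector with a linear map (matrix-vector product)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def retrieve(
    query: Vector,
    gallery: Sequence[Vector],
    w_query: Matrix,
    w_gallery: Matrix,
    top_k: int = 5,
) -> List[int]:
    """Return indices of the top_k gallery items closest to the query.

    w_query projects the query modality (e.g. sketches) and w_gallery the
    gallery modality (e.g. photos) into the shared space; both matrices are
    assumed to have been learnt beforehand.
    """
    q = apply_transform(w_query, query)
    projected = [apply_transform(w_gallery, g) for g in gallery]
    order = sorted(range(len(projected)), key=lambda i: math.dist(q, projected[i]))
    return order[:top_k]
```

For same-modality retrieval the two matrices would simply be identical, which mirrors the abstract's remark that one framework covers both cases.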
Human face anthropometric measurements are used in forensics, orthodontics, face modelling and many other domains, wherein distances between sets of facial landmarks play an important role in making inferences. 3D facial data captured using specialized acquisition methods can be used to reduce the time and tedium involved in computing these measurements. The proposed method is developed to compute fifteen canonical linear measurements between facial landmarks using a Kinect camera. Results obtained from this system are compared with the traditional method of measurement using a digital Vernier caliper. The experimental results indicate that measurements using RGB-D data obtained from Kinect are good enough for a quick preliminary assessment of the subject as compared to the traditional method.
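A linear measurement between two facial landmarks is the Euclidean distance between their 3D positions. The sketch below shows that computation and a comparison against a reference reading; the landmark names, the coordinates, the millimetre unit and both function names are illustrative assumptions, since the abstract does not list the fifteen measurements.

```python
import math
from typing import Dict, Iterable, Tuple

Point3D = Tuple[float, float, float]


def linear_measurements(
    landmarks: Dict[str, Point3D], pairs: Iterable[Tuple[str, str]]
) -> Dict[str, float]:
    """Euclidean distance for each requested pair of named 3D landmarks."""
    return {
        f"{a}-{b}": math.dist(landmarks[a], landmarks[b]) for a, b in pairs
    }


def absolute_errors(
    measured: Dict[str, float], reference: Dict[str, float]
) -> Dict[str, float]:
    """Absolute difference between sensor-based and reference measurements."""
    return {name: abs(measured[name] - reference[name]) for name in measured}


if __name__ == "__main__":
    # Hypothetical landmark positions in millimetres (not data from the paper).
    points = {
        "left_eye_outer": (-45.0, 30.0, 0.0),
        "right_eye_outer": (45.0, 30.0, 0.0),
        "nose_tip": (0.0, 0.0, 25.0),
    }
    wanted = [("left_eye_outer", "right_eye_outer"), ("nose_tip", "left_eye_outer")]
    print(linear_measurements(points, wanted))
```

The `absolute_errors` helper corresponds to the comparison against caliper readings mentioned in the abstract, with the reference values supplied by the caller.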
In this paper, a multi-view stereo image watermarking scheme is proposed to resist the RST (rotation, scaling and translation) attack. To make the scheme resilient to RST, the coefficients of the Singular Value Decomposition (SVD) from both left and right views are used for insertion of the watermark bits. The 2D-DWT (Discrete Wavelet Transform) is used as a preprocessing step to get more correlated SVD coefficients of the left and right views, such that the visual degradation due to embedding can be reduced. In this work, a blind embedding scheme is proposed that alters the selected SVD coefficients to improve the robustness of the embedding scheme. A comprehensive set of experiments has been performed to justify the robustness of the proposed scheme against the RST attack. Moreover, this scheme can be used to detect the view swapping attack using the DIBR technique.
This paper proposes an approach for detection and tracking of multiple objects in a video. We detect multiple objects in the frames using an improved version of the Viola-Jones face detector, extract Speeded Up Robust Features (SURF) from the detected objects and initialize an improved version of the Kanade-Lucas-Tomasi (KLT) tracker to track the objects throughout the video. We use the Gradient Weighted Optical Flow (GWOF) feature to detect both the static and moving objects. The improvement over the KLT tracker is achieved using the GWOF measure, enabling the tracking system to work in videos with camera shake. The proposed object tracking method is capable of dealing with multiple challenges such as illumination changes, variable and uneven backgrounds and poor lighting conditions. The efficacy of the proposed approach is tested on challenging datasets such as ALOV++ and Honda/UCSD and compared with the state of the art.