ISBN: 9781467385640 (print)
Attribute-based facial image retrieval has a wide range of applications, for example in law enforcement and online social networks. The problem becomes more challenging when the images come from different modalities: for example, the input is a sketch or a composite image, and the task is to retrieve photographs that have the same facial attributes as the input. In this work, we propose a learning-based approach in which two transformations are learnt to map the training images from the two modalities, with their associated attribute annotations, into a space where images with similar attributes move closer to each other and images with very different attributes move farther apart. A query image is first transformed to the learnt space, in which the images with similar attributes are then retrieved. The same framework works seamlessly whether the images to be retrieved are of the same modality as the query or of a different one. The attributes of the query image are also obtained automatically as a byproduct of the algorithm. Extensive experimental evaluation on three datasets shows the effectiveness of the proposed approach.
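The retrieval step described above can be sketched as a nearest-neighbour search in the shared space. This is only an illustration: the abstract does not say what form the two transformations take, so the linear maps `w_query` and `w_gallery`, the Euclidean distance, and all function names here are assumptions, and the learning of the maps is not shown.

```python
import math

def transform(matrix, vec):
    """Apply a linear map (given as a list of rows) to a feature vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def retrieve(query, gallery, w_query, w_gallery, top_k=3):
    """Return indices of the gallery items closest to the query in the shared space.

    w_query and w_gallery stand in for the two learnt transformations
    (assumed linear here); they may map inputs of different dimensionality.
    """
    q = transform(w_query, query)
    scored = [(math.dist(q, transform(w_gallery, g)), idx)
              for idx, g in enumerate(gallery)]
    scored.sort()
    return [idx for _, idx in scored[:top_k]]
```

Because each modality has its own map, the same `retrieve` call covers both the same-modality and the cross-modality case; only the matrices passed in change.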
Optic disc (OD) detection is an important step in developing computer-aided screening systems for glaucoma analysis. In this paper, we present a new method for automatic optic disc detection in retinal (fundus) images. The method is based on the distribution of the major blood vessels. The blood vessels originate from the OD, and their random distribution pattern can be approximately divided into two halves by a global symmetry axis that passes through the centroid and near the optic disc. We detect this symmetry axis using the partial Hausdorff distance (PHD) measure. The OD center is then detected by applying the brightness property of the optic disc region. The proposed method is evaluated and compared on the DRIVE, STARE and HRF databases. Its average performance is 97.5% on DRIVE, 97.5% on STARE and 100% on HRF.
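To make the partial Hausdorff distance concrete, the sketch below computes the directed PHD between two point sets and uses it to score how well a candidate axis mirrors a set of vessel points onto itself. The scoring function `asymmetry`, the `fraction` parameter and the axis parameterisation (a line through `centre` at angle `theta`) are assumptions for illustration; the abstract does not specify how the authors search for the axis.

```python
import math

def directed_partial_hausdorff(a, b, fraction=0.8):
    """K-th smallest nearest-neighbour distance from points in a to the set b.

    fraction=1.0 gives the ordinary directed Hausdorff distance; smaller
    values ignore the worst-matching points, which adds robustness to outliers.
    """
    nearest = sorted(min(math.dist(p, q) for q in b) for p in a)
    k = min(len(nearest), max(1, math.ceil(fraction * len(nearest))))
    return nearest[k - 1]

def reflect(points, centre, theta):
    """Mirror 2D points across the line through centre at angle theta (radians)."""
    cx, cy = centre
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return [(cx + (x - cx) * c2 + (y - cy) * s2,
             cy + (x - cx) * s2 - (y - cy) * c2) for x, y in points]

def asymmetry(points, centre, theta, fraction=0.8):
    """Lower is better: PHD between the mirrored points and the original points."""
    return directed_partial_hausdorff(reflect(points, centre, theta), points, fraction)
```

A candidate axis could then be chosen by evaluating `asymmetry` over a range of `theta` values and keeping the minimum.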
In this paper, a multi-view stereo image watermarking scheme is proposed to resist RST (rotation, scaling and translation) attacks. To make the scheme resilient to RST, the Singular Value Decomposition (SVD) coefficients of both the left and right views are used for inserting the watermark bits. The 2D-DWT (discrete wavelet transform) is used as a preprocessing step to obtain more strongly correlated SVD coefficients for the left and right views, so that the visual degradation due to embedding can be reduced. A blind embedding scheme is proposed that alters the selected SVD coefficients to improve the robustness of the embedding. A comprehensive set of experiments has been performed to demonstrate the robustness of the proposed scheme against RST attacks. Moreover, the scheme can be used to detect the view-swapping attack using the DIBR technique.
Human face anthropometric measurements are used in forensics, orthodontics, face modelling and many other domains, in which the distances between sets of facial landmarks play an important role in making inferences. 3D facial data captured using specialized acquisition methods can reduce the time and tedium involved in computing these measurements. The proposed method computes fifteen canonical linear measurements between facial landmarks using a Kinect camera. The results obtained from this system are compared with the traditional method of measurement using a digital Vernier caliper. The experimental results indicate that measurements from the RGB-D data obtained with the Kinect are good enough for a quick preliminary assessment of the subject when compared with the traditional method.
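A linear measurement between two landmarks is simply the Euclidean distance between their 3D coordinates. The sketch below assumes the landmarks are already available as (x, y, z) positions in millimetres; the landmark names and the helper functions are hypothetical, and landmark detection on the Kinect data is not shown.

```python
import math

def linear_measurements(landmarks, pairs):
    """Euclidean distance for each requested pair of named 3D landmarks."""
    return {(a, b): math.dist(landmarks[a], landmarks[b]) for a, b in pairs}

def absolute_errors(kinect_mm, caliper_mm):
    """Per-measurement absolute difference between sensor and caliper values."""
    return {pair: abs(kinect_mm[pair] - caliper_mm[pair]) for pair in caliper_mm}
```

Comparing the two dictionaries pair by pair mirrors the sensor-versus-caliper comparison the abstract describes.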
Handwritten character recognition has considerable potential in the field of document image processing. It is an important component of systems such as handwritten optical character recognizers, writer identification/verification systems and automatic document sorters. For Bangla, only a few attempts have been made at character recognition. In this study, a relatively new attempt is made to find the dependency of character recognition on writer information by varying the inputs. The study provides a better understanding of the input data for character recognition, and it also helps in understanding Bangla characters better for writer identification/verification. The highest accuracy, 100%, is achieved for the numeral 7 using the LibSVM classifier.
This paper proposes an approach for the detection and tracking of multiple objects in a video. We detect multiple objects in the frames using an improved version of the Viola-Jones face detector, extract Speeded Up Robust Features (SURF) from the detected objects, and initialize an improved version of the Kanade-Lucas-Tomasi (KLT) tracker to track the objects throughout the video. We use the Gradient Weighted Optical Flow (GWOF) feature to detect both static and moving objects. The improvement over the KLT tracker is made using the GWOF measure, which enables the tracking system to work on videos with camera shake. The proposed tracking method can deal with multiple challenges such as illumination changes, variable and uneven backgrounds, and poor lighting conditions. The efficacy of the proposed approach is tested on challenging datasets such as ALOV++ and Honda/UCSD and compared with the state of the art.
In this paper, an uncompressed-domain video watermarking scheme resilient to temporal adaptation is proposed for scalable video coding. In the proposed scheme, each temporal layer is embedded separately with a different watermark, generated by DCT-domain decomposition of a single watermark image. A zigzag sequence of the block-wise DCT coefficients of the watermark image is partitioned into non-overlapping sets, and each set is embedded into a different temporal layer. The base layer is embedded with the first set of DCT coefficients (which includes the DC coefficient of each block), and successive layers are embedded with successive non-overlapping coefficient sets. The coefficients of each set are chosen so that a uniform energy distribution across all temporal layers can be maintained. Experimental results show that the proposed scheme is robust against temporal scalability and that the robustness of the watermark increases with the addition of successive enhancement layers.
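The zigzag ordering and the split into non-overlapping sets can be sketched as follows. The zigzag order is the usual JPEG-style scan of a square block; the `partition` helper simply cuts the sequence into contiguous, near-equal sets and is an assumption, since the abstract does not give the rule the authors use to balance energy across layers.

```python
def zigzag_indices(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even anti-diagonals are traversed bottom-left to top-right,
        # odd ones top-right to bottom-left.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def partition(sequence, layers):
    """Split a sequence into contiguous, non-overlapping, near-equal sets."""
    base, extra = divmod(len(sequence), layers)
    sets, start = [], 0
    for i in range(layers):
        end = start + base + (1 if i < extra else 0)
        sets.append(sequence[start:end])
        start = end
    return sets
```

With this split, the first set always begins at position (0, 0), so the DC coefficient of each block lands in the base layer, as in the scheme described above.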
Conventionally, High Dynamic Range (HDR) images are generated by fusing multiple-exposure Low Dynamic Range (LDR) images, and the HDR output often suffers from artifacts caused by camera misalignment and the presence of dynamic objects in the scene. An efficient way to overcome these issues is single-shot HDR imaging. In this paper, we propose a method for generating an HDR image from a single LDR image. We first generate multiple exposures of the given scene using histogram separation with varying bin sizes. The resulting LDR images are then fused using quality measures such as contrast, saturation and well-exposedness. The results show the effectiveness of the proposed approach, which is verified qualitatively and in terms of various quantitative measures.
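As a rough illustration of quality-weighted fusion, the sketch below blends grayscale exposures per pixel using only a well-exposedness weight. The Gaussian weight centred at 0.5 with `sigma=0.2` is borrowed from common exposure-fusion practice and is an assumption here, as is the plain per-pixel averaging; contrast and saturation weights, and the histogram-separation step that produces the exposures, are not shown.

```python
import math

def well_exposedness(value, sigma=0.2):
    """Weight in (0, 1] that peaks for mid-range intensities (value in [0, 1])."""
    return math.exp(-((value - 0.5) ** 2) / (2 * sigma ** 2))

def fuse(exposures, sigma=0.2):
    """Per-pixel weighted average of equally sized grayscale images.

    Each image is a list of rows of floats in [0, 1].
    """
    height, width = len(exposures[0]), len(exposures[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            values = [img[r][c] for img in exposures]
            weights = [well_exposedness(v, sigma) for v in values]
            fused[r][c] = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return fused
```

Pixels that are close to mid-grey in one exposure dominate the result at that location, while under- or over-exposed values contribute little.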
This paper proposes a new algorithm for the restoration of grayscale images corrupted by salt-and-pepper noise (SPN). The proposed algorithm identifies a pixel as noisy if its intensity value is 0 or 255 and processes it using the pixels in a 3 x 3 window. If the window contains both noisy and non-noisy pixels, the pixel being processed is replaced with the trimmed median value of the non-noisy pixels. If the window contains only noisy pixels, the pixel is replaced with their mean value. The proposed method uses already processed (i.e. de-noised) pixels in the window while processing the noisy pixels and shows significantly better performance, particularly at high noise density, than various methods reported in the literature. Experimental results show improvements both visually and quantitatively over other reported methods.
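The filtering rule above can be sketched in a few lines. This is a minimal reading of the abstract, not the authors' implementation: the raster scan order, the clipping of the window at image borders and the rounding of the result are assumptions.

```python
from statistics import median

NOISY = (0, 255)

def denoise_spn(image):
    """Restore a grayscale image (list of rows of ints in 0..255).

    Pixels are updated in place in the working copy, so windows of later
    pixels see the already de-noised values of earlier ones.
    """
    out = [row[:] for row in image]
    height, width = len(out), len(out[0])
    for r in range(height):
        for c in range(width):
            if out[r][c] not in NOISY:
                continue  # treated as noise-free, left unchanged
            # 3 x 3 window, clipped at the image borders (assumption).
            window = [out[i][j]
                      for i in range(max(0, r - 1), min(height, r + 2))
                      for j in range(max(0, c - 1), min(width, c + 2))]
            clean = [v for v in window if v not in NOISY]
            if clean:
                # Trimmed median: the 0/255 values are discarded first.
                out[r][c] = round(median(clean))
            else:
                # Only noisy values available: fall back to their mean.
                out[r][c] = round(sum(window) / len(window))
    return out
```

Reusing processed pixels is what lets the filter keep working at high noise densities, where few windows contain any original noise-free value.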
Purely data-driven large-scale image classification has been achieved using various feature descriptors such as SIFT and HOG. A major milestone in this regard is Convolutional Neural Network (CNN) based methods, which learn optimal feature descriptors as filters. Little attention has been given to the use of domain knowledge. Ontology plays an important role in learning to categorize images into abstract classes where there may not be a clear visual connection between category and image, for example identifying image mood: happy, sad or neutral. Our algorithm combines a CNN and ontology priors to infer abstract patterns in images of Indian monuments. We use a transfer-learning-based approach in which domain knowledge is transferred to the CNN during training (top-down transfer) and inference is made using the CNN prediction and the ontology tree/priors (bottom-up transfer). We classify images into categories such as Tomb, Fort and Mosque. We demonstrate that our method improves remarkably over a logistic classifier and other transfer learning approaches. We conclude with a remark on possible applications of the model and a note on scaling it to a larger ontology.