ISBN: 9781467385640 (print)
We have developed a novel method for image abstraction that preserves more detail in the salient regions of a natural-scene image and removes detail from the non-salient regions. We define a region to be salient based on the saliency measure estimated in that region. We propose to preserve details in salient regions by dividing them into smaller groups of pixels, and to remove details from non-salient regions by dividing them into larger groups of pixels. We achieve this grouping by guiding an over-segmentation algorithm with a spatially varying block size that depends on the saliency measure. The adaptive image abstraction goal is finally achieved using a novel brush, called the point spread brush, which reproduces the action of a brush with a varying spatial spread.
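The idea of a block size that shrinks as saliency grows can be sketched as follows. This is only an illustration: the abstract does not state the actual mapping, so the linear interpolation, the normalization of saliency to [0, 1], the default block sizes, and the function names `block_size_for_saliency` and `block_size_map` are all assumptions.

```python
def block_size_for_saliency(saliency, min_block=4, max_block=32):
    """Map a saliency value in [0, 1] to a block size in pixels.

    High saliency gives small blocks (details preserved);
    low saliency gives large blocks (details removed).
    The linear mapping and default sizes are illustrative assumptions.
    """
    s = min(1.0, max(0.0, saliency))  # clamp to [0, 1]
    return round(max_block - s * (max_block - min_block))


def block_size_map(saliency_map, min_block=4, max_block=32):
    """Apply the mapping to every entry of a 2D saliency map (list of rows)."""
    return [[block_size_for_saliency(s, min_block, max_block) for s in row]
            for row in saliency_map]
```

The resulting per-pixel block sizes could then be handed to whatever over-segmentation algorithm is in use as its local grid spacing.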
In this paper, we consider models of filter banks with variable time-frequency resolution that adapt to signal properties and human perception. The application of the proposed methods to image and audio denoising is demonstrated.
ISBN: 9781467385640 (print)
This paper proposes a novel recommendation engine that suggests coordinated outfits whose garments complement each other. The proposed recommendation model encodes the subjective knowledge of clothing experts in the Multimedia Web Ontology Language (MOWL) and uses an evidential and causal reasoning scheme to deal with the media properties of concepts. Our approach automatically identifies the user's visual personality and interprets the contextual meaning of the media features of the garments in the context of the input query image. As a result, personalized complementary garments are recommended to the user based on the occasion of wear. We have validated our approach against the garment preferences of various models, using a large collection of shirts and trousers collected from various websites.
ISBN: 9781467385640 (print)
Automatic image annotation is the computer vision task of assigning a set of appropriate textual tags to a novel image. The aim is to eventually bridge the semantic gap between visual and textual representations with the help of these tags. This also has applications in designing scalable image retrieval systems and providing multilingual interfaces. Though a wide variety of powerful machine learning algorithms have been explored for the image annotation problem in the recent past, nearest neighbor techniques still yield superior results. A challenge facing present-day annotation schemes is the lack of sufficient training data. In this paper, an active learning based image annotation model is proposed. We leverage image-to-image and image-to-tag similarities to decide the best set of tags describing the semantics of an image. The advantages of the proposed model include: (a) it can output a variable number of tags per image, which improves accuracy; (b) it effectively selects the difficult samples that need to be manually annotated, thereby reducing the human annotation effort. Studies on the Corel and IAPR TC-12 datasets validate the effectiveness of this model.
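A variable-length tag output from nearest neighbors can be sketched as below. This is not the model from the paper, whose details the abstract does not give: the inverse-distance weighting, the vote-share threshold `min_score`, the feature vectors, and the function name `predict_tags` are all assumptions made for illustration, and the active learning step is not covered.

```python
from collections import defaultdict


def predict_tags(query, training, k=3, min_score=0.5):
    """Return the tags supported by the k nearest training images.

    training: list of (feature_vector, tags) pairs.
    A tag is kept when its similarity-weighted vote share among the
    k nearest neighbours is at least min_score, so the number of
    returned tags varies per image. All parameters are illustrative.
    """
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    nearest = sorted(training, key=lambda item: sq_dist(query, item[0]))[:k]
    votes = defaultdict(float)
    total = 0.0
    for features, tags in nearest:
        weight = 1.0 / (1.0 + sq_dist(query, features))  # closer = stronger vote
        total += weight
        for tag in tags:
            votes[tag] += weight
    return sorted(tag for tag, v in votes.items() if v / total >= min_score)
```

Raising `min_score` returns fewer, more strongly supported tags; lowering it returns more.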
ISBN: 9781467385640 (print)
This paper describes a sparse representation based approach to learning a classifier for assessing video quality without a reference. First, we calculate natural scene statistics (NSS) based spatial features of each frame/image, and then learn a dictionary with the K-SVD algorithm from the NSS features of correct frames. We observe that a correct frame can be represented precisely in terms of the dictionary atoms, whereas for a distorted frame the representation error increases drastically as the distortion increases. We can therefore classify frames as correct or distorted based on the error score calculated by the sparse representation framework. This framework has been validated on two datasets, and we observe improved accuracy compared to state-of-the-art algorithms.
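The error-score classification can be sketched as follows, assuming the dictionary has already been learned. The sketch uses plain greedy matching pursuit as a stand-in for the sparse coding step, which the abstract does not name, and it assumes unit-norm atoms; the function names, the sparsity level `n_nonzero`, and the decision threshold are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reconstruction_error(x, atoms, n_nonzero=2):
    """Residual norm after greedy matching pursuit with n_nonzero steps.

    atoms: list of unit-norm dictionary atoms (assumed already learned).
    """
    residual = list(x)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        best = max(atoms, key=lambda a: abs(dot(residual, a)))
        coeff = dot(residual, best)
        residual = [r - coeff * a for r, a in zip(residual, best)]
    return math.sqrt(dot(residual, residual))


def is_distorted(x, atoms, threshold, n_nonzero=2):
    """Flag a feature vector whose error score exceeds the threshold."""
    return reconstruction_error(x, atoms, n_nonzero) > threshold
```

A feature vector that lies in the span of the atoms gets an error near zero, while one with a component outside that span keeps a large residual and is flagged.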
ISBN: 9781479915880 (print)
We present an improved mesh denoising method based on 3D geometric bilateral filtering. Its novelty is that it preserves the details of the object while reducing the noise effectively. The previous approach of geometric bilateral filtering for 3D-scan points has the limitation that it reduces the point density, thereby losing details present in the object. Our approach, in contrast, works on the surface mesh obtained by triangulating the 3D-scan points, without any data downsampling. Each vertex of the mesh is repositioned based on the estimated centroid of the vertices in its local neighborhood and a Gaussian weight function. Experimental results demonstrate its strength, efficiency, and robustness.
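The vertex repositioning step can be illustrated with a simplified smoothing pass. The abstract does not give the update rule, so this sketch is a stand-in rather than the bilateral filter itself: it uses only a spatial Gaussian weight and moves each vertex to the weighted centroid of itself and its 1-ring neighbors. The adjacency format, `sigma`, and the function name `denoise_vertices` are assumptions.

```python
import math


def denoise_vertices(vertices, neighbors, sigma=1.0):
    """One smoothing pass over a mesh, keeping every vertex (no downsampling).

    vertices:  list of (x, y, z) tuples.
    neighbors: neighbors[i] lists the indices adjacent to vertex i.
    Each vertex moves to the Gaussian-weighted centroid of itself and
    its neighbours; the weight decays with squared distance.
    """
    result = []
    for i, v in enumerate(vertices):
        total_w = 1.0          # the vertex itself: distance 0, weight 1
        acc = list(v)
        for j in neighbors[i]:
            u = vertices[j]
            d2 = sum((a - b) ** 2 for a, b in zip(u, v))
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            total_w += w
            acc = [s + w * c for s, c in zip(acc, u)]
        result.append(tuple(s / total_w for s in acc))
    return result
```

For example, a vertex that sticks out of an otherwise flat neighborhood is pulled back toward it, while the vertex count and connectivity are unchanged.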
ISBN: 9781479915880 (print)
In many common applications of Microsoft Kinect (TM), including navigation, surveillance, 3D reconstruction, and the like, it is necessary to estimate the geometry of mirrors or other reflecting surfaces in the field of view. This is often difficult because, in most positions, a mirror does not support diffuse reflection of speckles and hence cannot be seen in the Kinect depth map; a mirror shows up as unknown depth. However, suitably placed objects reflected in the mirror can provide important clues to the orientation and distance of the mirror. In this paper, we present a method that uses a ball and its mirror image to set up point-to-point correspondences between object and image points and solve for the geometry of the mirror. With this, simple estimators are designed for the orientation and distance of a plane vertical mirror with respect to the Kinect camera. In addition, an estimator is presented for the diameter of the ball. The estimators are validated through a set of experiments.
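The underlying geometry of a point and its mirror image can be sketched as below. For a plane mirror, the segment joining a point to its virtual image is perpendicular to the mirror and is bisected by it. The sketch assumes the 3D coordinates of the ball centre and of its virtual image are both already known in the camera frame; how the paper obtains them, and its actual estimators, are not given in the abstract. The function name `mirror_plane` is hypothetical.

```python
import math


def mirror_plane(p, p_img):
    """Return (n, d) for the plane n.x = d that reflects p onto p_img.

    n is the unit normal (pointing from p towards p_img) and abs(d) is
    the distance of the plane from the origin, taken here to be the
    camera centre.
    """
    diff = [b - a for a, b in zip(p, p_img)]
    length = math.sqrt(sum(c * c for c in diff))
    if length == 0.0:
        raise ValueError("point and image coincide; plane is undetermined")
    n = [c / length for c in diff]
    mid = [(a + b) / 2.0 for a, b in zip(p, p_img)]  # lies on the mirror
    d = sum(nc * mc for nc, mc in zip(n, mid))
    return n, d
```

With several object-image pairs, the individual normals and offsets could be averaged to reduce the effect of measurement noise.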
ISBN: 9781479915880 (print)
Text recognition in natural scenes and video is challenging compared to recognition in scanned document images. This is due to text appearing on different sources in various styles, font variations, font size variations, background variations, etc. There are approaches for word segmentation from video and scene images that feed the word image into OCR engines. Nevertheless, such methods often fail to yield satisfactory recognition results. Therefore, in this paper, we propose to combine a Hidden Markov Model (HMM) and a Convolutional Neural Network (CNN) to achieve a good recognition rate. Sequential gradient features with the HMM help to find the character alignment of a word. The character alignments are then verified by the CNN. The approach is tested on both video and scene data to show its effectiveness, and the results are encouraging.
ISBN: 9781467385640 (print)
Segmentation of cell nuclei in PAP-smear cervical images is of preeminent importance in computer-aided diagnostic screening for cervical cancer. This paper proposes a novel nuclei segmentation approach that builds upon the mean-shift method. The mean-shift method is applied to cell images that have first undergone decorrelation-stretch contrast enhancement. The results of the mean-shift based approach are refined further using morphological operations. We have validated the segmentation results on a dataset of 900 images with given ground truth. We demonstrate that our simple and efficient approach yields a high validation rate on a large image dataset. In addition, we show encouraging visual results on another set of more complex real images.
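The mean-shift step can be illustrated on one-dimensional intensity values. This is a deliberately reduced sketch: it uses a flat kernel on scalar values only, whereas the paper applies mean-shift to contrast-enhanced images and follows it with morphological refinement, neither of which is shown. The bandwidth, the sample values, and the function name `mean_shift_1d` are assumptions.

```python
def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-6):
    """Move each value to the mode of its neighbourhood (flat kernel).

    Values that converge to the same mode belong to the same cluster,
    for example dark nucleus pixels versus brighter background pixels.
    """
    modes = []
    for v in values:
        m = float(v)
        for _ in range(max_iter):
            window = [x for x in values if abs(x - m) <= bandwidth]
            if not window:          # guard against an empty neighbourhood
                break
            new_m = sum(window) / len(window)
            if abs(new_m - m) < tol:
                break
            m = new_m
        modes.append(m)
    return modes
```

Applied to `[10, 12, 11, 200, 202, 201]` with a bandwidth of 5, the first three values converge to one mode and the last three to another, giving two clusters.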
ISBN: 9781467385640 (print)
Accurate detection of the optic disk and macula is of interest in the automated analysis of retinal images, as they are landmarks in the retina and their detection aids in assessing the severity of diseases based on the locations of abnormalities relative to these landmarks. The general strategy is to design separate methods for these landmarks. In contrast, we propose in this paper a novel and unified approach for optic disk and macula detection using the Generalized Motion Pattern (GMP) [10] [19], which is derived by inducing motion in an image to smooth out unwanted information. The proposed method is unsupervised, parallelizable, and handles illumination differences efficiently, but assumes a fixed protocol in image acquisition. The proposed method has been tested on five public datasets, and the results indicate performance comparable to supervised approaches for the same problem.