ISBN: 9781467385640 (print)
This paper proposes a novel recommendation engine that suggests coordinated outfits whose garments complement each other. The proposed recommendation model encodes the subjective knowledge of clothing experts in the Multimedia Web Ontology Language (MOWL) and uses an evidential and causal reasoning scheme to deal with the media properties of concepts. Our approach automatically identifies the user's visual personality and interprets the contextual meaning of the media features of the garments in the context of the input query image. As a result, personalized complementary garments based on the occasion of wear are recommended to the user. We have validated our approach against the garment preferences of various models, using a large collection of shirts and trousers collected from various websites.
Automatic image annotation is the computer vision task of assigning a set of appropriate textual tags to a novel image. The aim is to eventually bridge the semantic gap between visual and textual representations with the help of these tags. It also has applications in designing scalable image retrieval systems and providing multilingual interfaces. Although a wide variety of powerful machine learning algorithms have been explored for the image annotation problem in the recent past, nearest neighbor techniques still yield superior results. A challenge facing present-day annotation schemes is the lack of sufficient training data. In this paper, an active learning based image annotation model is proposed. We leverage image-to-image and image-to-tag similarities to decide the best set of tags describing the semantics of an image. The advantages of the proposed model include: (a) it is able to output a variable number of tags per image, which improves accuracy; (b) it is able to effectively choose the difficult samples that need to be manually annotated, thereby reducing the human annotation effort. Studies on the Corel and IAPR TC-12 datasets validate the effectiveness of this model.
In this paper we propose a novel real-time video saliency detection algorithm based on disorder in the motion field. The proposed algorithm operates on the basic premise that higher disorder corresponds to higher information in the scene. Based on the quantified value of the disorder, salient areas in the video frame are demarcated. To achieve real-time operational capability, the algorithm operates in the compressed H.264 domain rather than in the pixel domain. The proposed algorithm has been evaluated on standard video sequences, and results on real-time video surveillance data are also presented.
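One way to picture "disorder in the motion field" is sketched below. The abstract does not say how disorder is quantified, so the sketch assumes a measure chosen purely for illustration: the Shannon entropy of the direction histogram of the motion vectors in a block. The function names, the 8-bin histogram, and the threshold are all hypothetical, and the motion vectors are assumed to have already been read from the compressed stream.

```python
import math


def orientation_entropy(vectors, n_bins=8):
    """Shannon entropy (in bits) of the direction histogram of nonzero motion vectors.

    Assumption: entropy stands in for the paper's unspecified disorder measure.
    """
    counts = [0] * n_bins
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # a zero vector has no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Clamp guards against angle rounding up to exactly 2*pi.
        index = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
        counts[index] += 1
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log2(c / total) for c in counts if c)


def salient_blocks(blocks, threshold):
    """Return indices of blocks whose motion-direction entropy exceeds the threshold."""
    return [i for i, vectors in enumerate(blocks)
            if orientation_entropy(vectors) > threshold]


# Hypothetical blocks: one with coherent motion, one with scattered motion.
coherent = [(1, 0), (2, 0), (1, 0), (3, 0)]
scattered = [(1, 0), (0, 1), (-1, 0), (0, -1)]
print(salient_blocks([coherent, scattered], threshold=1.0))
```

Under this assumption, a block in which all vectors point the same way scores zero, while a block whose vectors point in many directions scores high and is marked salient.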
This paper describes a sparse representation based approach to learning a classifier for assessing video quality without a reference. First, we calculate natural scene statistics (NSS) based spatial features of each frame or image, and then learn a dictionary with the K-SVD algorithm from the NSS features of correct frames. We observe that a correct frame can be represented precisely in terms of dictionary atoms, whereas for a distorted frame the representation error increases drastically as the distortion increases. Frames can therefore be classified as correct or distorted based on the error score calculated by the sparse representation framework. The framework has been validated on two datasets, and we observe improved accuracy compared with state-of-the-art algorithms.
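The classification rule described above (small reconstruction error for correct frames, large error for distorted ones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain matching pursuit is swapped in for whatever sparse coder the paper uses, the dictionary is a tiny hand-written one rather than one learned by K-SVD, and the feature vectors, the threshold, and all function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def residual_norm(signal, atoms, n_nonzero=2):
    """Run greedy matching pursuit and return the norm of the final residual.

    Assumption: every atom has unit Euclidean norm.
    """
    residual = list(signal)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with what is still unexplained.
        best = max(atoms, key=lambda atom: abs(dot(residual, atom)))
        coeff = dot(residual, best)
        residual = [r - coeff * a for r, a in zip(residual, best)]
    return math.sqrt(dot(residual, residual))


def is_distorted(features, atoms, threshold, n_nonzero=2):
    """Flag a frame as distorted when its sparse reconstruction error is large."""
    return residual_norm(features, atoms, n_nonzero) > threshold


# Hypothetical dictionary spanning the first two coordinates only.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(is_distorted([3.0, 4.0, 0.0], atoms, threshold=1.0))  # lies in the span of the atoms
print(is_distorted([3.0, 4.0, 5.0], atoms, threshold=1.0))  # has a component the atoms miss
```

The threshold would in practice have to be chosen from validation data; the value here is arbitrary.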
Human authentication can now be seen as a crucial social problem. In this paper, a highly reliable multimodal authentication system is presented that fuses iris, finger-knuckle-print, and palmprint image matching scores. Segmented ROIs are preprocessed using DCP (Differential Code Pattern) to obtain robust corner features. They are then matched using the GOF (Global Optical Flow) based dissimilarity measure. The proposed system has been tested on the public CASIA Interval and Lamp iris, PolyU finger-knuckle-print, and PolyU and CASIA palmprint databases. The proposed system has shown good performance over all unimodal databases, while over multimodal databases (fusion of all three) it has shown perfect performance (i.e., CRR = 100% with EER = 0%).
The recent era of digitization is expected to digitize many old, important documents that are degraded for various reasons. Degraded document image binarization faces many challenges, such as intensity variation, background contrast variation, bleed-through, and text size variation. Many approaches are available for document image binarization, but none can handle all types of degradation at once. We propose an approach consisting of three stages: preprocessing, text-area detection, and post-processing. Preprocessing enhances the contrast of the image. The next stage identifies the text area. Post-processing takes care of false positives and false negatives based on the intensity values of the preprocessed and gray images. Performance is evaluated using various quantitative measures and is compared with the method regarded as the best so far. The algorithm is also expected to be independent of the script, and is therefore tested on degraded Gujarati document images.
Segmentation of cell nuclei in PAP-smear cervical images is of preeminent importance in computer-aided diagnostic screening for cervical cancer. This paper proposes a novel nuclei segmentation approach that builds upon the mean-shift method. The mean-shift method is applied to the cell images after they undergo a decorrelation-stretch contrast enhancement. The results of the mean-shift based approach are further refined using morphological operations. We have validated the segmentation results on a dataset of 900 images with ground truth. We demonstrate that our simple and efficient approach yields a high validation rate on a large image dataset. In addition, we show encouraging visual results on another set of more complex real images.
In this paper, a fractional order total variation (TV) model is presented for estimating the optical flow in image sequences. The proposed fractional order model is introduced by generalizing a variational flow model formed with a quadratic term and a total variation term. However, this generalized model is difficult to solve because of the non-differentiability of the total variation regularization term. The Grünwald-Letnikov derivative is used to discretize the fractional order derivative. The resulting formulation is solved using an efficient numerical algorithm. The experimental results verify that the proposed model yields a dense flow and preserves discontinuities in the flow field. Moreover, it provides significant robustness against outliers.
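For readers unfamiliar with the Grünwald-Letnikov derivative, the sketch below shows its standard one-dimensional discretization: a backward weighted sum with weights w_k = (-1)^k * binom(alpha, k), computed by the recurrence w_0 = 1, w_k = w_(k-1) * (1 - (alpha + 1) / k). It is only an illustration of the general formula. How the paper applies it to two-dimensional flow fields is not stated in the abstract, and the function names and sample data are hypothetical.

```python
def gl_weights(alpha, n_terms):
    """Return the first n_terms Grünwald-Letnikov weights (-1)^k * binom(alpha, k)."""
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (1.0 - (alpha + 1.0) / k))
    return weights


def gl_derivative(samples, alpha, h=1.0):
    """Approximate the order-alpha derivative at each sample point.

    Uses a backward sum over all earlier samples, spaced h apart.
    """
    w = gl_weights(alpha, len(samples))
    return [
        sum(w[k] * samples[i - k] for k in range(i + 1)) / h ** alpha
        for i in range(len(samples))
    ]


# With alpha = 1 the weights reduce to (1, -1, 0, ...), i.e. a backward difference.
print(gl_weights(1.0, 4))
print(gl_derivative([0.0, 1.0, 4.0, 9.0], alpha=1.0))
# A non-integer order keeps a long tail of small weights (memory of past samples).
print(gl_weights(0.5, 4))
```

For integer orders the weights vanish after a few terms, which recovers the usual finite differences; for fractional orders every earlier sample contributes, which is why practical schemes often truncate the sum.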
Analyzing a very long video and semantically describing its contents is a challenging task in computer vision. Present approaches such as video shot detection and summarization address this problem partially while maintaining temporal coherency. To spare the user the effort of watching the whole video, we introduce a new technique that combines similar content irrespective of the time instants at which it appears. In this approach, we automatically identify only the representative frames corresponding to similar scenes captured at different instants of time. We also provide the labels of the objects present in the representative frames, along with a compact representation of the video. We achieve the semantic labelling of frames in a unified deep learning framework that uses pre-trained features from a convolutional neural network. We show that the proposed approach addresses semantic labelling effectively, as justified by the results obtained for videos of different scenes captured through different modalities.
We use the RGB-D technology of Kinect to control an application with hand gestures, with PowerPoint as the test application. The system can start and end a presentation, navigate between slides, capture or release control of the cursor, and control the cursor through natural gestures. Such a system is useful and hygienic in kitchens, lavatories, hospital ICUs for touch-less surgery, and the like. The challenge is to extract meaningful gestures from continuous hand motions. We propose a system that recognizes isolated gestures from continuous hand motions for multiple gestures in real time. Experimental results show that the system has 96.48% precision (at 96.00% recall) and performs better than the Microsoft Gesture Recognition library for swipe gestures.