ISBN: 9781467385640 (print)
Analyzing a very long video and semantically describing its contents is a challenging task in computer vision. Present approaches such as video shot detection and summarization address this problem only partially, while maintaining temporal coherency. To reduce the effort a user spends watching the whole video, we introduce a new technique that combines similar content irrespective of the time instants at which it appears. In this approach, we automatically identify only the representative frames corresponding to similar scenes that were captured at different instants of time. We also provide labels for the objects present in the representative frames, along with a compact representation of the video. We achieve semantic labelling of frames in a unified deep learning framework that uses pre-trained features from a convolutional neural network. We show that the proposed approach addresses semantic labelling effectively, as justified by the results obtained for videos of different scenes captured through different modalities.
ISBN: 9781467385640 (print)
We use the RGB-D technology of Kinect to control an application with hand gestures, with PowerPoint as the test application. The system can start or end a presentation, navigate between slides, capture or release control of the cursor, and control it through natural gestures. Such a system is useful and hygienic in kitchens, lavatories, hospital ICUs for touch-less surgery, and the like. The challenge is to extract meaningful gestures from continuous hand motions. We propose a system that recognizes isolated gestures from continuous hand motions for multiple gestures in real time. Experimental results show that the system has 96.48% precision (at 96.00% recall) and performs better than the Microsoft Gesture Recognition library for swipe gestures.
ISBN: 9789897584022 (print)
When photos are taken in dim-light environments, little light reaches the sensor, so the captured images are usually extremely dark, very noisy, and unable to reflect real-world color. Under these conditions, traditional single-image denoising methods fail to be effective. One common idea is to take multiple frames of the same scene to enhance the signal-to-noise ratio. This paper proposes a recurrent fully convolutional network (RFCN) to process burst photos taken under extremely low-light conditions and to obtain denoised images with improved brightness. Our model maps raw burst images directly to sRGB outputs, either to produce a single best image or to generate a multi-frame denoised image sequence. This process has proven capable of accomplishing the low-level task of denoising as well as the high-level tasks of color correction and enhancement, all processed end-to-end through our network. Our method has achieved better results than state-of-the-art methods. In addition, we have applied a model trained on one type of camera, without fine-tuning, to photos captured by different cameras and have obtained similar end-to-end enhancements.
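The multi-frame idea mentioned above, taking several frames of the same scene to raise the signal-to-noise ratio, can be illustrated with plain frame averaging. This is only a sketch of that baseline idea, not the recurrent network the paper proposes. The constant synthetic scene, the noise level, the burst size of 16, and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def noisy_frame(signal, sigma, rng):
    """Simulate one frame: the clean signal plus zero-mean Gaussian noise."""
    return [s + rng.gauss(0.0, sigma) for s in signal]


def average_frames(frames):
    """Average a burst of already aligned frames pixel by pixel."""
    n = len(frames)
    return [sum(pixels) / n for pixels in zip(*frames)]


def residual_std(frame, signal):
    """Standard deviation of the noise left in a frame."""
    return statistics.pstdev([f - s for f, s in zip(frame, signal)])


rng = random.Random(0)            # fixed seed so the run is repeatable
signal = [0.05] * 10_000          # hypothetical dim, constant scene
sigma = 0.02                      # hypothetical per-frame noise level
burst = [noisy_frame(signal, sigma, rng) for _ in range(16)]

single_std = residual_std(burst[0], signal)
burst_std = residual_std(average_frames(burst), signal)
print(f"single frame: {single_std:.4f}, 16-frame average: {burst_std:.4f}")
```

For independent zero-mean noise, averaging N aligned frames lowers the noise standard deviation by roughly a factor of sqrt(N), so the 16-frame average should be about four times cleaner than a single frame. Real bursts also need alignment and brightness and color correction, which is where a learned model such as the one described above comes in.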
ISBN: 9798400710759 (print)
This paper presents a novel approach to computational art focusing on mandalas, an iconic heritage of Indian art that has proliferated significantly in recent times. Our software allows users to input a handcrafted mandala and select specific motifs for error rectification. The rectification leverages vector information for geometric discretization alongside a simple GCD rule, adhering to traditional mandala principles. The rectified image, being in vector form, allows various real-time operations in the vector space, such as insertion, deletion, or modification of motifs or layers, facilitating enhancements, enlargements, or the creation of new compositions. We demonstrate the software's merit and versatility through various examples, highlighting its potential for special applications such as digital artistry with mandalas, including teaching and training. This work not only advances the field of computational art but also promises to preserve and enhance the rich tradition of mandala art through modern technology.
ISBN: 9781467385640 (print)
Microaneurysms are small red dots that occur on the retina during the preliminary stage of diabetic retinopathy. Computer-aided microaneurysm screening is necessary to prevent aggravation of the disease and further vision loss. In this paper, Shannon and Tsallis entropy thresholding in conjunction with a Naive Bayes classifier is suggested for microaneurysm detection. Various shape- and intensity-based features are extracted to eliminate falsely detected candidates. The proposed method is evaluated by plotting FROC curves using the Retinopathy Online Challenge (ROC) and DIARETDB1 databases. The proposed method achieves sensitivity values of 0.421 and 0.477 (at a false positive rate of 8) using Shannon and Tsallis entropy thresholding respectively, which is better than some existing methods.
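As an illustration of entropy thresholding, the sketch below picks a grey-level threshold by maximising the sum of the Shannon entropies of the two classes it creates (a Kapur-style criterion). The abstract does not say which exact formulation the authors use, so this choice is an assumption, as are the function names and the toy histogram. The Tsallis variant and the Naive Bayes stage are not shown.

```python
import math


def class_entropy(counts):
    """Shannon entropy of the distribution obtained by normalising counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def shannon_entropy_threshold(hist):
    """Return the bin index t that maximises H(bins 0..t) + H(bins t+1..end).

    hist is a list of non-negative integer pixel counts per grey level.
    Returns None if no split leaves pixels on both sides.
    """
    best_t, best_score = None, -math.inf
    for t in range(len(hist) - 1):
        low, high = hist[: t + 1], hist[t + 1:]
        if sum(low) == 0 or sum(high) == 0:
            continue  # a valid split needs pixels in both classes
        score = class_entropy(low) + class_entropy(high)
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical bimodal histogram: dark candidates, an empty gap, background.
toy_hist = [10, 10, 10, 10, 0, 0, 0, 0, 10, 10, 10, 10]
threshold = shannon_entropy_threshold(toy_hist)
print("chosen threshold bin:", threshold)
```

On this toy histogram the selected threshold falls in the empty gap between the two modes, which is the behaviour wanted when separating dark candidate regions from the background. The loop recomputes both entropies for every candidate threshold, which is simple rather than fast but is adequate for a 256-bin histogram.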
ISBN: 9781467385640 (print)
Image dehazing, using either a single visible image or a pair of visible and near-infrared (NIR) images, has seen growing interest in the last decade for improving visibility in landscape photographs. In this paper, we propose a novel image dehazing scheme that uses a pair of visible and NIR images. The dehazing mechanism estimates the depth map and airlight color from the visible-NIR scene statistics and uses them to form a haze-free image. Experiments on a variety of hazy images demonstrate that our method achieves a higher degree of detail recovery than existing image dehazing algorithms. The resultant images exhibit a very good blend of detail, contrast and color. The proposed algorithm is less computationally demanding and is fully automatic. The results are superior in both visual and quantitative analysis compared to state-of-the-art image dehazing algorithms.
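To show how a depth estimate and an airlight color can be turned into a haze-free pixel, the sketch below inverts the widely used atmospheric scattering model I = J*t + A*(1 - t) with transmission t = exp(-beta * depth). The abstract does not state which image formation model the authors use, so the model itself, the scattering coefficient `beta`, the clamp `t_min`, and the function names are all assumptions. The visible-NIR estimation of depth and airlight is not reproduced here.

```python
import math


def transmission(depth, beta=1.0):
    """Fraction of scene light that survives the haze at a given depth."""
    return math.exp(-beta * depth)


def add_haze(radiance, depth, airlight, beta=1.0):
    """Forward model: blend the scene radiance J with the airlight A per channel."""
    t = transmission(depth, beta)
    return tuple(j * t + a * (1.0 - t) for j, a in zip(radiance, airlight))


def remove_haze(observed, depth, airlight, beta=1.0, t_min=0.1):
    """Invert the model; t is clamped so distant pixels do not amplify noise."""
    t = max(transmission(depth, beta), t_min)
    return tuple((i - a) / t + a for i, a in zip(observed, airlight))


# Hypothetical RGB pixel, depth and airlight, for illustration only.
clean_pixel = (0.2, 0.5, 0.3)
airlight = (0.9, 0.9, 0.95)
depth = 1.5

hazy_pixel = add_haze(clean_pixel, depth, airlight)
restored_pixel = remove_haze(hazy_pixel, depth, airlight)
print("hazy:", hazy_pixel)
print("restored:", restored_pixel)
```

When the depth and airlight are known exactly and the transmission stays above `t_min`, the inversion recovers the original pixel up to rounding. In practice the quality of the result depends on how well those two quantities are estimated, which is the part the visible-NIR statistics are meant to address.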
ISBN: 9781467385640 (print)
In this paper we present a new technique for interacting with the computer in a non-tangible way. Specifically, we have designed a Media Player system controlled by Facial Expressions and Gestures (MP-FEG). We detect and track one landmark point on the finger and 18 landmark points on the lips to capture the movement of the user's finger and lips. The movement patterns are classified into hand gestures and facial expressions using a support vector machine (SVM). We have achieved recognition accuracies of approximately 98.65% and approximately 100% for hand gestures and facial expressions respectively. Each of these actions (5 hand gestures and 3 facial expressions) is associated with a command that controls the video player (e.g., select, play, or pause the video). A perceptual quality analysis by user survey rates the experience of the non-tangible human-computer interaction facilitated by the proposed technique as 'good'.
ISBN: 9781479915880 (print)
This paper presents an optimized and efficient video stabilization technique based on projection curve warping. In most recorded videos, the relative displacement between two consecutive frames is about 3-4 pixels for hand-held recording and 25-30 pixels for moving-platform applications. Based on this experimental data, a Sakoe-Chiba band with a fixed window size is proposed for constraining distance matrix estimation in the dynamic time warping algorithm. Existing projection-based stabilization techniques match intensity values for motion estimation. Any change in the local intensity values, whether caused by intensity variation, moving objects or scene variation, introduces error into the estimated motion. To overcome this problem, a higher-level feature, the shape of the projection curve, is incorporated by matching the local derivative of the curve instead of the intensity values themselves. The robustness and time efficiency of the proposed technique are measured in terms of interframe transformation fidelity and processing time respectively.
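The combination described above, dynamic time warping restricted to a Sakoe-Chiba band and applied to the derivative of a projection curve, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the absolute-difference cost, the use of the most frequent path offset as the shift estimate, the window of 5, the synthetic profile, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def derivative(curve):
    """First-order finite difference of a projection curve."""
    return [b - a for a, b in zip(curve, curve[1:])]


def banded_dtw_path(x, y, window):
    """DTW warping path between x and y using only cells with |i - j| <= window.

    Assumes abs(len(x) - len(y)) <= window so that the end cell is reachable.
    """
    n, m = len(x), len(y)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Sakoe-Chiba band: skip every cell far from the main diagonal.
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last cell to recover the alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    return path[::-1]


def estimate_shift(curve_a, curve_b, window):
    """Most frequent offset j - i along the path between the curve derivatives."""
    path = banded_dtw_path(derivative(curve_a), derivative(curve_b), window)
    return Counter(j - i for i, j in path).most_common(1)[0][0]


# Synthetic projection profile; frame_a is the same scene displaced by 3 samples.
rng = random.Random(1)
profile = [rng.random() for _ in range(70)]
frame_a = profile[3:63]
frame_b = profile[0:60]
shift = estimate_shift(frame_a, frame_b, window=5)
print("estimated displacement:", shift)
```

The band keeps the number of evaluated cells proportional to the curve length times the window size instead of the full distance matrix, which is why bounding the expected inter-frame displacement matters. Matching derivatives rather than raw values makes the alignment insensitive to a constant brightness offset between the two frames.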
ISBN: 9781467385640 (print)
In recent times, there has been a sharp increase in dengue and malaria, especially in urban areas. One of the major reasons for this health hazard is the number of locations where stagnant water can be found. These locations are large breeding grounds for fast-multiplying mosquitoes and other insects. Such areas include traditionally uncovered gutters, terraces of high-rise buildings, and shades above windows (popularly known as chhajja), all of which are hard to reach and access. In this paper we propose the use of a quadcopter to inspect such areas and identify stagnant water patches. Water, being specular in nature, tends to confound traditional image processing methods. Further, the use of a non-traditional camera mounted on a quadcopter presents new challenges. We provide methods to get past such hurdles.
ISBN: 9781467385640 (print)
A novel face recognition method that handles challenging profile and frontal faces is proposed in this paper. The proposed face recognition system consists of pre-processing, feature extraction and classification components. For pre-processing, the face region is extracted using facial landmark points obtained by the tree-structured part model. During feature extraction, SIFT descriptors are computed from the detected face region, and a Spatial Pyramid Matching approach based on the Locality-constrained Linear Coding technique is employed for feature representation. Finally, a multi-class linear SVM classifier performs the classification. Extensive experiments have been performed to show that the proposed algorithm performs satisfactorily compared to existing methods on the IITK, CASIA-FACE-V5, LIBOR, ORL and Extended YALE-B face databases.