ISBN: 9781424442195 (print)
In this paper, we propose a novel framework for automated analysis of surveillance videos. By analysis, we mean summarizing and mining the information in the video to learn usual patterns and discover unusual ones. We approach this video analysis problem by acknowledging that a video contains information at multiple levels and in multiple attributes. Each such component, and the co-occurrences of these component values, plays an important role in characterizing an event as usual or unusual. Therefore, we cluster the video data at multiple levels of abstraction and in multiple attributes and view these clusters as a summary of the information in the video. We apply cluster algebra to mine this summary from multiple perspectives and adapt association learning to automatically select the components that make an event unusual. We also propose a novel incremental clustering algorithm.
ISBN: 9781450347532 (print)
Video matting is an extension of image matting and is used to extract the foreground matte from an arbitrary background in every frame of a video sequence. An automatic scribbling approach based on the relative motion of the foreground object with respect to the background is introduced for video matting. The proposed scribble propagation and the subsequent isolation of foreground and background are much more intuitive than the conventional trimap propagation approach used for video matting. Alpha maps are propagated according to the optical flow estimated from consecutive frames to get a preliminary estimate of the foreground and background in the following frame. Accurate scribbles are placed near the boundary of the foreground region for refining the scribbled image with the help of morphological operations. We show that a high-quality matte of the foreground object can be obtained using a state-of-the-art image matting technique, and that the results obtained using the proposed method are accurate and comparable with those of other state-of-the-art video matting techniques.
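The propagation and scribbling steps described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary mask in place of a real-valued alpha map, an integer per-pixel flow field in place of estimated optical flow, and plain 3x3 erosion and dilation as the morphological operations. All function names (`warp_mask`, `erode`, `dilate`, `scribbles`) are hypothetical.

```python
def warp_mask(mask, flow):
    """Forward-warp a binary mask by an integer flow field of (dx, dy) pairs."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                dx, dy = flow[y][x]
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h:
                    out[ny][nx] = 1
    return out


def erode(mask):
    """3x3 binary erosion; border pixels are always cleared."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(mask[y + j][x + i]
                                for j in (-1, 0, 1) for i in (-1, 0, 1)))
    return out


def dilate(mask):
    """3x3 binary dilation; neighbours outside the image are ignored."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(mask[y + j][x + i]
                                for j in (-1, 0, 1) for i in (-1, 0, 1)
                                if 0 <= y + j < h and 0 <= x + i < w))
    return out


def scribbles(mask):
    """Return (foreground, background) scribble masks on either side of the boundary."""
    foreground = erode(mask)                      # pixels safely inside the object
    grown = dilate(mask)
    background = [[1 - v for v in row] for row in grown]  # pixels safely outside
    return foreground, background


# Toy example: a 3x3 object in a 7x7 frame moving one pixel to the right.
demo_mask = [[1 if 2 <= y <= 4 and 1 <= x <= 3 else 0 for x in range(7)]
             for y in range(7)]
demo_flow = [[(1, 0)] * 7 for _ in range(7)]
demo_warped = warp_mask(demo_mask, demo_flow)
demo_fg, demo_bg = scribbles(demo_warped)
```

The pixels left unlabelled between the two scribble masks form the uncertain band that an image matting method would then resolve.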
ISBN: 9781450347532 (print)
Given a set of sequential exposures, High Dynamic Range imaging is a popular method for obtaining high-quality images of fairly static scenes. However, it typically suffers from ghosting artifacts in scenes with significant motion. Also, existing techniques cannot handle heavily saturated regions in the sequence. In this paper, we propose an approach that handles both of the issues mentioned above. We achieve robustness to motion (both object and camera) and saturation via an energy minimization formulation with spatio-temporal constraints. The proposed approach leverages information from the neighborhood of heavily saturated regions to correct such regions. The experimental results demonstrate the superiority of our method over state-of-the-art techniques for a variety of challenging dynamic scenes.
ISBN: 9781450347532 (print)
The problem of tracking the ball in a soccer video is challenging because of sudden changes in the speed and orientation of the ball. Successful tracking in such a scenario depends on the ability of the algorithm to balance prior constraints continuously against the evidence garnered from the sequence of images. This paper proposes a particle-filter-based algorithm that tracks the ball when it changes direction suddenly or moves at high speed. Exact, deterministic tracking algorithms based on discretized functionals suffer from severe limitations in the form of prior constraints. Our tracking algorithm has shown excellent results even under partial occlusion, which is a major concern in soccer video. We have shown that the proposed tracking algorithm is at least 7.2% better than competing approaches for soccer ball tracking.
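To make the idea of balancing a motion prior against image evidence concrete, here is a minimal sketch of a generic bootstrap particle filter. It is not the algorithm from the paper: the noisy constant-velocity motion model, the Gaussian position likelihood, the noise parameters, and the function names `particle_filter_step` and `track` are all assumptions chosen for illustration.

```python
import math
import random


def particle_filter_step(particles, measurement, rng, motion_std=3.0, meas_std=2.0):
    """One predict / weight / resample cycle over particles (x, y, vx, vy)."""
    # Predict: perturb the velocity so some particles follow abrupt direction changes.
    predicted = []
    for x, y, vx, vy in particles:
        vx += rng.gauss(0.0, motion_std)
        vy += rng.gauss(0.0, motion_std)
        predicted.append((x + vx, y + vy, vx, vy))
    # Weight: Gaussian likelihood of the measured ball position (assumed observation model).
    mx, my = measurement
    weights = []
    for x, y, _, _ in predicted:
        d2 = (x - mx) ** 2 + (y - my) ** 2
        weights.append(math.exp(-0.5 * d2 / meas_std ** 2) + 1e-300)
    total = sum(weights)
    weights = [w / total for w in weights]
    # Estimate: weighted mean position.
    est = (sum(w * p[0] for w, p in zip(weights, predicted)),
           sum(w * p[1] for w, p in zip(weights, predicted)))
    # Resample: multinomial resampling proportional to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, est


def track(measurements, n_particles=1000, seed=0):
    """Track a sequence of (x, y) measurements; returns one estimate per frame."""
    rng = random.Random(seed)
    x0, y0 = measurements[0]
    particles = [(x0, y0, 0.0, 0.0)] * n_particles
    estimates = [(x0, y0)]
    for z in measurements[1:]:
        particles, est = particle_filter_step(particles, z, rng)
        estimates.append(est)
    return estimates


# Toy trajectory: the ball moves right at 5 px/frame, then abruptly reverses.
demo_path = ([(5.0 * t, 20.0) for t in range(11)]
             + [(50.0 - 5.0 * t, 20.0) for t in range(1, 11)])
demo_estimates = track(demo_path)
```

A larger `motion_std` lets the filter recover faster from sudden reversals at the cost of noisier estimates during smooth motion.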
ISBN: 9781479915880 (print)
Dopaminergic imaging using Single Photon Emission Computed Tomography (SPECT) with I-123-Ioflupane has been shown to increase diagnostic accuracy in Parkinson's Disease (PD). Studies show that around 10% of subjects who are clinically diagnosed with PD have SPECT scans in the normal range; these are called Scans Without Evidence of Dopaminergic Deficit (SWEDD) subjects. Subsequent follow-up on these subjects has indicated that they are unlikely to have PD. Detecting and differentiating PD and SWEDD is problematic in the early stages of the disease. Early and accurate diagnosis of PD and SWEDD is crucial for early management and for avoiding unnecessary medical examinations and therapies and their side effects. In this paper, we use SPECT images from 35 Normal, 36 PD and 38 SWEDD subjects, obtained from the Parkinson's Progression Markers Initiative (PPMI) database, to carry out intensity-based surface fitting using a polynomial model. This is the first time that this kind of modeling has been carried out on SPECT images for the characterization of PD. Our results show that the surface profile, in terms of model coefficients and goodness-of-fit parameters, is different for Normal, Early PD and SWEDD subjects. Such modeling may aid in the diagnosis of early PD and SWEDD from SPECT images.
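As an illustration of intensity-based polynomial surface fitting, the sketch below fits a second-order surface z = c0 + c1·x + c2·y + c3·x² + c4·xy + c5·y² to an intensity grid by ordinary least squares and reports R² as a goodness-of-fit value. The polynomial order, the use of R², and the names `fit_surface`, `design_row` and `solve` are assumptions; the abstract does not specify the model actually used.

```python
def design_row(x, y):
    """Monomials of a second-order bivariate polynomial (assumed model order)."""
    return [1.0, x, y, x * x, x * y, y * y]


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_surface(image):
    """Least-squares fit over pixel coordinates; returns (coefficients, R^2)."""
    rows, zs = [], []
    for y, line in enumerate(image):
        for x, z in enumerate(line):
            rows.append(design_row(float(x), float(y)))
            zs.append(float(z))
    k = len(rows[0])
    # Normal equations: (A^T A) c = A^T z.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(k)]
    coeffs = solve(ata, atz)
    pred = [sum(c * v for c, v in zip(coeffs, r)) for r in rows]
    mean = sum(zs) / len(zs)
    ss_res = sum((z - p) ** 2 for z, p in zip(zs, pred))
    ss_tot = sum((z - mean) ** 2 for z in zs)
    return coeffs, 1.0 - ss_res / ss_tot


# Synthetic 8x8 "image" generated from known coefficients (not real SPECT data).
demo_true = [2.0, 0.5, -0.3, 0.1, 0.05, -0.02]
demo_image = [[sum(c * v for c, v in zip(demo_true, design_row(float(x), float(y))))
               for x in range(8)] for y in range(8)]
demo_coeffs, demo_r2 = fit_surface(demo_image)
```

The fitted coefficients and the R² value together form a small feature vector per image, which is the kind of descriptor the abstract compares across subject groups.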
ISBN: 9781450347532 (print)
We propose a framework for the synthesis of natural semi-cursive handwritten Latin script that can find application in text personalization or in the generation of synthetic data for recognition systems. Our method is based on the generation of synthetic n-gram letter glyphs and their subsequent concatenation. We propose a non-parametric, data-driven generation scheme that mimics the variation observed in handwritten glyph samples to synthesize natural-looking glyphs. These synthetic glyphs are then stitched together to form complete words using a spline-based concatenation scheme. Further, as a refinement, our method is able to generate pen-lifts, giving our results a natural semi-cursive look. Through subjective experiments and detailed analysis of the results, we demonstrate the effectiveness of our formulation in generating natural-looking synthetic script.
An intrinsic property of real-aperture imaging is that the observations tend to be defocused. This artifact has been used in an innovative manner by researchers for depth estimation, since the amount of defocus varies with depth in the scene. There have been various methods to model the defocus blur. We model the defocus process using the model of diffusion of heat. The diffusion process has traditionally been used in low-level vision problems such as smoothing, segmentation and edge detection. In this paper, a novel application of the diffusion principle is made for generating the defocus space of the scene. The defocus space is the set of all possible observations of a given scene that can be captured using a physical lens system. Using the notion of defocus space, we estimate the depth in the scene and also generate the corresponding fully focused, equivalent pin-hole image. The algorithm described here also brings out the equivalence of the two modalities, viz. depth from focus and depth from defocus, for structure recovery. (c) 2006 Elsevier B.V. All rights reserved.
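The link between heat diffusion and defocus blur can be sketched numerically. Running the isotropic heat equation u_t = c∇²u for time t is equivalent to Gaussian blurring with variance σ² = 2ct, so a stack of progressively diffused images gives a rough stand-in for a defocus space. The sketch below is an illustration only: it assumes a spatially constant diffusion coefficient (uniform depth), an explicit finite-difference scheme with unit time step, and reflecting borders; `diffuse` and `defocus_space` are hypothetical names.

```python
def diffuse(image, c=0.2, steps=1):
    """Explicit heat-equation iterations; stable in 2D for c <= 0.25 with unit time step."""
    h, w = len(image), len(image[0])
    u = [[float(v) for v in row] for row in image]
    for _ in range(steps):
        nxt = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                # Replicated borders give zero flux, so total intensity is conserved.
                up = u[max(y - 1, 0)][x]
                down = u[min(y + 1, h - 1)][x]
                left = u[y][max(x - 1, 0)]
                right = u[y][min(x + 1, w - 1)]
                nxt[y][x] = u[y][x] + c * (up + down + left + right - 4.0 * u[y][x])
        u = nxt
    return u


def defocus_space(image, c=0.2, levels=4, steps_per_level=5):
    """Stack of images with increasing diffusion time (increasing blur)."""
    space = [[[float(v) for v in row] for row in image]]
    for _ in range(levels):
        space.append(diffuse(space[-1], c, steps_per_level))
    return space


# Toy example: a single bright point spreads out as diffusion time grows.
demo_point = [[1.0 if (x, y) == (4, 4) else 0.0 for x in range(9)] for y in range(9)]
demo_space = defocus_space(demo_point)
demo_peaks = [max(max(row) for row in img) for img in demo_space]
demo_sums = [sum(map(sum, img)) for img in demo_space]
```

Handling scenes with varying depth would require a spatially varying coefficient, which this sketch deliberately leaves out.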
In this paper, we consider a new scheme for image database retrieval based on the fast Hermite projection method. The database contained 4100 images. The method is based on an expansion into a series of eigenfunctions of the Fourier transform. Photo normalization includes the following preprocessing steps: resampling, corner detection, rotation, perspective and parallelogram elimination, painting cutting, ranging and color plane elimination. The search queries the database by fast Hermite coefficients and retrieves the nearest record by quadratic discrepancy.
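The Hermite functions are eigenfunctions of the Fourier transform, and a signal can be summarized by its projection coefficients onto them. The sketch below computes those coefficients for a 1D signal and compares two coefficient vectors. It is not the fast method of the paper: it substitutes plain trapezoidal integration for whatever quadrature the authors use, and it assumes that "quadratic discrepancy" means the sum of squared coefficient differences. All function names are hypothetical.

```python
import math


def hermite_functions(x, count):
    """Orthonormal Hermite functions psi_0..psi_{count-1} at x via the stable recurrence."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if count > 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for n in range(1, count - 1):
        psi.append(math.sqrt(2.0 / (n + 1)) * x * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi[:count]


def hermite_coefficients(f, count, lo=-10.0, hi=10.0, samples=2001):
    """Projection coefficients c_n = integral of f(x) * psi_n(x), by the trapezoidal rule."""
    step = (hi - lo) / (samples - 1)
    coeffs = [0.0] * count
    for i in range(samples):
        x = lo + i * step
        weight = step * (0.5 if i in (0, samples - 1) else 1.0)
        fx = f(x)
        for n, p in enumerate(hermite_functions(x, count)):
            coeffs[n] += weight * fx * p
    return coeffs


def quadratic_discrepancy(a, b):
    """Sum of squared differences between two coefficient vectors (assumed definition)."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


# Sanity check: projecting psi_2 onto the basis should give the unit vector e_2.
demo_coeffs = hermite_coefficients(lambda x: hermite_functions(x, 3)[2], 5)
```

For retrieval, each database record would store such a coefficient vector, and the query would return the record with the smallest discrepancy to the query vector.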
ISBN: 9781467385640 (print)
Video-based multimedia services have shown major growth in recent years. Although the video coding recommendation ITU-T H.265/HEVC is in operation, the majority of real-time video applications, including video conferencing and live streaming of events, rely mainly on the less computationally intensive H.264/AVC coding standard, which is well established in the video industry. Rate control algorithms are indispensable for delivering superior-quality video over limited-bandwidth connections. In this paper, we propose a rate control technique that provides consistent-quality video output over time in a limited-bandwidth video conferencing scenario. Using the concepts of video traffic prediction and a linear complexity model, we compute the complexity of the video sequence in real time. We present a better estimate of the quantization parameter at the GOP (group of pictures) layer. Our proposed method maintains similar quality for the reconstructed video when compared to the rate control scheme of JM 19.0 implemented in a real-time video conferencing system, with restricted quality variation. On average, a 22% improvement has been observed in the standard deviation of frame PSNR values when compared to the JM 19.0 rate control scheme implemented in a high-resolution video conferencing scenario.
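The quality-consistency metric mentioned above, the standard deviation of per-frame PSNR, can be computed as in the following sketch. It assumes 8-bit video (peak value 255) and interprets "improvement" as the relative reduction in standard deviation with respect to the baseline; the abstract does not state the exact formula, and the function names are hypothetical.

```python
import math
import statistics


def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a frame with the given mean squared error."""
    return 10.0 * math.log10(peak * peak / mse)


def psnr_std(frame_mses):
    """Population standard deviation of per-frame PSNR; lower means steadier quality."""
    return statistics.pstdev([psnr(m) for m in frame_mses])


def relative_improvement(baseline_std, proposed_std):
    """Percentage reduction in PSNR standard deviation (assumed definition)."""
    return 100.0 * (baseline_std - proposed_std) / baseline_std
```

Under this definition, a baseline standard deviation of 1.00 dB and a proposed one of 0.78 dB correspond to a 22% improvement.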
ISBN: 9781450366151 (print)
Describing the contents of an image automatically has been a fundamental problem in the fields of artificial intelligence and computer vision. Existing approaches are either top-down, which start from a simple representation of an image and convert it into a textual description; bottom-up, which come up with attributes describing numerous aspects of an image to form the caption; or a combination of both. Recurrent neural networks (RNN) enhanced by Long Short-Term Memory networks (LSTM) have become a dominant component of several frameworks designed for solving the image captioning task. Despite their ability to reduce the vanishing gradient problem and capture dependencies, they are inherently sequential across time. In this work, we propose two novel approaches, one top-down and one bottom-up, each of which dispenses with recurrence entirely by incorporating a Transformer, a network architecture for generating sequences that relies entirely on the attention mechanism. Adaptive positional encodings for the spatial locations in an image and a new regularization cost during training are introduced. The ability of our model to focus automatically on salient regions in the image is demonstrated visually. Experimental evaluation of the proposed architecture on the MS-COCO dataset is performed to exhibit the superiority of our method.
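The attention mechanism that replaces recurrence in a Transformer can be shown in a few lines. The sketch below implements standard scaled dot-product attention only; it does not include the adaptive positional encodings or the regularization cost proposed in the paper, whose details the abstract does not give. The toy queries, keys and values are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, attention weights)."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of the query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        all_weights.append(weights)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs, all_weights


# Toy example: two image regions; the first query matches region 0, the second is neutral.
demo_keys = [[1.0, 0.0], [0.0, 1.0]]
demo_values = [[10.0, 0.0], [0.0, 10.0]]
demo_out, demo_weights = attention([[1.0, 0.0], [0.0, 0.0]], demo_keys, demo_values)
```

Because every query attends to all keys at once, the computation has no sequential dependence across positions, which is the property the abstract contrasts with RNN and LSTM decoders. The per-query weights are also what one would visualize to show which image regions the model attends to.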