ISBN: 9781450347532 (print)
Tracking the ball in a soccer video is challenging because the ball changes speed and direction suddenly. Successful tracking in such a scenario depends on the ability of the algorithm to continuously balance prior constraints against the evidence gathered from the image sequence. This paper proposes a particle-filter-based algorithm that tracks the ball even when it changes direction suddenly or moves at high speed. Exact, deterministic tracking algorithms based on a discretized functional suffer from severe limitations in the form of prior constraints. Our tracking algorithm shows excellent results even under partial occlusion, which is a major concern in soccer video. We show that the proposed tracking algorithm is at least 7.2% better than competing approaches for soccer ball tracking.
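The balance between a motion prior and image evidence can be sketched with a generic bootstrap particle filter. This is only an illustration under stated assumptions, not the authors' algorithm: the state is a 1-D position, the motion prior is a Gaussian random walk, the likelihood is Gaussian, and all function names are hypothetical.

```python
import math
import random

def predict(particles, motion_std, rng):
    """Prior step: diffuse each particle with Gaussian motion noise (assumed random walk)."""
    return [x + rng.gauss(0.0, motion_std) for x in particles]

def update(particles, z, obs_std):
    """Evidence step: weight particles by an assumed Gaussian likelihood of observation z."""
    w = [math.exp(-0.5 * ((x - z) / obs_std) ** 2) for x in particles]
    total = sum(w)
    if total == 0.0:
        # Observation is far from every particle: fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [wi / total for wi in w]

def estimate(particles, weights):
    """Posterior mean of the tracked position."""
    return sum(x * w for x, w in zip(particles, weights))

def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to their weights."""
    return rng.choices(particles, weights=weights, k=len(particles))
```

A larger `motion_std` lets the particle cloud follow abrupt changes in direction at the cost of a noisier estimate, which is the trade-off the abstract describes in general terms.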
ISBN: 9781479915880 (print)
Dopaminergic imaging using Single Photon Emission Computed Tomography (SPECT) with I-123-Ioflupane has been shown to increase the diagnostic accuracy in Parkinson's Disease (PD). Studies show that around 10% of subjects who are clinically diagnosed with PD have SPECT scans in the normal range; these are called Scans Without Evidence of Dopaminergic Deficit (SWEDD) subjects. Subsequent follow-up on these subjects has indicated that they are unlikely to have PD. Detecting and differentiating PD and SWEDD is difficult in the early stages of the disease. Early and accurate diagnosis of PD and SWEDD is crucial for early management and for avoiding unnecessary medical examinations and therapies, along with their side effects. In this paper, we use SPECT images from 35 Normal, 36 PD and 38 SWEDD subjects, obtained from the Parkinson's Progression Markers Initiative (PPMI) database, to carry out intensity-based surface fitting using a polynomial model. This is the first time that this kind of modeling has been carried out on SPECT images for the characterization of PD. Our results show that the surface profile, in terms of model coefficients and goodness-of-fit parameters, differs among Normal, Early PD and SWEDD subjects. This kind of modeling may aid in the diagnosis of early PD and SWEDD from SPECT images.
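Intensity-based polynomial surface fitting can be illustrated with an ordinary least-squares fit. The sketch below assumes a second-order polynomial in the pixel coordinates and uses R² as the goodness-of-fit measure; the abstract states neither the polynomial order nor the specific fit statistics, so both are assumptions, and the function names are hypothetical.

```python
def design_row(x, y):
    """Terms of an assumed quadratic surface: 1, x, y, x^2, xy, y^2."""
    return [1.0, x, y, x * x, x * y, y * y]

def solve(a, b):
    """Solve the linear system a*c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        c[i] = (m[i][n] - sum(m[i][k] * c[k] for k in range(i + 1, n))) / m[i][i]
    return c

def fit_surface(points):
    """points: list of (x, y, intensity). Returns (coefficients, r_squared)."""
    rows = [design_row(x, y) for x, y, _ in points]
    z = [p[2] for p in points]
    n = 6
    # Normal equations: (A^T A) c = A^T z
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(n)]
    coeffs = solve(ata, atz)
    pred = [sum(c * t for c, t in zip(coeffs, r)) for r in rows]
    mean = sum(z) / len(z)
    ss_res = sum((zi - pi) ** 2 for zi, pi in zip(z, pred))
    ss_tot = sum((zi - mean) ** 2 for zi in z)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return coeffs, r2
```

The returned coefficients and R² correspond to the kind of per-subject descriptors the abstract compares across groups.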
ISBN: 9781450347532 (print)
We propose a framework for synthesizing natural, semi-cursive handwritten Latin script, with applications in text personalization and in generating synthetic data for recognition systems. Our method is based on generating synthetic n-gram letter glyphs and then concatenating them. We propose a non-parametric, data-driven generation scheme that mimics the variation observed in handwritten glyph samples to synthesize natural-looking glyphs. These synthetic glyphs are then stitched together to form complete words using a spline-based concatenation scheme. As a further refinement, our method generates pen-lifts, giving the results a natural semi-cursive look. Through subjective experiments and detailed analysis of the results, we demonstrate the effectiveness of our formulation in generating natural-looking synthetic script.
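One simple way to picture spline-based concatenation is a cubic Hermite segment that joins the end of one glyph stroke to the start of the next while matching the pen direction at both ends. The abstract does not say which spline family is used, so the choice of Hermite interpolation and the name `hermite_join` are assumptions made for illustration.

```python
def hermite_join(p0, m0, p1, m1, steps=8):
    """Sample a cubic Hermite curve from p0 (tangent m0) to p1 (tangent m1).

    p0, m0, p1, m1 are (x, y) tuples; returns steps + 1 points along the join.
    """
    pts = []
    for i in range(steps + 1):
        t = i / steps
        h00 = 2 * t**3 - 3 * t**2 + 1   # weight of the start point
        h10 = t**3 - 2 * t**2 + t       # weight of the start tangent
        h01 = -2 * t**3 + 3 * t**2      # weight of the end point
        h11 = t**3 - t**2               # weight of the end tangent
        pts.append(tuple(
            h00 * a + h10 * b + h01 * c + h11 * d
            for a, b, c, d in zip(p0, m0, p1, m1)
        ))
    return pts
```

For example, `hermite_join((0.0, 0.0), (1.0, 0.0), (3.0, 1.0), (1.0, 0.0))` yields a smooth connecting stroke that leaves and arrives horizontally.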
An intrinsic property of real-aperture imaging is that the observations tend to be defocused. Researchers have exploited this artifact for depth estimation, since the amount of defocus varies with depth in the scene. Various methods have been proposed to model the defocus blur. We model the defocus process using heat diffusion. The diffusion process has traditionally been used in low-level vision problems such as smoothing, segmentation and edge detection. In this paper, the diffusion principle is applied in a novel way to generate the defocus space of the scene. The defocus space is the set of all possible observations of a given scene that can be captured using a physical lens system. Using the notion of defocus space, we estimate the depth in the scene and also generate the corresponding fully focused, equivalent pin-hole image. The algorithm described here also brings out the equivalence of the two modalities for structure recovery, namely depth from focus and depth from defocus. (c) 2006 Elsevier B.V. All rights reserved.
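The link between heat diffusion and blur can be sketched with an explicit finite-difference solver for the isotropic heat equation: running it longer produces a more strongly blurred image. This is a simplified illustration, not the paper's method. It assumes a uniform diffusion coefficient across the image, whereas depth-dependent defocus would require a spatially varying one, and the name `diffuse` is hypothetical.

```python
def diffuse(img, steps, dt=0.2):
    """Blur a 2-D image (list of rows of floats) by explicit heat diffusion.

    Border pixels are replicated, so total intensity is conserved.
    The explicit scheme is stable for dt <= 0.25.
    """
    h, w = len(img), len(img[0])
    u = [row[:] for row in img]
    for _ in range(steps):
        nxt = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                up = u[max(i - 1, 0)][j]
                down = u[min(i + 1, h - 1)][j]
                left = u[i][max(j - 1, 0)]
                right = u[i][min(j + 1, w - 1)]
                # Discrete Laplacian drives each pixel toward its neighbours.
                nxt[i][j] = u[i][j] + dt * (up + down + left + right - 4.0 * u[i][j])
        u = nxt
    return u
```

Applied to a single bright pixel, the result spreads into a blob whose width grows with the number of steps, much as a point light source spreads under increasing defocus.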
ISBN: 9781467385640 (print)
Video-based multimedia services have shown major growth in recent years. Although the video coding recommendation ITU-T H.265/HEVC is in operation, the majority of real-time video applications, including video conferencing and live streaming of events, still rely mainly on the less computationally intensive H.264/AVC coding standard, which is well established in the video industry. Rate control algorithms are indispensable for delivering high-quality video over limited-bandwidth connections. In this paper, we propose a rate control technique that provides video output of consistent quality over time in a limited-bandwidth video conferencing scenario. Using video traffic prediction and a linear complexity model, we compute the complexity of the video sequence in real time. We present a better estimate of the quantization parameter at the GOP (group of pictures) layer. Compared with the rate control scheme of JM 19.0 implemented in a real-time video conferencing system, our proposed method maintains similar quality for the reconstructed video while restricting quality variation. On average, a 22% improvement is observed in the standard deviation of frame PSNR values compared with the JM 19.0 rate control scheme in a high-resolution video conferencing scenario.
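The consistency metric mentioned above, the standard deviation of per-frame PSNR, can be computed as in the sketch below. It assumes 8-bit video (peak value 255) and a population standard deviation; the abstract does not specify either, and the function names are hypothetical.

```python
import math
import statistics

def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a frame with the given mean squared error."""
    return 10.0 * math.log10(peak * peak / mse)

def psnr_std_improvement(baseline_psnr, proposed_psnr):
    """Percentage reduction in the standard deviation of frame PSNR values.

    A positive result means the proposed scheme varies less from frame to frame.
    """
    base = statistics.pstdev(baseline_psnr)
    prop = statistics.pstdev(proposed_psnr)
    return 100.0 * (base - prop) / base
```

A lower standard deviation at similar mean PSNR indicates steadier perceived quality, which is what a consistent-quality rate control scheme aims for.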
ISBN: 9781450366151 (print)
Describing the contents of an image automatically has been a fundamental problem in artificial intelligence and computer vision. Existing approaches are top-down, starting from a simple representation of an image and converting it into a textual description; bottom-up, deriving attributes that describe numerous aspects of an image to form the caption; or a combination of both. Recurrent neural networks (RNN) enhanced by Long Short-Term Memory networks (LSTM) have become a dominant component of several frameworks designed for the image captioning task. Despite their ability to reduce the vanishing gradient problem and capture dependencies, they are inherently sequential across time. In this work, we propose two novel approaches, one top-down and one bottom-up, each of which dispenses with recurrence entirely by using a Transformer, a network architecture for generating sequences that relies entirely on the attention mechanism. We introduce adaptive positional encodings for the spatial locations in an image and a new regularization cost during training. The ability of our model to focus automatically on salient regions in the image is demonstrated visually. Experimental evaluation of the proposed architecture on the MS-COCO dataset is performed to exhibit the superiority of our method.
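The attention mechanism that replaces recurrence can be illustrated with plain scaled dot-product attention, the standard building block of the Transformer. The sketch below is generic and does not include the adaptive positional encodings or the regularization cost proposed in the paper; vectors are plain Python lists and the function names are chosen for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def attention(queries, keys, values):
    """Scaled dot-product attention.

    queries: list of d-dim vectors; keys: list of d-dim vectors;
    values: list of vectors, one per key. Returns one output vector per query.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([
            sum(wi * v[c] for wi, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return out
```

Every query attends to all keys at once, so no step depends on the output of a previous time step. In a captioning model, the keys and values would typically be image region features and the queries would come from the partial caption.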
ISBN: 9781450347532 (print)
We propose a novel technique for event geo-localization (i.e. the 2-D location of the event on the surface of the earth) from the sensor metadata of crowd-sourced videos collected from smartphone devices. With the help of sensors available in smartphones, such as the digital compass and GPS receiver, we collect metadata such as the camera viewing direction and location along with the video. Event localization is then posed as a constrained optimization problem over the available sensor metadata. Our results on the collected experimental data show correct localization of events, which is particularly challenging for classical vision-based methods because of the nature of the visual data. Since our approach uses only sensor metadata, the computational overhead is much lower than it would be if video information were used. Finally, we illustrate the benefits of our work in analyzing video data from multiple sources through geo-localization.
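A simplified version of this idea is to find the point closest, in the least-squares sense, to all camera viewing lines. The sketch below is an unconstrained stand-in for the constrained optimization described in the abstract. It assumes camera positions already projected to local planar east/north coordinates in metres and compass bearings measured clockwise from north, and the name `locate_event` is hypothetical.

```python
import math

def locate_event(cameras):
    """Least-squares intersection of viewing lines.

    cameras: list of (east, north, bearing_deg), bearing clockwise from north.
    Returns the (east, north) point minimizing the summed squared
    perpendicular distance to every viewing line.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for e, n, bearing in cameras:
        t = math.radians(bearing)
        dx, dy = math.sin(t), math.cos(t)      # unit viewing direction
        # Projector onto the line's normal space: I - d d^T
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * e + m12 * n
        b2 += m12 * e + m22 * n
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("viewing directions are parallel; location is not determined")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

This treats each viewing direction as an infinite line, so it can return a point behind a camera. Requiring the solution to lie in front of every camera is one example of a constraint a fuller formulation could add.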
ISBN: 9781450347532 (print)
Matrix factorization is widely used to learn a joint, compact latent subspace when multiple views or modalities of objects (belonging to a single domain or to multiple domains) are available. Our work addresses the problem of learning an informative latent subspace by imparting supervision to matrix factorization for fusing multiple modalities of objects. We devise simpler supervised additive updates in place of multiplicative updates, which makes the method scalable to large datasets. To increase classification accuracy, we integrate the label information of images with the process of learning a semantically enhanced subspace. We perform extensive experiments on two publicly available standard image datasets of NUS WIDE and compare the results with state-of-the-art subspace learning and fusion techniques to evaluate the efficacy of our framework. The improvement obtained in classification accuracy confirms the effectiveness of our approach. In essence, we propose a novel method for supervised data fusion that leads to supervised subspace learning.
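The difference between additive and multiplicative updates can be seen in a plain gradient-descent factorization, where each factor is adjusted by adding a step along the negative gradient of the reconstruction error. The sketch below is unsupervised and single-view, so it omits the label term and the multi-modal fusion of the paper; the helper names and the learning rate are assumptions.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def residual(x, w, h):
    """Reconstruction residual X - WH."""
    wh = matmul(w, h)
    return [[x[i][j] - wh[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def loss(x, w, h):
    """Squared Frobenius norm of the residual."""
    return sum(v * v for row in residual(x, w, h) for v in row)

def additive_step(x, w, h, lr):
    """One additive (gradient) update of both factors for the loss ||X - WH||^2."""
    r = residual(x, w, h)
    gw = matmul(r, transpose(h))   # descent direction for W (constant factor folded into lr)
    gh = matmul(transpose(w), r)   # descent direction for H
    w2 = [[w[i][k] + lr * gw[i][k] for k in range(len(w[0]))] for i in range(len(w))]
    h2 = [[h[k][j] + lr * gh[k][j] for j in range(len(h[0]))] for k in range(len(h))]
    return w2, h2
```

Each step costs only a few matrix products, which is one reason additive schemes of this kind are often easy to scale; a supervised variant would add a label-dependent term to the gradient.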
ISBN: 9781450347532 (print)
In Digital Subtraction Angiography (DSA), non-rigid registration of the mask and contrast images to reduce motion artifacts is a challenging problem. In this paper, we propose a novel stratified registration framework for DSA artifact reduction. We use quad-trees to generate a non-uniform grid of control points and obtain the sub-pixel displacement offsets using Random Walker (RW). We also propose a sequencing logic for the control points and an incremental LU decomposition approach that enables reuse of computations in the RW step. We tested our approach on clinical data sets and found that our registration framework performs comparably to graph-cuts (at the same partition level) in regions where 95% artifact reduction was achieved. The optimization step achieves a speed improvement of 4.2 times with respect to graph-cuts.
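A quad-tree yields a non-uniform grid by subdividing only those blocks that are not homogeneous enough, so control points cluster where the image has detail. The sketch below assumes a square image with a power-of-two side and uses the intensity range within a block as the splitting criterion; the paper's actual criterion is not given in the abstract, and the function names are hypothetical.

```python
def quadtree_leaves(img, x, y, size, tol):
    """Recursively split the block at (x, y) with the given side length.

    A block becomes a leaf when its intensity range is within tol
    or when it is a single pixel. Returns a list of (x, y, size) leaves.
    """
    block = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size == 1 or max(block) - min(block) <= tol:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_leaves(img, x + dx, y + dy, half, tol)
    return leaves

def control_points(leaves):
    """Place one control point at the centre of every leaf block."""
    return [(x + s / 2.0, y + s / 2.0) for x, y, s in leaves]
```

Calling `quadtree_leaves(img, 0, 0, len(img), tol)` on a mostly flat image returns a few large blocks plus small blocks around the regions that change, which keeps the number of displacement unknowns low.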
Image processing is often considered a good candidate for parallel processing because of the large volumes of data and the complex algorithms commonly encountered. This paper presents a tutorial introduction to the field of parallel image processing. After introducing the classes of parallel processing, it gives a brief review of architectures for parallel image processing. Software design for low-level image processing and parallelism in high-level image processing are discussed, and an application of parallel processing to handwritten postcode recognition is described. The paper concludes with a look at future technology and market trends.
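Data parallelism in low-level image processing usually means splitting the image into regions and applying the same operation to each region independently. The sketch below illustrates that decomposition with a pixel-wise threshold applied to horizontal strips. It is a modern illustration rather than anything from the paper, the function names are hypothetical, and it uses Python threads, which demonstrate the structure but may not speed up CPU-bound pure-Python work.

```python
from concurrent.futures import ThreadPoolExecutor

def threshold_rows(rows, level):
    """Point operation: binarize every pixel in a strip of rows."""
    return [[255 if p >= level else 0 for p in row] for row in rows]

def parallel_threshold(img, level, workers=4):
    """Split the image into row strips and threshold the strips concurrently."""
    if not img:
        return []
    n = max(1, min(workers, len(img)))
    step = -(-len(img) // n)  # ceiling division: rows per strip
    strips = [img[i:i + step] for i in range(0, len(img), step)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        # map() returns results in strip order, so rows are reassembled in place.
        parts = pool.map(threshold_rows, strips, [level] * len(strips))
        return [row for part in parts for row in part]
```

Point operations like this need no communication between strips. Neighbourhood operations such as filtering would additionally require each strip to see a border of rows from its neighbours.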