ISBN: (Print) 9781424442195
This paper proposes efficient and robust methods for tracking a moving object at multiple spatial and temporal resolution levels. The efficiency comes from optimising the amounts of spatial and temporal data processed. The robustness results from multi-level coarse-to-fine state-space searching. Tracking across resolution levels incurs an accuracy-versus-speed trade-off. For example, tracking at higher resolutions incurs greater processing cost but estimates the position of the moving object more accurately. We propose a novel spatial multi-scale tracker that tracks at the optimal accuracy-versus-speed operating point. Next, we relax this requirement to propose a multi-resolution tracker that operates at a minimum acceptable performance level. Finally, we extend these ideas to a multi-resolution spatio-temporal tracker. We show results of extensive experimentation in support of the proposed approaches.
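The coarse-to-fine search mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's tracker: it is a generic image-pyramid template search, and the function names, the 2x2 averaging pyramid, the sum-of-squared-differences cost, and the `levels` and `radius` parameters are all assumptions made for illustration.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def ssd_search(frame, template, candidates):
    """Return the in-bounds candidate (row, col) with the smallest SSD cost."""
    th, tw = template.shape
    best, best_cost = None, np.inf
    for r, c in candidates:
        if r < 0 or c < 0 or r + th > frame.shape[0] or c + tw > frame.shape[1]:
            continue  # skip windows that fall outside the frame
        cost = np.sum((frame[r:r + th, c:c + tw] - template) ** 2)
        if cost < best_cost:
            best, best_cost = (r, c), cost
    return best

def coarse_to_fine_locate(frame, template, levels=3, radius=2):
    """Locate `template` in `frame`: exhaustive search at the coarsest level,
    then a small local refinement at each finer level (hypothetical helper)."""
    frames, templates = [frame.astype(float)], [template.astype(float)]
    for _ in range(levels - 1):
        frames.append(downsample(frames[-1]))
        templates.append(downsample(templates[-1]))

    # Exhaustive search is cheap at the coarsest level.
    f, t = frames[-1], templates[-1]
    candidates = [(r, c)
                  for r in range(f.shape[0] - t.shape[0] + 1)
                  for c in range(f.shape[1] - t.shape[1] + 1)]
    pos = ssd_search(f, t, candidates)

    # Project the estimate to the next finer level and refine locally.
    for lvl in range(levels - 2, -1, -1):
        r0, c0 = 2 * pos[0], 2 * pos[1]
        candidates = [(r0 + dr, c0 + dc)
                      for dr in range(-radius, radius + 1)
                      for dc in range(-radius, radius + 1)]
        pos = ssd_search(frames[lvl], templates[lvl], candidates)
    return pos
```

Fewer pyramid levels cost more computation but depend less on coarse-level evidence, which is one simple way to picture the accuracy-versus-speed trade-off the abstract describes.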
ISBN: (Print) 9781424442195
Detection of representative frames, also called key-frames, is essential for efficient indexing, browsing and retrieval of video data and also for video summarization. Once a video stream is segmented into shots, the representative frames or key-frames for each shot are selected. The number of such frames in a shot may vary depending on the variation in the content. Thus, for a wide variety of shots, automatic selection of a suitable number of representative frames still remains a challenge. In this work, we propose a novel scheme for key-frame detection by dividing an available shot into subshots using hypothesis testing and majority voting. Each subshot is supposed to be uniform in terms of visual content. Then, for each subshot, the frame rendering the highest fidelity is extracted as the key-frame. Experimental results show that the scheme works satisfactorily for a wide variety of shots.
ISBN: (Print) 9781424442195
A complete design and implementation of a robust palm biometrics recognition and verification system is presented. The paper develops a comprehensive set of hand geometry features and an original algorithm, built from a fundamental level, that robustly computes a selected set of features so as to minimize the effect of palm placement. These features are combined with chrominance features to achieve recognition and verification accuracy that is significant with respect to the current state of palm biometrics research. The algorithm has been kept robust, simple and computationally efficient, while the implementation is relatively inexpensive. The experiments on recognition, Principal Components Analysis (PCA) and verification, on a dataset of 100 users, strongly confirm the utility of robustly calculating a comprehensive set of scientifically selected hand geometry features. The results uncover the potential of hand geometry features and confirm that the system can be used in medium to high security environments.
ISBN: (Print) 9781424442195
This paper discusses the image decomposition problem in 3-layer MRC model based coding of scanned (noisy) document images. A widely used approach for document decomposition is to divide the document image into blocks and split the pixel histogram of each block into two halves by minimizing the sum of the variances of its pixels about the means of the halves. We propose to split a block by minimizing the variance of one half with respect to its minimum pixel and the variance of the other half with respect to its maximum pixel. Our goal is to increase the gap between the two halves by avoiding the splitting of any cluster of pixels across both halves. This should help reduce the complexity of the generated mask. Moreover, we do not decompose a block if it has no edge points, again to reduce the mask complexity. We also implement a noise reduction heuristic in the mask layer to correct the placement of transition pixels. We provide a simple analysis and evaluate block energy in terms of the DCT coefficients of the resulting FG/BG layer blocks. Experimental results show that the code size of the mask layer of our test images, obtained using the proposed processing, is reduced to nearly half that of the mask obtained by a straightforward 3-MRC implementation.
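The two block-splitting criteria described above can be compared with a minimal sketch. The abstract does not say which half is paired with which extreme, nor how the variance is normalised, so the code makes loud assumptions: the darker half is measured from its minimum pixel, the brighter half from its maximum pixel, and both costs are unnormalised sums of squared deviations. All function names are hypothetical.

```python
import numpy as np

def split_cost_mean(low, high):
    """Conventional criterion: squared deviation of each half about its own mean."""
    return np.sum((low - low.mean()) ** 2) + np.sum((high - high.mean()) ** 2)

def split_cost_extremes(low, high):
    """ASSUMED reading of the proposed criterion: the darker half is measured
    from its minimum pixel and the brighter half from its maximum pixel."""
    return np.sum((low - low.min()) ** 2) + np.sum((high - high.max()) ** 2)

def best_threshold(block, cost):
    """Try every grey level present in the block as a threshold t
    (low: pixels <= t, high: pixels > t) and return the cheapest one.
    Returns None for a flat block, which cannot be split."""
    pixels = block.ravel().astype(float)
    levels = np.unique(pixels)
    best_t, best_c = None, np.inf
    for t in levels[:-1]:  # the largest level would leave the high half empty
        low, high = pixels[pixels <= t], pixels[pixels > t]
        c = cost(low, high)
        if c < best_c:
            best_t, best_c = t, c
    return best_t

# A clearly bimodal toy block: dark text pixels and a bright background.
block = np.array([[10, 12, 11, 13],
                  [200, 205, 198, 202]])
t_mean = best_threshold(block, split_cost_mean)
t_ext = best_threshold(block, split_cost_extremes)
mask = block > t_ext  # binary mask layer for this block
```

On a cleanly bimodal block both criteria agree. They differ only when a block contains intermediate grey levels, which is the case the abstract is concerned with.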
ISBN: (Print) 9781424442195
Range images captured from range scanning devices, such as laser scanners or PMD (photonic mixer device) cameras, often suffer from low resolution and/or missing regions due to occlusions, reflectivity, limited scanning area, sensor imperfections, etc. In this work, we address both issues in a single framework. We employ Bayesian regularization for resolution enhancement and inpainting in a general multi-image super-resolution scenario. We modify the traditional image formation model used in image/range super-resolution to account for the missing regions. This modification is important for coupling the inpainting process with super-resolution. We also stress the importance of prior information in the integration and note that we require the priors to constrain the solution differently for inpainting and for super-resolution. The proposed inhomogeneous prior handles the requirements for inpainting as well as super-resolution. The modification of the imaging model and the formulation of the inhomogeneous prior are both important for the success of the integration. Our results show inpainting of large missing regions, reduction in distortions and good preservation of details at high resolution.
ISBN: (Print) 9781424442195
A multi-view face detection method for video, based on experiential sampling and a Meanshift tracker, is proposed in this paper. In this framework, instead of performing face detection at every position in a frame, we determine certain key positions at which to run the multi-view face detectors. These key positions are statistical samples drawn from a density function that is estimated from color cues, past detection results, Meanshift tracker results and a temporal continuity model. These samples are then propagated using a particle filter framework. We use a Meanshift tracker to track faces that are missed by the multi-view face detectors. Our framework results in a significant reduction in computation time and supports detection over the complete 180-degree range of face poses. We also introduce a novel likelihood measure for track termination, which becomes important when tracking is used for detection purposes.
ISBN: (Print) 9781424442195
We describe a novel method for human activity segmentation and interpretation in surveillance applications based on Gabor filter-bank features. A complex human activity is modeled as a sequence of elementary human actions such as walking, running, jogging, boxing and hand-waving. Since a human silhouette can be modeled by a set of rectangles, the elementary human actions can be modeled as a sequence of sets of rectangles with different orientations and scales. The activity segmentation is based on Gabor filter-bank features and normalized spectral clustering. The feature trajectories of an action category are learnt from training example videos using Dynamic Time Warping. The combined segmentation and recognition processes are very efficient, as both algorithms share the same framework and the Gabor features computed for the former can be reused for the latter. We also propose a simple shadow detection technique to extract a good silhouette, which is necessary for accurate action recognition.
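Dynamic Time Warping, which the abstract uses to learn feature trajectories, can be sketched as follows. This is the textbook DTW recurrence with a Euclidean frame-to-frame cost, not necessarily the variant used in the paper. The nearest-template labelling step and the names `dtw_distance` and `classify` are assumptions added for illustration.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two feature trajectories.
    Each sequence is (length,) or (length, feature_dim)."""
    a = np.asarray(seq_a, dtype=float)
    b = np.asarray(seq_b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat scalars as 1-D feature vectors
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    # D[i, j] = cheapest alignment cost of a[:i] with b[:j].
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # a advances, b holds
                                 D[i, j - 1],      # b advances, a holds
                                 D[i - 1, j - 1])  # both advance
    return D[n, m]

def classify(query, templates):
    """Hypothetical nearest-template rule: return the label whose stored
    trajectory has the smallest DTW cost to the query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

Because DTW tolerates differences in speed, a slow and a fast instance of the same action can align at low cost, even though their trajectories have different lengths.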
ISBN: (Print) 9781424442195
In several scientific areas, data are sampled irregularly and insufficiently due to practical and economic limitations. The use of such data in applications results in artifacts and poor spatial resolution. Therefore, before being used, the data must be interpolated onto a regular grid. One of the methods for achieving this objective is based on Fourier reconstruction, which deals with an under-determined system of equations. The Stagewise Orthogonal Matching Pursuit (StOMP) is a recently proposed greedy algorithm. Compared to other recent algorithms such as ℓ1 minimization techniques, StOMP offers certain promising features, such as faster and simpler implementation even in large-scale settings. The present work applies StOMP to the Fourier-based interpolation problem for signals that have sparse Fourier spectra. The basic objective is to determine empirically how the algorithm performs when the measurement coordinates are shifted from a uniform distribution on the continuous interval, and how far they can be shifted. Taking kurtosis as a quantifier of the deviation of the distribution from uniformity, we show numerically that the measurement coordinates can be significantly shifted from a uniform distribution.
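The use of kurtosis to quantify how far a set of sampling coordinates departs from uniformity can be sketched as follows. This sketch covers only the kurtosis measure, not StOMP or the Fourier reconstruction. The cubic warp used to crowd coordinates toward the centre is a hypothetical example, not taken from the paper.

```python
import numpy as np

def kurtosis(samples):
    """Pearson kurtosis: fourth central moment divided by the squared variance."""
    x = np.asarray(samples, dtype=float)
    centred = x - x.mean()
    return np.mean(centred ** 4) / np.mean(centred ** 2) ** 2

# Evenly spread measurement coordinates on [-1, 1].
uniform_coords = np.linspace(-1.0, 1.0, 10001)

# Hypothetical non-uniform layout: a cubic warp crowds samples toward 0.
clustered_coords = uniform_coords ** 3

k_uniform = kurtosis(uniform_coords)      # a continuous uniform law has kurtosis 1.8
k_clustered = kurtosis(clustered_coords)  # heavier concentration raises the value
```

The kurtosis of the uniform layout comes out close to 1.8, and the gap between a layout's kurtosis and that value gives a single number for its deviation from uniform sampling.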
ISBN: (Print) 9781424442195
In this work, we propose a simple yet highly effective algorithm for tracking a target through significant scale and orientation change. We divide the target into a number of fragments, and tracking of the whole target is achieved by coordinated tracking of the individual fragments. We use the mean shift algorithm to move the individual fragments to the nearest minima, though any other method, such as integral histograms, could also be used. In contrast to other fragment-based approaches, which fix the relative positions of fragments within the target, we permit the fragments to move freely within certain bounds. Furthermore, we use a constant-velocity Kalman filter for two purposes. Firstly, the Kalman filter achieves robust tracking because it uses a motion model. Secondly, to maintain coherence amongst the fragments, we use a coupled state transition model for the Kalman filter. Using the proposed tracking algorithm, we have experimented on several videos, each several hundred frames long, and obtained excellent results.
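A constant-velocity Kalman filter of the kind mentioned above can be sketched for a single point as follows. This is the standard predict and update recursion only; it does not include the paper's coupled state transition model across fragments. The state layout `[x, y, vx, vy]`, the noise levels `q` and `r`, and the function names are assumptions.

```python
import numpy as np

def make_cv_model(dt=1.0, q=1e-2, r=1.0):
    """Constant-velocity model for the state [x, y, vx, vy] (assumed layout)."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)  # position += velocity * dt
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)  # only position is observed
    Q = q * np.eye(4)                          # process noise (assumed isotropic)
    R = r * np.eye(2)                          # measurement noise (assumed isotropic)
    return F, H, Q, R

def kalman_step(x, P, z, F, H, Q, R):
    """One predict and update cycle; returns the new state and covariance."""
    # Predict with the motion model.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct with the measurement z, e.g. a fragment centre from mean shift.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

In a fragment tracker, the measurement `z` would typically be the position returned by the per-fragment localiser, and the predicted state would seed the next search.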
ISBN: (Print) 9781424442195
This paper proposes a transform domain data-hiding scheme for quality access control of images. The original image is decomposed into tiles by applying an n-level lifting-based Discrete Wavelet Transformation (DWT). A binary watermark image (external information) is spatially dispersed using the sequence of numbers generated by a secret key. The encoded watermark bits are then embedded into all DWT coefficients of the nth level, and only into the high-high (HH) coefficients of the subsequent levels, using dither modulation (DM) but without complete self-noise suppression. It is well known that the insertion of external information degrades the visual quality of the host image (cover). The degree of deterioration depends on the amount of external data inserted as well as on the step size used for DM. If this insertion process is reverted, better-quality images can be accessed. To achieve that goal, the watermark bits are detected using a minimum distance decoder, and the remaining self-noise due to information embedding is suppressed to provide a better-quality image. The simulation results have shown the validity of this claim.
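Dither modulation and minimum distance decoding, as named in the abstract, can be sketched on a plain array of coefficients. This is a basic binary DM scheme with two quantiser lattices offset by half a step. It omits the wavelet transform, the key-based dispersal of the watermark and the self-noise suppression step, and the function names and the choice of offsets are assumptions.

```python
import numpy as np

def dm_embed(coeffs, bits, step):
    """Quantise each coefficient onto the lattice selected by its bit:
    bit 0 uses multiples of `step`, bit 1 the same lattice shifted by step/2."""
    coeffs = np.asarray(coeffs, dtype=float)
    offset = np.where(np.asarray(bits) == 1, step / 2.0, 0.0)
    return step * np.round((coeffs - offset) / step) + offset

def dm_decode(received, step):
    """Minimum distance decoder: pick the bit whose lattice lies closer."""
    received = np.asarray(received, dtype=float)
    zeros = np.zeros(received.shape, dtype=int)
    d0 = np.abs(received - dm_embed(received, zeros, step))
    d1 = np.abs(received - dm_embed(received, zeros + 1, step))
    return (d1 < d0).astype(int)
```

A larger `step` tolerates more noise before bits flip, but it also moves each coefficient further, which matches the abstract's remark that the degradation depends on the step size.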