ISBN (Print): 9781424442195
In this work, we propose a simple yet highly effective algorithm for tracking a target through significant scale and orientation change. We divide the target into a number of fragments, and tracking of the whole target is achieved by coordinated tracking of the individual fragments. We use the mean shift algorithm to move the individual fragments to the nearest minima, though any other method, such as integral histograms, could also be used. In contrast to other fragment-based approaches, which fix the relative positions of fragments within the target, we permit the fragments to move freely within certain bounds. Furthermore, we use a constant-velocity Kalman filter for two purposes. Firstly, the Kalman filter achieves robust tracking through the use of a motion model. Secondly, to maintain coherence amongst the fragments, we use a coupled state transition model for the Kalman filter. Using the proposed tracking algorithm, we have experimented on several videos, each several hundred frames long, and obtained excellent results.
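The constant-velocity Kalman filter described above can be sketched as follows. This is a minimal single-fragment version: the coupled multi-fragment state transition model is not reproduced, and the process/measurement noise levels are illustrative assumptions.

```python
import numpy as np

def cv_kalman_step(x, P, z, dt=1.0, q=1e-2, r=1.0):
    """One predict/update cycle of a constant-velocity Kalman filter
    for a single fragment centre. State x = [px, py, vx, vy];
    z = [px, py] is the measurement (e.g. a mean-shift result).
    q and r are illustrative noise levels, not the paper's values."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)   # constant-velocity model
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)    # we observe position only
    Q = q * np.eye(4)                            # process noise
    R = r * np.eye(2)                            # measurement noise
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update
    y = z - H @ x                                # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P
```

Fed a fragment moving at constant velocity, the velocity estimate converges after a handful of frames, which is what lets the filter ride out occasional bad mean-shift measurements.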
This paper proposes a transform domain data-hiding scheme for quality access control of images. The original image is decomposed into tiles by applying n-level lifting-based Discrete Wavelet Transformation (DWT). A binary watermark image (external information) is spatially dispersed using a sequence of numbers generated by a secret key. The encoded watermark bits are then embedded into all DWT coefficients of the nth level and only in the high-high (HH) coefficients of the subsequent levels using dither modulation (DM), but without complete self-noise suppression. It is well known that due to insertion of external information, there will be degradation in the visual quality of the host image (cover). The degree of deterioration depends on the amount of external data inserted as well as the step size used for DM. If this insertion process is reverted, images of better quality can be accessed. To achieve that goal, watermark bits are detected using a minimum-distance decoder, and the remaining self-noise due to information embedding is suppressed to provide a better quality image. The simulation results validate this claim.
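The dither-modulation embedding and minimum-distance decoding steps can be sketched for a single coefficient as below. The step size and dither values are illustrative, and the secret-key spatial dispersion of the watermark is omitted.

```python
def dm_embed(c, bit, step=8.0):
    """Quantize coefficient c onto the lattice selected by `bit`
    (dither 0 for bit 0, step/2 for bit 1). Illustrative step size."""
    d = 0.0 if bit == 0 else step / 2.0
    return step * round((c - d) / step) + d

def dm_detect(c, step=8.0):
    """Minimum-distance decoder: pick the bit whose dithered
    quantizer reconstructs closest to the received coefficient."""
    dists = []
    for bit, d in ((0, 0.0), (1, step / 2.0)):
        q = step * round((c - d) / step) + d
        dists.append((abs(c - q), bit))
    return min(dists)[1]
```

Decoding survives any perturbation smaller than step/4, which is why the choice of step size trades robustness against the visual degradation the abstract mentions.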
Bit Plane Coding (BPC) constitutes an important component of the EBCOT Tier-1 block of the JPEG2000 encoder. This paper proposes an efficient parallel hardware structure to implement the computation-intensive word-level bit plane coding algorithm. The proposed architecture computes the context and decision for all bit planes in parallel. The three coding passes are merged for all bit planes in a scan while the samples are coded in sequence. The proposed parallel BPC architecture offers a speedup of 31 over the serial BPC architecture. Its memory requirement is independent of the size of the codeblock. The speed of the proposed architecture has been shown to be significantly faster than an architecture recently reported in the literature. The system architecture has been functionally verified with ModelSim and synthesized with TSMC 0.25 μm vtvt CMOS cell libraries.
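The input to BPC is the coefficient block split into bit planes, MSB first. A software sketch of that decomposition (not the hardware context/decision logic itself) looks like this:

```python
import numpy as np

def bit_planes(block, nbits=8):
    """Split a block of coefficient magnitudes into bit planes,
    most-significant plane first, as EBCOT Tier-1 consumes them.
    `nbits` is an illustrative dynamic range."""
    mags = np.abs(block).astype(np.uint32)
    return [((mags >> b) & 1).astype(np.uint8)
            for b in range(nbits - 1, -1, -1)]
```

The proposed architecture's point is that all of these planes can be context-coded in parallel rather than plane by plane.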
Automatic organization of large, unordered image collections is an extremely challenging problem with many potential applications. Often, what is required is that images taken in the same place, of the same thing, or of the same person be conceptually grouped together. This work focuses on grouping images containing the same object, despite significant changes in scale, viewpoint and partial occlusions, in very large (1M+) image collections automatically gathered from Flickr. The scale of the data and the extreme variation in imaging conditions make the problem very challenging. We describe a scalable method that first computes a matching graph over all the images. Image groups can then be mined from this graph using standard clustering techniques. The novelty we bring is that both the matching graph and the clustering methods are able to use the spatial consistency between the images arising from the common object (if there is one). We demonstrate our methods on a publicly available dataset of 5K images of Oxford, a 37K image dataset containing images of the Statue of Liberty, and a much larger 1M image dataset of Rome. This is, to our knowledge, the largest dataset to which image-based data mining has been applied.
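The mining step can be sketched as connected-component clustering over the matching graph. The edge weights below stand in for spatially verified inlier counts, and the threshold is an illustrative assumption; the paper's clustering methods are more sophisticated than plain connected components.

```python
class UnionFind:
    """Disjoint-set forest with path compression."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # compress path
            a = self.parent[a]
        return a
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def cluster_matching_graph(n_images, edges, min_inliers=20):
    """Group images into connected components of the matching graph,
    keeping only edges whose inlier count (spatial-consistency score)
    passes a threshold. `min_inliers` is an illustrative value."""
    uf = UnionFind(n_images)
    for i, j, inliers in edges:
        if inliers >= min_inliers:
            uf.union(i, j)
    groups = {}
    for i in range(n_images):
        groups.setdefault(uf.find(i), []).append(i)
    return sorted(groups.values())
```

Thresholding on verified inliers is what keeps visually similar but geometrically inconsistent pairs from merging unrelated groups.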
This paper addresses a new face verification scheme based on Log-Gabor filters (texture-based) and a Gaussian Mixture Model. The proposed method consists of three parts. The first part is Log-Gabor filtering on facial imag...
As vision algorithms mature with increasing inspiration from the learning community, statistically independent pseudo-random number generation (PRNG) becomes increasingly important. At the same time, execution time demands have seen algorithms being implemented on evolving parallel hardware such as GPUs. The Mersenne Twister (MT) [7] has proven to be the current state of the art for generating high-quality random numbers, and the Nvidia-provided software for parallel MT is in widespread use. While execution time is important, development time is also critical. As processor cardinality changes, a foundation for generating simulations that vary only in execution time and not in the actual result is useful; otherwise development time is impacted. In this paper we present a GPU implementation of the Lagged Fibonacci Generator (LFG), considered to be of quality equal to MT [7]. Unlike MT, LFG has this important processor-cardinality-agnostic capability: as the number of processing resources changes, the overall sequence of random numbers remains the same. This feature notwithstanding, our basic implementation is roughly as fast as the parallel MT; an in-memory version is actually 25% faster in execution time. Both parallel MT and parallel LFG show enormous speedups over their sequential counterparts. Finally, a prototype particle filter tracking application shows that our method works not just in parallel computing theory but also in practice for vision applications, providing a decrease of 60% in execution time.
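A sequential additive LFG is easy to sketch. The lags (24, 55) and the LCG used to seed the state below are common illustrative choices, not necessarily the paper's parameters, and the GPU parallelization is not reproduced.

```python
class LFG:
    """Additive lagged Fibonacci generator:
    x[n] = (x[n-24] + x[n-55]) mod 2^32, over a circular buffer.
    Lags and seeding LCG are illustrative assumptions."""
    def __init__(self, seed=12345):
        self.state = []
        s = seed
        for _ in range(55):                      # fill state with a simple LCG
            s = (1103515245 * s + 12345) & 0xFFFFFFFF
            self.state.append(s)
        self.i = 0                               # points at x[n-55]
    def next(self):
        j = self.i
        # state[j] currently holds x[n-55]; (j-24) % 55 holds x[n-24]
        new = (self.state[(j - 24) % 55] + self.state[j]) & 0xFFFFFFFF
        self.state[j] = new                      # overwrite oldest value
        self.i = (j + 1) % 55
        return new
```

Because the output depends only on the seeded state, the sequence is identical however the draws are later partitioned across processors, which is the determinism property the abstract emphasizes.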
This paper proposes a parallel architecture for a successive elimination algorithm (SEA), which is used in block matching motion estimation. SEA effectively eliminates search points within the search window and thus decreases the number of matching evaluations, which require very intensive computation, compared to the standard full search algorithm (FSA). The proposed architecture for SEA decreases the time to calculate the motion vector by 57 percent compared to FSA. The performance of SEA on several standard video clips has been shown to be the same as that of the standard FSA. The proposed architecture uses 16 processing elements together with intelligent data arrangement and memory configuration. A technique for reducing external memory accesses has also been developed. The proposed architecture for SEA provides an efficient solution for applications requiring real-time motion estimation, since it computes motion vectors in less time while requiring almost the same power and only a small increase in area compared to a similar architecture implementing the full search algorithm. A register-transfer level implementation as well as simulation results on benchmark video clips are presented. Relevant design statistics on area and power for comparison between the SEA and FSA implementations are also provided.
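The elimination test SEA relies on can be sketched in software (this is the algorithmic idea only, not the parallel hardware design): since |sum(block) − sum(candidate)| is a lower bound on the SAD, any candidate whose bound already meets the best SAD found so far can be skipped without computing the full SAD.

```python
import numpy as np

def sea_motion_search(block, frame, cx, cy, radius=4):
    """Block-matching search with successive elimination.
    (cx, cy) is the block's position in `frame`; `radius` bounds
    the search window. Returns the best (dx, dy) and its SAD."""
    h, w = block.shape
    bsum = int(block.sum())
    best_sad, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = cy + dy, cx + dx
            if y < 0 or x < 0 or y + h > frame.shape[0] or x + w > frame.shape[1]:
                continue
            cand = frame[y:y + h, x:x + w]
            bound = abs(bsum - int(cand.sum()))   # lower bound on SAD
            if best_sad is not None and bound >= best_sad:
                continue                          # eliminated: skip full SAD
            sad = int(np.abs(block.astype(int) - cand.astype(int)).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad
```

The block sums can be maintained incrementally across candidate positions, which is why the bound is so much cheaper than the SAD it replaces.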
We propose an evolving scheme to detect slow as well as fast moving objects in a video sequence. The proposed scheme employs both spatio-temporal and temporal segmentation to obtain the Video Object Plane and hence detection. We propose a Compound Markov Random Field model as the a priori image model that takes into account the spatial distribution of the current frame, the temporal frames, and the edge maps of the temporal frames. The spatio-temporal segmentation is cast as a pixel labeling problem, and the labels are the MAP estimates. The MAP estimates of a frame are obtained by a hybrid algorithm. The spatial segmentation of a given frame evolves to generate the spatial segmentation of the subsequent frames. The evolved spatial segmentation together with the temporal segmentation produces the Video Object Plane (VOP) and hence detection. Our scheme requires the computation of spatio-temporal segmentation only for the initial frame, thus speeding up the whole process. The results of the proposed scheme are compared with the JSEG method and are found to be better in terms of misclassification error.
Diabetic retinopathy is one of the major causes of blindness. However, diabetic retinopathy does not usually cause a loss of sight until it has reached an advanced stage. The earliest signs of the disease are microaneurysms (MAs), which appear as small red dots on retinal fundus images. Various screening programmes have been established in the UK and other countries to collect and assess images on a regular basis, especially in the diabetic population. A considerable amount of time and money is spent in manually grading these images, a large percentage of which are normal. By automatically identifying the normal images, the manual workload and costs could be reduced greatly while increasing the effectiveness of the screening programmes. A novel method of microaneurysm detection from digital retinal screening images is proposed. It is based on filtering using complex-valued circular-symmetric filters and an eigen-image, with morphological analysis of the candidate regions to reduce the false-positive rate. We detail the image processing algorithms and present results on a typical set of 89 images from a published database. Our method is shown to have a best operating sensitivity of 82.6% at a specificity of 80.2%, which makes it viable for screening. We discuss the results in the context of a model of visual search and the ROC curves that it can predict.
This paper presents a switched predictive coding method for lossless compression of video. In the proposed method, a set of switched predictors is found by a training process that uses only a small number of successive frames of a video, and the trained predictors are then used on a large number of frames of the video. To find the predictors, the pixels of the successive frames are first classified based on an estimate of the activity level in their neighbouring pixels, and then LS-based feedback-type predictors are estimated for all the pixels belonging to each class. We propose a total of 21 classes, obtained by combining the seven slope bins of the Gradient Adjusted Predictor (GAP) and three classified temporal contexts. After collecting the predictors for the pixels belonging to each of the 21 classes, the best predictor, in terms of minimum zero-order entropy, is chosen to represent each class. Simulation results show that the application of this set of predictors yields performance competitive with LOPT, one of the best methods in terms of achievable compression ratio. Our method and LOPT have the same order of coding complexity, while our decoder is computationally very simple as against the high complexity of the LOPT-based decoder.
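The predictor-selection criterion (minimum zero-order entropy of the residuals) can be sketched as below. The toy one-tap predictors are stand-ins for the paper's LS-trained, GAP-classified predictors.

```python
import math
from collections import Counter

def zero_order_entropy(residuals):
    """Empirical zero-order entropy (bits/symbol) of a residual sequence."""
    counts = Counter(residuals)
    n = len(residuals)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def pick_best_predictor(pixels, predictors):
    """Choose the predictor whose prediction residuals have minimum
    zero-order entropy, as the per-class training stage does.
    `predictors` maps a name to a function of the previous pixel
    (an illustrative causal context, not the real GAP context)."""
    best = None
    for name, pred in predictors.items():
        res = [pixels[i] - pred(pixels[i - 1]) for i in range(1, len(pixels))]
        h = zero_order_entropy(res)
        if best is None or h < best[1]:
            best = (name, h)
    return best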