ISBN (print): 9780819469502
Peaks extraction is a post-processing step in many image applications and vision tasks, used to find the optimum solution in the solution space. In this paper a real-time method is proposed. A candidate queue is first built to hold the highest peaks in the image in ascending order. The image is then scanned in sequence, and at each scanning position every candidate in the queue is updated according to the criteria given in this paper. After the image has been scanned, the highest peaks in the image are held in the queue. The whole process can be implemented with logic circuits, so the method is well suited to hardware systems such as FPGAs.
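As an illustrative, hedged sketch (not the authors' circuit-level design), the candidate-queue update can be mimicked in software by keeping a small sorted list of the K strongest local maxima while raster-scanning the image; the queue size, the 8-neighbour peak test, and the NumPy implementation below are assumptions for illustration.

```python
import numpy as np

def top_k_peaks(image, k=5):
    """Raster-scan `image` and keep the k highest local maxima in an
    ascending candidate queue (a software sketch of the queue update
    described in the abstract; the 8-neighbour peak test is an assumption)."""
    h, w = image.shape
    queue = []  # list of (value, row, col), kept sorted ascending by value
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            v = image[r, c]
            # a candidate must dominate its 8-neighbourhood to count as a peak
            if v < image[r - 1:r + 2, c - 1:c + 2].max():
                continue
            if len(queue) < k:
                queue.append((v, r, c))
                queue.sort(key=lambda t: t[0])
            elif v > queue[0][0]:
                queue[0] = (v, r, c)           # replace the weakest candidate
                queue.sort(key=lambda t: t[0])
    return queue  # ascending; queue[-1] holds the strongest peak found

# Example: the three strongest peaks of a small synthetic image
img = np.random.rand(64, 64)
print(top_k_peaks(img, k=3))
```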
ISBN (print): 9781424439942
In this paper we address the problem of localisation and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity, which relies on the spatiotemporal localization of characteristic, sparse 'visual words' and 'visual verbs'. Evidence for the spatiotemporal localization of the activity is accumulated in a probabilistic spatiotemporal voting scheme. The local nature of our voting framework allows us to recover multiple activities that take place in the same scene, as well as activities in the presence of clutter and occlusions. We construct class-specific codebooks using the descriptors in the training set, where we take the spatial co-occurrences of pairs of codewords into account. The positions of the codeword pairs with respect to the object centre, as well as the frame in the training set in which they occur, are subsequently stored in order to create a spatiotemporal model of codeword co-occurrences. During the testing phase, we use Mean Shift mode estimation in order to spatially segment the subject that performs the activities in every frame, and the Radon transform in order to extract the most probable hypotheses concerning the temporal segmentation of the activities within the continuous stream.
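A minimal sketch of the voting idea follows; the vote-offset format, grid resolution, and weights are illustrative assumptions, not the authors' stored codeword statistics.

```python
import numpy as np

def accumulate_votes(detections, offsets, shape):
    """Accumulate probabilistic votes for the activity centre in a
    spatiotemporal (t, y, x) accumulator.

    detections : list of (t, y, x, codeword_id) local features
    offsets    : dict codeword_id -> list of ((dt, dy, dx), weight)
                 learned displacements to the activity centre (assumed format)
    shape      : (T, H, W) size of the voting space
    """
    votes = np.zeros(shape, dtype=np.float32)
    T, H, W = shape
    for t, y, x, cw in detections:
        for (dt, dy, dx), w in offsets.get(cw, []):
            tc, yc, xc = t + dt, y + dy, x + dx
            if 0 <= tc < T and 0 <= yc < H and 0 <= xc < W:
                votes[tc, yc, xc] += w   # soft evidence for a centre hypothesis
    return votes  # maxima of `votes` are candidate activity localizations
```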
This paper presents a novel approach for human identification at a distance using gait recognition. Recognition of a person from their gait is a biometric of increasing interest. The proposed work introduces a nonlinear machine learning method, kernel Principal Component Analysis (PCA), to extract gait features from silhouettes for individual recognition. The binarized silhouette of a moving object is first represented by four 1-D signals, the basic image features called distance vectors. A Fourier transform is performed to achieve translation invariance for the gait patterns accumulated from silhouette sequences extracted under different circumstances. Kernel PCA is then used to extract higher-order relations among the gait patterns for subsequent recognition. A fusion strategy is finally executed to produce a final decision. The experiments are carried out on the CMU and the USF gait databases, and results are reported for different numbers of training gait cycles.
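A hedged sketch of the feature pipeline is given below; the particular definition of the four distance vectors, the use of the FFT magnitude for translation invariance, and the RBF kernel choice are assumptions made for illustration rather than the paper's exact design.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

def distance_vectors(silhouette):
    """Four 1-D signals from a binary silhouette (top, bottom, left, right
    boundary distances are one plausible reading of 'distance vectors';
    the exact definition here is an assumption)."""
    top    = silhouette.argmax(axis=0).astype(float)                 # first foreground pixel per column
    bottom = silhouette.shape[0] - 1 - silhouette[::-1].argmax(axis=0)
    left   = silhouette.argmax(axis=1).astype(float)                 # first foreground pixel per row
    right  = silhouette.shape[1] - 1 - silhouette[:, ::-1].argmax(axis=1)
    return np.concatenate([top, bottom, left, right])

def gait_feature(silhouette):
    # FFT magnitude discards phase, giving translation invariance of the 1-D signals
    return np.abs(np.fft.rfft(distance_vectors(silhouette)))

# Kernel PCA over a set of gait patterns (RBF kernel is an assumed choice)
patterns = np.stack([gait_feature(np.random.rand(128, 64) > 0.5) for _ in range(20)])
kpca = KernelPCA(n_components=8, kernel="rbf", gamma=1e-3)
embedded = kpca.fit_transform(patterns)   # higher-order gait features for matching
```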
ISBN (print): 9780819469526
Wide baseline stereo correspondence has become a challenging and attractive problem in computer vision and its related applications. Obtaining initial matches with a high correct ratio is a very important step of general wide baseline stereo correspondence algorithms. Ferrari et al. suggested a voting scheme called the topological filter in [3] to discard mismatches from the initial matches, but they did not give a theoretical analysis of their method, and the parameter of their scheme was left uncertain. In this paper, we improve Ferrari's method based on our theoretical analysis and present a novel scheme, called topological clustering, to discard mismatches. The proposed method has been tested on many well-known wide baseline image pairs, and the experimental results show that it can efficiently extract high-correct-ratio matches from low-correct-ratio initial matches.
ISBN (print): 9781424411795
We introduce a theoretical framework and practical algorithms for replacing time-coded structured light patterns with viewpoint codes, in the form of additional camera locations. Current structured light methods typically use log(N) light patterns, encoded over time, to unambiguously reconstruct N unique depths. We demonstrate that each additional camera location may replace one frame in a temporal binary code. Our theoretical viewpoint coding analysis shows that, by using a high-frequency stripe pattern and placing cameras in carefully selected locations, the epipolar projection in each camera can be made to mimic the binary encoding patterns normally projected over time. Results from our practical implementation demonstrate reliable depth reconstruction that makes neither temporal nor spatial continuity assumptions about the scene being captured.
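A small back-of-the-envelope illustration of the pattern/camera trade-off stated above (the specific numbers are illustrative, not taken from the paper):

```python
import math

def patterns_needed(num_depths, extra_cameras=0):
    """Frames of a temporal binary code needed to disambiguate `num_depths`
    depth labels when each extra camera replaces one temporal frame
    (illustrative arithmetic based on the abstract's claim)."""
    return max(0, math.ceil(math.log2(num_depths)) - extra_cameras)

print(patterns_needed(1024))                   # 10 temporal patterns with a single camera
print(patterns_needed(1024, extra_cameras=4))  # 6 patterns once 4 extra cameras are added
```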
ISBN (print): 9781424411795
This paper presents a novel spatio-temporal Markov random field (MRF) for video denoising. Two main issues are addressed, namely the estimation of the noise model and the proper use of motion estimation in the denoising process. Unlike previous algorithms which estimate only the level of noise, our method learns the full noise distribution nonparametrically, and this distribution serves as the likelihood model in the MRF. Instead of using deterministic motion estimation to align pixels, we set up a temporal likelihood by combining a probabilistic motion field with the learned noise model. The prior of this MRF is modeled by piecewise smoothness. The main advantage of the proposed spatio-temporal MRF is that it integrates spatial and temporal information adaptively into a statistical inference framework, where the posterior is optimized using graph cuts with alpha expansion. We demonstrate the performance of the proposed approach on benchmark data sets and real videos to show the advantages of our algorithm compared with previous single-frame and multi-frame algorithms.
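One way to write such a posterior as an energy to be minimized with alpha expansion is sketched below; the exact terms, warping, and weights are a generic spatio-temporal MRF sketch, not the paper's formulation.

```latex
% Generic spatio-temporal MRF energy for denoising (a sketch, not the paper's exact model):
% x   : denoised intensities,  y : observed noisy frame,
% y^- : previous frame warped by a probabilistic motion field,
% N_s : spatial neighbours,  mu, lambda, tau : weights / truncation.
\begin{equation}
E(x) \;=\; \underbrace{\sum_{p} -\log P_{\mathrm{noise}}\!\bigl(y_p - x_p\bigr)}_{\text{learned spatial likelihood}}
\;+\; \underbrace{\mu \sum_{p} -\log P_{\mathrm{temp}}\!\bigl(y^{-}_{p} - x_p\bigr)}_{\text{temporal likelihood}}
\;+\; \underbrace{\lambda \sum_{(p,q)\in N_s} \min\!\bigl(|x_p - x_q|,\, \tau\bigr)}_{\text{piecewise-smooth prior}} .
\end{equation}
```

Minimizing E(x) corresponds to maximizing the posterior; a truncated pairwise penalty keeps the prior piecewise smooth while remaining amenable to alpha-expansion graph cuts.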
ISBN (print): 9781424411795
Linear and affine subspaces are commonly used to describe the appearance of objects under different lighting, viewpoint, articulation, and identity. A natural problem arising from their use is the following: given a query image portion, represented as a point in some high-dimensional space, find a subspace near to the query. This paper presents an efficient solution to the approximate nearest subspace problem for both linear and affine subspaces. Our method is based on a simple reduction to the problem of nearest point search, and can thus employ tree-based search or locality sensitive hashing to find a near subspace. Further speedup may be achieved by using random projections to lower the dimensionality of the problem. We provide theoretical proofs of correctness and error bounds of our construction and demonstrate its capabilities on synthetic and real data. Our experiments demonstrate that an approximate nearest subspace can be located significantly faster than the exact nearest subspace, while at the same time it finds better matches than a similar search on points in the presence of variations due to viewpoint, lighting, etc.
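As a hedged sketch of how a nearest-subspace query can be reduced to nearest-point search: each subspace is lifted to its flattened orthogonal projection matrix and the query to its flattened outer product, which for equal-dimensional subspaces makes the lifted point distance a monotonic function of the point-to-subspace distance. This lifting is a simplification for illustration and not necessarily the paper's exact construction.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lift_subspace(U):
    # U is a d x k orthonormal basis; the lifted point is vec(U U^T)
    return (U @ U.T).ravel()

def lift_query(q):
    return np.outer(q, q).ravel()

d, k, n = 20, 3, 500
rng = np.random.default_rng(0)
bases = [np.linalg.qr(rng.standard_normal((d, k)))[0] for _ in range(n)]
index = NearestNeighbors(n_neighbors=1).fit(np.stack([lift_subspace(U) for U in bases]))

q = rng.standard_normal(d)
_, idx = index.kneighbors(lift_query(q)[None, :])          # nearest-point query in lifted space
U = bases[idx[0, 0]]
dist = np.sqrt(max(q @ q - np.linalg.norm(U.T @ q) ** 2, 0.0))  # distance to the returned subspace
print(idx[0, 0], dist)
```

In this lifted space the squared distance equals a constant plus twice the squared point-to-subspace distance, so tree-based search or locality sensitive hashing over the lifted points returns (approximately) the nearest subspace.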
ISBN (print): 9781424411795
Discriminant Analysis (DA) methods have demonstrated their utility in countless applications in computer vision and other areas of research, especially in the C-class classification problem. The most popular approach is Linear DA (LDA), which provides the (C-1)-dimensional Bayes optimal solution, but only when all the class covariance matrices are identical. This is rarely the case in practice. To alleviate this restriction, Kernel LDA (KLDA) has been proposed. In this approach, we first (intrinsically) map the original nonlinear problem to a linear one and then use LDA to find the (C-1)-dimensional Bayes optimal subspace. However, the use of KLDA is hampered by its computational cost, which grows with the number of available training samples, and by the limitation of LDA to a (C-1)-dimensional solution space. In this paper, we first extend the definition of LDA to provide subspaces of q < C-1 dimensions in which the Bayes error is minimized. Then, to reduce the computational burden of the derived solution, we define a sparse kernel representation, which is able to automatically select the most appropriate sample feature vectors to represent the kernel. We demonstrate the superiority of the proposed approach on several standard datasets. Comparisons are drawn with a large number of known DA algorithms.
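For context, the classical Fisher/LDA criterion is given below in its textbook form; the paper's extended criterion for q < C-1 dimensions is not reproduced here.

```latex
% Classical Fisher/LDA criterion (textbook form, shown for context):
\begin{equation}
V^{*} \;=\; \arg\max_{V}\;
\frac{\bigl|V^{\top} S_{B}\, V\bigr|}{\bigl|V^{\top} S_{W}\, V\bigr|},
\qquad
S_{B}=\sum_{c=1}^{C} n_{c}\,(\mu_{c}-\mu)(\mu_{c}-\mu)^{\top},
\quad
S_{W}=\sum_{c=1}^{C}\sum_{x\in c}(x-\mu_{c})(x-\mu_{c})^{\top}.
\end{equation}
```

Because S_B is a sum of C rank-one matrices tied together by the global mean, its rank is at most C-1, which is why classical LDA cannot yield more than C-1 discriminant directions.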
ISBN (print): 9781424411795
Effective image prior is necessary for image super resolution, due to its severely under-determined nature. Although the edge smoothness prior can be effective, it is generally difficult to have analytical forms to evaluate the edge smoothness, especially for soft edges that exhibit gradual intensity transitions. This paper finds the connection between the soft edge smoothness and a soft cut metric on an image grid by generalizing the Geocuts method [5], and proves that the soft edge smoothness measure approximates the average length of all level lines in an intensity image. This new finding not only leads to an analytical characterization of the soft edge smoothness prior, but also gives an intuitive geometric explanation. Regularizing the super resolution problem by this new form of prior can simultaneously minimize the length of all level lines, and thus resulting in visually appealing results. In addition, this paper presents a novel combination of this soft edge smoothness prior and the alpha matting technique for color image super resolution, by normalizing edge segments with their alpha channel description, to achieve a unified treatment of edges with different contrast and scale.
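The geometric intuition can be seen in the coarea formula, which relates an integral of image gradients to the aggregate length of level lines; this standard identity is given here for context, and the paper's soft cut metric and its discrete approximation are not reproduced.

```latex
% Coarea formula: the total variation of an intensity image I over the domain Omega
% equals the aggregate length of its level lines over all intensity levels lambda.
\begin{equation}
\int_{\Omega} \bigl\lVert \nabla I(x) \bigr\rVert \, dx
\;=\;
\int_{-\infty}^{\infty} \operatorname{length}\bigl(\partial\{x : I(x) > \lambda\}\bigr)\, d\lambda .
\end{equation}
```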
ISBN (print): 9781424411795
Recent results on stereo indicate that an accurate segmentation is crucial for obtaining faithful depth maps. Variational methods have successfully been applied to both image segmentation and computational stereo. In this paper we propose a combination of the two in a unified framework. In particular, we use a Mumford-Shah-like functional to compute a piecewise smooth depth map of a stereo pair. Our approach has two novel features: first, the regularization term of the functional combines edge information obtained from the color segmentation with flow-driven depth discontinuities emerging during the optimization procedure; second, we propose a robust data term which adaptively selects the best matches obtained from different weak stereo algorithms. We integrate these features in a theoretically consistent framework. The final depth map is the minimizer of the energy functional, which is found by solving the associated functional derivatives. The underlying numerical scheme allows an efficient implementation on modern graphics hardware. We illustrate the performance of our algorithm using the Middlebury database as well as real imagery.
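A generic Mumford-Shah-like energy of the kind described above is sketched below; the specific data term, edge weight, and discontinuity penalty are illustrative assumptions, not the paper's exact functional.

```latex
% Sketch of a Mumford-Shah-like energy for a piecewise smooth depth map d:
% rho : robust data term over candidate matches m_k from weak stereo algorithms,
% g   : edge-stopping weight derived from the color segmentation,
% K   : discontinuity set of d,  alpha, nu : weights,  H^1 : one-dimensional Hausdorff measure.
\begin{equation}
E(d, K) \;=\;
\int_{\Omega} \min_{k}\,\rho\bigl(m_{k}(x),\, d(x)\bigr)\, dx
\;+\; \alpha \int_{\Omega \setminus K} g(x)\,\bigl\lVert \nabla d(x) \bigr\rVert^{2}\, dx
\;+\; \nu\, \mathcal{H}^{1}(K).
\end{equation}
```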