Most existing pose-robust methods are too computationally complex for practical applications, and their performance in unconstrained environments is rarely evaluated. In this paper, we propose a novel method for pose-robust face recognition towards practical applications, which is fast, pose robust, and works well in unconstrained environments. Firstly, a 3D deformable model is built and a fast 3D model fitting algorithm is proposed to estimate the pose of the face image. Secondly, a group of Gabor filters is transformed according to the pose and shape of the face image for feature extraction. Finally, PCA is applied to the pose-adaptive Gabor features to remove redundancies, and the cosine metric is used to evaluate similarity. The proposed method has three advantages: (1) the pose correction is applied in the filter space rather than the image space, which makes our method less affected by the precision of the 3D model; (2) by combining the holistic pose transformation and local Gabor filtering, the final feature is robust to pose and other negative factors in face recognition; (3) the 3D structure and facial symmetry are successfully used to deal with self-occlusion. Extensive experiments on FERET and PIE show that the proposed method significantly outperforms state-of-the-art methods, and the method also works well on LFW.
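The pipeline described above (pose-adapted Gabor filtering, PCA for redundancy removal, cosine similarity for matching) can be illustrated with a minimal sketch. This is not the authors' code: the function names are my own, and a simple 2D orientation shift stands in for the paper's 3D pose-driven filter transformation; it only shows the filter-space idea that the Gabor kernels, rather than the image, are adapted before feature extraction.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size=21, wavelength=8.0, theta=0.0, sigma=4.0, gamma=0.5):
    """Real-valued Gabor kernel; `theta` can encode a pose-adapted orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

def pose_adaptive_gabor_features(img, yaw=0.0, n_orient=8):
    """Filter the image with a Gabor bank whose orientations are shifted by the
    estimated pose (crudely approximated here by adding a yaw angle)."""
    feats = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient + yaw          # pose correction in filter space
        resp = fftconvolve(img, gabor_kernel(theta=theta), mode="same")
        feats.append(np.abs(resp).ravel())
    return np.concatenate(feats)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy usage: PCA on a small random "gallery", then cosine matching of two entries.
rng = np.random.default_rng(0)
gallery = np.stack([pose_adaptive_gabor_features(rng.random((64, 64)), yaw=0.1 * i)
                    for i in range(10)])
mean = gallery.mean(0)
_, _, vt = np.linalg.svd(gallery - mean, full_matrices=False)
project = lambda f: (f - mean) @ vt[:5].T           # keep 5 principal components
print(cosine_similarity(project(gallery[0]), project(gallery[1])))
```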
In this work, we return to the underlying mathematical definition of a manifold and directly characterise learning a manifold as finding an atlas, or a set of overlapping charts, that accurately describe local structure. We formulate the problem of learning the manifold as an optimisation that simultaneously refines the continuous parameters defining the charts, and the discrete assignment of points to charts. In contrast to existing methods, this direct formulation of a manifold does not require "unwrapping" the manifold into a lower dimensional space and allows us to learn closed manifolds of interest to vision, such as those corresponding to gait cycles or camera pose. We report state-of-the-art results for manifold based nearest neighbour classification on vision datasets, and show how the same techniques can be applied to the 3D reconstruction of human motion from a single image.
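As a loose illustration of learning an atlas by jointly refining charts and the discrete assignment of points to charts, the sketch below alternates between fitting a local linear chart (local PCA) to each group of points and reassigning points to the chart that reconstructs them best. This is a simplification under my own assumptions (hard assignments, linear charts, no overlap constraint); the paper's actual optimisation differs.

```python
import numpy as np

def fit_chart(points, dim=1):
    """Fit a local linear chart (mean + principal directions) to a set of points."""
    mean = points.mean(0)
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[:dim]

def reconstruction_error(x, chart):
    mean, basis = chart
    proj = mean + (x - mean) @ basis.T @ basis
    return np.sum((x - proj) ** 2)

def learn_atlas(X, n_charts=4, dim=1, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    assign = rng.integers(0, n_charts, size=len(X))          # discrete assignment
    for _ in range(n_iters):
        charts = [fit_chart(X[assign == c], dim) if np.any(assign == c)
                  else fit_chart(X[rng.choice(len(X), 3)], dim)
                  for c in range(n_charts)]                   # continuous refinement
        assign = np.array([np.argmin([reconstruction_error(x, ch) for ch in charts])
                           for x in X])                       # reassign points to charts
    return charts, assign

# Toy usage: a closed 1D manifold (a circle) covered by several charts without unwrapping.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.c_[np.cos(t), np.sin(t)]
charts, assign = learn_atlas(circle, n_charts=4)
print(np.bincount(assign, minlength=4))
```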
This paper proposes the motionlet, a mid-level spatiotemporal part, for human motion recognition. A motionlet can be seen as a tight cluster in motion and appearance space, corresponding to the moving process of different body parts. We postulate three key properties of motionlets for action recognition: high motion saliency, multiple-scale representation, and representative-discriminative ability. Towards this goal, we develop a data-driven approach to learn motionlets from training videos. First, we extract 3D regions with high motion saliency. Then we cluster these regions and preserve the centers as candidate templates for motionlets. Finally, we examine the representative and discriminative power of the candidates and introduce a greedy method to select effective candidates. With motionlets, we present a mid-level representation for video, called the motionlet activation vector. We conduct experiments on three datasets: KTH, HMDB51, and UCF50. The results show that the proposed methods significantly outperform state-of-the-art methods.
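A toy sketch of the stages named above (clustering salient regions into candidate templates, selecting a subset, and building a motionlet activation vector per video) is given below. All names and scores are my own assumptions: region descriptors are random placeholders, and a simple activation-variance score stands in for the paper's greedy representative-discriminative selection.

```python
import numpy as np

def activation_vector(video_regions, motionlets):
    """Motionlet activation vector: best similarity of any region to each motionlet."""
    sims = -((video_regions[:, None] - motionlets[None]) ** 2).sum(-1)
    return sims.max(axis=0)

def cluster_candidates(region_descs, n_candidates=20, n_iters=10, seed=0):
    """Toy k-means over region descriptors; the centres act as candidate motionlets."""
    rng = np.random.default_rng(seed)
    centres = region_descs[rng.choice(len(region_descs), n_candidates, replace=False)].copy()
    for _ in range(n_iters):
        assign = np.argmin(((region_descs[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        for c in range(n_candidates):
            if np.any(assign == c):
                centres[c] = region_descs[assign == c].mean(0)
    return centres

def select_motionlets(candidates, videos, k=5):
    """Keep the k candidates whose activations vary most across training videos
    (a stand-in scoring for the paper's representative-discriminative criterion)."""
    acts = np.stack([activation_vector(v, candidates) for v in videos])
    order = np.argsort(-acts.var(axis=0))
    return candidates[order[:k]]

# Toy usage with random descriptors standing in for motion/appearance features.
rng = np.random.default_rng(1)
all_regions = rng.random((500, 16))
videos = [rng.random((30, 16)) for _ in range(8)]
motionlets = select_motionlets(cluster_candidates(all_regions), videos, k=5)
print(activation_vector(videos[0], motionlets))   # 5-dimensional video representation
```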
Collective motions are common in crowd systems and have attracted a great deal of attention in a variety of multidisciplinary fields. Collectiveness, which indicates the degree to which individuals act as a union in collective motion, is a fundamental and universal measurement for various crowd systems. By integrating path similarities among crowds on a collective manifold, this paper proposes a descriptor of collectiveness and an efficient computation for the crowd and its constituent individuals. The Collective Merging algorithm is then proposed to detect collective motions among random motions. We validate the effectiveness and robustness of the proposed collectiveness descriptor on the system of self-driven particles. We then compare the collectiveness descriptor to human perception of collective motion and show high consistency. Our experiments on detecting collective motions and measuring collectiveness in videos of pedestrian crowds and bacterial colonies demonstrate a wide range of applications of the collectiveness descriptor.
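The idea of integrating path similarities can be sketched numerically as below. This is a hedged illustration under my own assumptions (a K-NN graph weighted by velocity correlation and a convergent series over all path lengths), not the paper's exact construction.

```python
import numpy as np

def collectiveness(positions, velocities, k=5, z=0.3):
    """Toy collectiveness estimate: build a K-NN graph weighted by velocity
    correlation, accumulate similarities over all paths with the convergent
    series sum_{l>=1} z^l W^l = (I - zW)^{-1} - I, and average per individual."""
    n = len(positions)
    d = np.linalg.norm(positions[:, None] - positions[None], axis=-1)
    vnorm = velocities / (np.linalg.norm(velocities, axis=1, keepdims=True) + 1e-12)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d[i])[1:k + 1]                        # k nearest neighbours
        W[i, nbrs] = np.clip(vnorm[i] @ vnorm[nbrs].T, 0, None)  # velocity correlation
    rho = max(abs(np.linalg.eigvals(W)).max(), 1e-12)
    z = min(z, 0.9 / rho)                                        # keep the series convergent
    Z = np.linalg.inv(np.eye(n) - z * W) - np.eye(n)             # path-similarity matrix
    phi = Z.mean(axis=1)                                         # individual collectiveness
    return phi, phi.mean()                                       # and crowd-level score

# Toy usage: coherent motion scores higher than random motion.
rng = np.random.default_rng(0)
pos = rng.random((50, 2))
coherent = np.tile([1.0, 0.0], (50, 1)) + 0.05 * rng.standard_normal((50, 2))
random_v = rng.standard_normal((50, 2))
print(collectiveness(pos, coherent)[1], collectiveness(pos, random_v)[1])
```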
Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty increases when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide a useful mutual relationship for visibility estimation: the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Experimental results show that the mutual visibility deep model effectively improves pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train, Caltech-Test, and ETH datasets. Including mutual visibility leads to 4%-8% improvements on multiple benchmark datasets.
Recent work in monocular pedestrian detection tries to improve the execution time while keeping the accuracy as high as possible. A popular and successful approach for monocular intensity pedestrian detection is based on the approximation (instead of computation) of image features at multiple scales from the features computed on a set of predefined scales. We port this idea to the infrared domain. Our contributions reside in the combination of four channel features, namely infrared, histogram of gradient orientations, normalized gradient magnitude, and local binary patterns, with the objective of detecting pedestrians for night-vision applications dealing with far-infrared images. Multi-scale feature computation is done by feature approximation. Another contribution is the study of different formulations of local binary patterns, such as uniform patterns and rotation-invariant patterns, and their effect on detection performance. The detection speed is also boosted by a fast morphology-based region-of-interest selection. We vary the number of approximated scales per octave and study the impact on execution time and accuracy. A reasonable result hits a speed of 18 fps with a log-average miss rate of 39%.
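The feature-approximation idea (compute channel features only at a few real scales and extrapolate nearby scales with a power law, as popularized by fast feature pyramids) can be sketched as follows. The exponent value, the single gradient-magnitude channel, and the nearest-neighbour resize are illustrative assumptions, not values from the paper.

```python
import numpy as np

def resize(channel, scale):
    """Nearest-neighbour resize, sufficient for a toy illustration."""
    h, w = channel.shape
    ys = (np.arange(int(h * scale)) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(int(w * scale)) / scale).astype(int).clip(0, w - 1)
    return channel[np.ix_(ys, xs)]

def gradient_magnitude(img):
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def approx_pyramid(img, real_scales=(1.0, 0.5), n_per_octave=4, lam=0.11):
    """Compute a channel (here gradient magnitude) only at a few 'real' scales and
    approximate the intermediate scales with the power law f(s) ~ f(s0) * (s/s0)**(-lam)."""
    pyramid = {}
    for s0 in real_scales:
        base = gradient_magnitude(resize(img, s0))
        for i in range(n_per_octave):
            s = s0 * 2 ** (-i / n_per_octave)
            pyramid[round(s, 3)] = resize(base, s / s0) * (s / s0) ** (-lam)
    return pyramid

# Toy usage on a random far-infrared-like intensity image.
img = np.random.default_rng(0).random((128, 96))
pyr = approx_pyramid(img)
print(sorted(pyr.keys()))
```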
We address the problem of recovering camera motion from video data directly from normal flows, without establishing feature correspondences or computing optical flows. We have designed an imaging system with a wide field of view by fixing a number of cameras together to form an approximate spherical eye. With a substantially widened visual field, we discover that estimating the directions of the translation and rotation components of the motion separately is possible and particularly efficient. In addition, the inherent ambiguities between translation and rotation also disappear. The magnitude of rotation is recovered subsequently. Experimental results on synthetic and real image data are provided. The results show that the accuracy of motion estimation is comparable to that of state-of-the-art methods that require explicit feature correspondences or optical flows, while requiring less computation time.
In this paper, we propose a new approach for matching images observed in different camera views with complex cross-view transforms and apply it to person re-identification. It jointly partitions the image spaces of two camera views into different configurations according to the similarity of cross-view transforms. The visual features of an image pair from different views are first locally aligned by being projected to a common feature space and then matched with softly assigned metrics which are locally optimized. The features optimal for recognizing identities are different from those for clustering cross-view transforms. They are jointly learned by utilizing a sparsity-inducing norm and information-theoretic regularization. This approach can be generalized to settings where test images come from new camera views not present in the training set. Extensive experiments are conducted on public datasets and our own dataset. Comparisons with state-of-the-art metric learning and person re-identification methods show the superior performance of our approach.
In this paper we present a novel non-rigid optical flow algorithm for dense image correspondence and non-rigid registration. The algorithm uses a unique Laplacian Mesh Energy term to encourage local smoothness whilst simultaneously preserving non-rigid deformation. Laplacian deformation approaches have become popular in graphics research as they enable mesh deformations to preserve local surface shape. In this work we propose a novel Laplacian Mesh Energy formula to ensure such sensible local deformations between image pairs. We express this wholly within the optical flow optimization, and show its application in a novel coarse-to-fine pyramidal approach. Our algorithm achieves state-of-the-art performance in all trials on the Garg et al. dataset, and top-tier performance on the Middlebury evaluation.
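Laplacian mesh deformation penalizes changes to each vertex's differential (Laplacian) coordinates, which is why it preserves local shape while allowing global non-rigid motion. The sketch below shows such an energy on a regular image grid, with my own choice of a uniform-weight graph Laplacian; the paper's exact energy, weights, and coupling to the flow optimization may differ.

```python
import numpy as np

def grid_laplacian(h, w):
    """Uniform-weight graph Laplacian of a regular h x w grid mesh (4-connectivity)."""
    n = h * w
    L = np.zeros((n, n))
    idx = lambda y, x: y * w + x
    for y in range(h):
        for x in range(w):
            i = idx(y, x)
            nbrs = [idx(y + dy, x + dx) for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= y + dy < h and 0 <= x + dx < w]
            L[i, i] = len(nbrs)
            for j in nbrs:
                L[i, j] = -1.0
    return L

def laplacian_mesh_energy(rest_vertices, deformed_vertices, L):
    """Energy ||L v' - L v||^2: penalise changes of local differential coordinates,
    so local shape is preserved while global non-rigid deformation stays cheap."""
    return float(np.sum((L @ deformed_vertices - L @ rest_vertices) ** 2))

# Toy usage: a global translation costs nothing, a random warp does.
h, w = 6, 5
L = grid_laplacian(h, w)
ys, xs = np.mgrid[0:h, 0:w]
rest = np.c_[xs.ravel(), ys.ravel()].astype(float)
print(laplacian_mesh_energy(rest, rest + [2.0, -1.0], L))                       # zero
print(laplacian_mesh_energy(rest, rest + np.random.default_rng(0).normal(0, 0.2, rest.shape), L))
```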
A fundamental limitation of quantization techniques like the k-means clustering algorithm is the storage and run-time cost associated with the large numbers of clusters required to keep quantization errors small and model fidelity high. We develop new models with a compositional parameterization of cluster centers, so representational capacity increases super-linearly in the number of parameters. This allows one to effectively quantize data using billions or trillions of centers. We formulate two such models, Orthogonal k-means and Cartesian k-means. They are closely related to one another, to k-means, to methods for binary hash function optimization like ITQ [5], and to Product Quantization for vector quantization [7]. The models are tested on large-scale ANN retrieval tasks (1M GIST, 1B SIFT features), and on codebook learning for object recognition (CIFAR-10).
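The compositional-center idea can be made concrete with a Product-Quantization-style sketch: learn one small codebook per subvector, so a full center is a concatenation of sub-centers and k**m effective centers are represented with only m*k sub-centers. This is an illustrative simplification under my own assumptions (axis-aligned subspaces, no learned rotation), not the Cartesian k-means model itself.

```python
import numpy as np

def kmeans(X, k, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(n_iters):
        assign = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(assign == c):
                C[c] = X[assign == c].mean(0)
    return C

def fit_compositional_quantizer(X, n_subspaces=4, k=16):
    """One small codebook per subvector: k**n_subspaces effective centres
    from only n_subspaces * k learned sub-centres."""
    subdim = X.shape[1] // n_subspaces
    return [kmeans(X[:, s * subdim:(s + 1) * subdim], k, seed=s) for s in range(n_subspaces)]

def encode(x, codebooks):
    """Code = one sub-centre index per subspace (a compact compositional code)."""
    subdim = len(x) // len(codebooks)
    return [int(np.argmin(((x[s * subdim:(s + 1) * subdim] - C) ** 2).sum(1)))
            for s, C in enumerate(codebooks)]

def decode(code, codebooks):
    return np.concatenate([C[c] for c, C in zip(code, codebooks)])

# Toy usage: 4 codebooks of 16 centres represent 16**4 = 65536 effective centres.
X = np.random.default_rng(0).random((2000, 32))
books = fit_compositional_quantizer(X)
code = encode(X[0], books)
print(code, np.linalg.norm(X[0] - decode(code, books)))
```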