Curse of dimensionality is a practical and challenging problem in image categorization, especially in cases with a large number of classes. Multi-class classification encounters severe computational and storage proble...
详细信息
ISBN:
(纸本)9781479951178
Curse of dimensionality is a practical and challenging problem in image categorization, especially in cases with a large number of classes. Multi-class classification encounters severe computational and storage problems when dealing withthese large scale tasks. In this paper, we propose hierarchical feature hashing to effectively reduce dimensionality of parameter space without sacrificing classification accuracy, and at the same time exploit information in semantic taxonomy among categories. We provide detailed theoretical analysis on our proposed hashing method. Moreover, experimental results on object recognition and scene classification further demonstrate the effectiveness of hierarchical feature hashing.
We propose binary range-sample feature in depth. It is based on t tests and achieves reasonable invariance with respect to possible change in scale, viewpoint, and background. It is robust to occlusion and data corrup...
详细信息
ISBN:
(纸本)9781479951178
We propose binary range-sample feature in depth. It is based on t tests and achieves reasonable invariance with respect to possible change in scale, viewpoint, and background. It is robust to occlusion and data corruption as well. the descriptor works in a high speed thanks to its binary property. Working together with standard learning algorithms, the proposed descriptor achieves state-of-the-art results on benchmark datasets in our experiments. Impressively short running time is also yielded.
this paper introduces a novel image representation capturing feature dependencies through the mining of meaningful combinations of visual features. this representation leads to a compact and discriminative encoding of...
详细信息
ISBN:
(纸本)9781479951178
this paper introduces a novel image representation capturing feature dependencies through the mining of meaningful combinations of visual features. this representation leads to a compact and discriminative encoding of images that can be used for image classification, object detection or object recognition. the method relies on (i) multiple random projections of the input space followed by local binarization of projected histograms encoded as sets of items, and (ii) the representation of images as Histograms of pattern Sets (HoPS). the approach is validated on four publicly available datasets (Daimler Pedestrian, Oxford Flowers, Kth Texture and PASCAL VOC2007), allowing comparisons with many recent approaches. the proposed image representation reaches state-of-the-art performance on each one of these datasets.
Humans are capable of perceiving a scene at a glance, and obtain deeper understanding with additional time. Similarly, visual recognition deployments should be robust to varying computational budgets. Such situations ...
详细信息
ISBN:
(纸本)9781479951178
Humans are capable of perceiving a scene at a glance, and obtain deeper understanding with additional time. Similarly, visual recognition deployments should be robust to varying computational budgets. Such situations require Anytime recognition ability, which is rarely considered in computervision research. We present a method for learning dynamic policies to optimize Anytime performance in visual architectures. Our model sequentially orders feature computation and performs subsequent classification. Crucially, decisions are made at test time and depend on observed data and intermediate results. We show the applicability of this system to standard problems in scene and object recognition. On suitable datasets, we can incorporate a semantic back-off strategy that gives maximally specific predictions for a desired level of accuracy;this provides a new view on the time course of human visual perception.
We introduce a new approach for recognizing and reconstructing 3D objects in images. Our approach is based on an analysis by synthesis strategy. A forward synthesis model constructs possible geometric interpretations ...
详细信息
ISBN:
(纸本)9781479951178
We introduce a new approach for recognizing and reconstructing 3D objects in images. Our approach is based on an analysis by synthesis strategy. A forward synthesis model constructs possible geometric interpretations of the world, and then selects the interpretation that best agrees withthe measured visual evidence. the forward model synthesizes visual templates defined on invariant (HOG) features. these visual templates are discriminatively trained to be accurate for inverse estimation. We introduce an efficient "brute-force" approach to inference that searches through a large number of candidate reconstructions, returning the optimal one. One benefit of such an approach is that recognition is inherently (re) constructive. We show state of the art performance for detection and reconstruction on two challenging 3D object recognition datasets of cars and cuboids.
Facial feature detection from facial images has attracted great attention in the field of computervision. It is a non-trivial task since the appearance and shape of the face tend to change under different conditions....
详细信息
ISBN:
(纸本)9781479951178
Facial feature detection from facial images has attracted great attention in the field of computervision. It is a non-trivial task since the appearance and shape of the face tend to change under different conditions. In this paper, we propose a hierarchical probabilistic model that could infer the true locations of facial features given the image measurements even if the face is with significant facial expression and pose. the hierarchical model implicitly captures the lower level shape variations of facial components using the mixture model. Furthermore, in the higher level, it also learns the joint relationship among facial components, the facial expression, and the pose information through automatic structure learning and parameter estimation of the probabilistic model. Experimental results on benchmark databases demonstrate the effectiveness of the proposed hierarchical probabilistic model.
We present an approach that takes a single photograph of a child as input and automatically produces a series of age-progressed outputs between 1 and 80 years of age, accounting for pose, expression, and illumination....
详细信息
ISBN:
(纸本)9781479951178
We present an approach that takes a single photograph of a child as input and automatically produces a series of age-progressed outputs between 1 and 80 years of age, accounting for pose, expression, and illumination. Leveraging thousands of photos of children and adults at many ages from the Internet, we first show how to compute average image subspaces that are pixel-to-pixel aligned and model variable lighting. these averages depict a prototype man and woman aging from 0 to 80, under any desired illumination, and capture the differences in shape and texture between ages. Applying these differences to a new photo yields an age progressed result. Contributions include relightable age subspaces, a novel technique for subspace-to-subspace alignment, and the most extensive evaluation of age progression techniques in the literature(1).
We present a system that demonstrates how the compositional structure of events, in concert withthe compositional structure of language, can interplay withthe underlying focusing mechanisms in video action recogniti...
详细信息
ISBN:
(纸本)9781479951178
We present a system that demonstrates how the compositional structure of events, in concert withthe compositional structure of language, can interplay withthe underlying focusing mechanisms in video action recognition, providing a medium for top-down and bottom-up integration as well as multi-modal integration between vision and language. We show how the roles played by participants (nouns), their characteristics (adjectives), the actions performed (verbs), the manner of such actions (adverbs), and changing spatial relations between participants (prepositions), in the form of whole-sentence descriptions mediated by a grammar, guides the activity-recognition process. Further, the utility and expressiveness of our framework is demonstrated by performing three separate tasks in the domain of multi-activity video: sentence-guided focus of attention, generation of sentential description, and query-based search, simply by leveraging the framework in different manners.
We present the first automatic method to remove shadows from single RGB-D images. Using normal cues directly derived from depth, we can remove hard and soft shadows while preserving surface texture and shading. Our ke...
详细信息
ISBN:
(纸本)9781479951178
We present the first automatic method to remove shadows from single RGB-D images. Using normal cues directly derived from depth, we can remove hard and soft shadows while preserving surface texture and shading. Our key assumption is: pixels with similar normals, spatial locations and chromaticity should have similar colors. A modified nonlocal matching is used to compute a shadow confidence map that localizes well hard shadow boundary, thus handling hard and soft shadows within the same framework. We compare our results produced using state-of-the-art shadow removal on single RGB images, and intrinsic image decomposition on standard RGB-D datasets.
Algorithms for solving systems of polynomial equations are key components for solving geometry problems in computervision. Fast and stable polynomial solvers are essential for numerous applications e.g. minimal probl...
详细信息
ISBN:
(纸本)9781479951178
Algorithms for solving systems of polynomial equations are key components for solving geometry problems in computervision. Fast and stable polynomial solvers are essential for numerous applications e.g. minimal problems or finding for all stationary points of certain algebraic errors. Recently, full symmetry in the polynomial systems has been utilized to simplify and speed up state-of-the-art polynomial solvers based on Grobner basis method. In this paper, we further explore partial symmetry (i.e. where the symmetry lies in a subset of the variables) in the polynomial systems. We develop novel numerical schemes to utilize such partial symmetry. We then demonstrate the advantage of our schemes in several computervision problems. In both synthetic and real experiments, we show that utilizing partial symmetry allow us to obtain faster and more accurate polynomial solvers than the general solvers(1).
暂无评论