While domain adaptation (DA) aims to associate the learning tasks across data domains, heterogeneous domain adaptation (HDA) particularly deals with learning from cross-domain data which are of different types of feat...
详细信息
ISBN:
(纸本)9781467388528
While domain adaptation (DA) aims to associate the learning tasks across data domains, heterogeneous domain adaptation (HDA) particularly deals with learning from cross-domain data which are of different types of features. In other words, for HDA, data from source and target domains are observed in separate feature spaces and thus exhibit distinct distributions. In this paper, we propose a novel learning algorithm of Cross-Domain Landmark Selection (CDLS) for solving the above task. Withthe goal of deriving a domain-invariant feature subspace for HDA, our CDLS is able to identify representative cross-domain data, including the unlabeled ones in the target domain, for performing adaptation. In addition, the adaptation capabilities of such cross-domain landmarks can be determined accordingly. this is the reason why our CDLS is able to achieve promising HDA performance when comparing to state-of-the-art HDA methods. We conduct classification experiments using data across different features, domains, and modalities. the effectiveness of our proposed method can be successfully verified.
It is well known that contextual and multi-scale representations are important for accurate visual recognition. In this paper we present the Inside-Outside Net (ION), an object detector that exploits information both ...
详细信息
ISBN:
(纸本)9781467388528
It is well known that contextual and multi-scale representations are important for accurate visual recognition. In this paper we present the Inside-Outside Net (ION), an object detector that exploits information both inside and outside the region of interest. Contextual information outside the region of interest is integrated using spatial recurrent neural networks. Inside, we use skip pooling to extract information at multiple scales and levels of abstraction. through extensive experiments we evaluate the design space and provide readers with an overview of what tricks of the trade are important. ION improves state-of-the-art on PASCAL VOC 2012 object detection from 73.9% to 77.9% mAP. On the new and more challenging MS COCO dataset, we improve state-of-the-art from 19.7% to 33.1% mAP. In the 2015 MS COCO Detection Challenge, our ION model won "Best Student Entry" and finished 3rd place overall. As intuition suggests, our detection results provide strong evidence that context and multi-scale representations improve small object detection.
Colorization refers to the process of adding color to black & white images or videos. this paper extends the term to handle surfaces in three dimensions. this is important for applications in which the colors of a...
详细信息
ISBN:
(纸本)9780769549897
Colorization refers to the process of adding color to black & white images or videos. this paper extends the term to handle surfaces in three dimensions. this is important for applications in which the colors of an object need to be restored and no relevant image exists for texturing it. We focus on surfaces withpatterns and propose a novel algorithm for adding colors to these surfaces. the user needs only to scribble a few color strokes on one instance of each pattern, and the system proceeds to automatically colorize the whole surface. For this scheme to work, we address not only the problem of colorization, but also the problem of pattern detection on surfaces.
Online dictionary learning is particularly useful for processing large-scale and dynamic data in computervision. It, however faces the major difficulty to incorporate robust functions, rather than the square data fit...
详细信息
ISBN:
(纸本)9780769549897
Online dictionary learning is particularly useful for processing large-scale and dynamic data in computervision. It, however faces the major difficulty to incorporate robust functions, rather than the square data fitting term, to handle outliers in training data. In this paper we propose a new online framework enabling the use of l(1) sparse data fitting term in robust dictionary learning, notably enhancing the usability and practicality of this important technique. Extensive experiments have been carried out to validate our new framework.
Many vision tasks require a multi-class classifier to discriminate multiple categories, on the order of hundreds or thousands. In this paper, we propose sparse output coding, a principled way for large-scale multi-cla...
详细信息
ISBN:
(纸本)9780769549897
Many vision tasks require a multi-class classifier to discriminate multiple categories, on the order of hundreds or thousands. In this paper, we propose sparse output coding, a principled way for large-scale multi-class classification, by turning high-cardinality multi-class categorization into a bit-by-bit decoding problem. Specifically, sparse output coding is composed of two steps: efficient coding matrix learning with scalability to thousands of classes, and probabilistic decoding. Empirical results on object recognition and scene classification demonstrate the effectiveness of our proposed approach.
Large-scale recognition problems withthousands of classes pose a particular challenge because applying the classifier requires more computation as the number of classes grows. the label tree model integrates classifi...
详细信息
ISBN:
(纸本)9780769549897
Large-scale recognition problems withthousands of classes pose a particular challenge because applying the classifier requires more computation as the number of classes grows. the label tree model integrates classification withthe traversal of the tree so that complexity grows logarithmically. In this paper we show how the parameters of the label tree can be found using maximum likelihood estimation. this new probabilistic learning technique produces a label tree with significantly improved recognition accuracy.
In this paper, we tackle the problem of performing inference in graphical models whose energy is a polynomial function of continuous variables. Our energy minimization method follows a dual decomposition approach, whe...
详细信息
ISBN:
(纸本)9780769549897
In this paper, we tackle the problem of performing inference in graphical models whose energy is a polynomial function of continuous variables. Our energy minimization method follows a dual decomposition approach, where the global problem is split into subproblems defined over the graph cliques. the optimal solution to these subproblems is obtained by making use of a polynomial system solver. Our algorithm inherits the convergence guarantees of dual decomposition. To speed up optimization, we also introduce a variant of this algorithm based on the augmented Lagrangian method. Our experiments illustrate the diversity of computervision problems that can be expressed with polynomial energies, and demonstrate the benefits of our approach over existing continuous inference methods.
We address the problem of person identification in TV series. We propose a unified learning framework for multi-class classification which incorporates labeled and unlabeled data, and constraints between pairs of feat...
详细信息
ISBN:
(纸本)9780769549897
We address the problem of person identification in TV series. We propose a unified learning framework for multi-class classification which incorporates labeled and unlabeled data, and constraints between pairs of features in the training. We apply the framework to train multinomial logistic regression classifiers for multi-class face recognition. the method is completely automatic, as the labeled data is obtained by tagging speaking faces using subtitles and fan transcripts of the videos. We demonstrate our approach on six episodes each of two diverse TV series and achieve state-of-the-art performance.
Recently active learning has attracted a lot of attention in computervision field, as it is time and cost consuming to prepare a good set of labeled images for vision data analysis. Most existing active learning appr...
详细信息
ISBN:
(纸本)9780769549897
Recently active learning has attracted a lot of attention in computervision field, as it is time and cost consuming to prepare a good set of labeled images for vision data analysis. Most existing active learning approaches employed in computervision adopt most uncertainty measures as instance selection criteria. Although most uncertainty query selection strategies are very effective in many circumstances, they fail to take information in the large amount of unlabeled instances into account and are prone to querying outliers. In this paper we present a novel adaptive active learning approach that combines an information density measure and a most uncertainty measure together to select critical instances to label for image classifications. Our experiments on two essential tasks of computervision, object recognition and scene recognition, demonstrate the efficacy of the proposed approach.
Estimating geographic location from images is a challenging problem that is receiving recent attention. In contrast to many existing methods that primarily model discriminative information corresponding to different l...
详细信息
ISBN:
(纸本)9780769549897
Estimating geographic location from images is a challenging problem that is receiving recent attention. In contrast to many existing methods that primarily model discriminative information corresponding to different locations, we propose joint learning of information that images across locations share and vary upon. Starting with generative and discriminative subspaces pertaining to domains, which are obtained by a hierarchical grouping of images from adjacent locations, we present a top-down approach that first models cross-domain information transfer by utilizing the geometry of these subspaces, and then encodes the model results onto individual images to infer their location. We report competitive results for location recognition and clustering on two public datasets, im2GPS and San Francisco, and empirically validate the utility of various design choices involved in the approach.
暂无评论