ISBN: 9781467369640 (print)
We frame the problem of local representation of imaging data as the computation of minimal sufficient statistics that are invariant to nuisance variability induced by viewpoint and illumination. We show that, under very stringent conditions, these are related to "feature descriptors" commonly used in computer vision. Such conditions can be relaxed if multiple views of the same scene are available. We propose a sampling-based and a point-estimate-based approximation of such a representation, and compare them empirically on image-to-(multiple) image matching, for which we introduce a multi-view wide-baseline matching benchmark consisting of a mixture of real and synthetic objects with ground-truth camera motion and dense three-dimensional geometry.
ISBN: 9781479951178 (print)
The curse of dimensionality is a practical and challenging problem in image categorization, especially when the number of classes is large. Multi-class classification encounters severe computational and storage problems when dealing with these large-scale tasks. In this paper, we propose hierarchical feature hashing to effectively reduce the dimensionality of the parameter space without sacrificing classification accuracy, while at the same time exploiting the information in the semantic taxonomy among categories. We provide a detailed theoretical analysis of our proposed hashing method. Moreover, experimental results on object recognition and scene classification further demonstrate the effectiveness of hierarchical feature hashing.
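The idea of hashing a large multi-class parameter space into a small shared vector can be illustrated with a minimal sketch. This is the generic signed "hashing trick", not the paper's exact construction: the function names, the use of MD5 as the hash, and the way the taxonomy is used (summing scores over a class and its ancestors) are all assumptions made for illustration.

```python
import hashlib


def _stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a string (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def hash_features(x: dict, label: str, dim: int) -> list:
    """Hash sparse features {feature_id: value} of one class into `dim` buckets.

    Each (label, feature) pair gets a bucket index and a +/-1 sign, so all
    classes share a single dim-dimensional parameter vector.
    """
    phi = [0.0] * dim
    for feature, value in x.items():
        h = _stable_hash(f"{label}:{feature}")
        sign = 1.0 if h & 1 else -1.0   # lowest bit chooses the sign
        index = (h >> 1) % dim          # remaining bits choose the bucket
        phi[index] += sign * value
    return phi


def score(w: list, x: dict, label: str) -> float:
    """Linear score of input x for one taxonomy node, using shared weights w."""
    phi = hash_features(x, label, len(w))
    return sum(wi * pi for wi, pi in zip(w, phi))


def hierarchical_score(w: list, x: dict, path: list) -> float:
    """Assumed taxonomy use: sum the scores of a class and all its ancestors."""
    return sum(score(w, x, node) for node in path)
```

With this layout, memory depends on `dim` rather than on the number of classes times the number of features, and classes that share ancestors in the taxonomy share the corresponding score terms.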
ISBN: 9781424469840 (print)
State-of-the-art motion estimation algorithms suffer from three major problems: poorly textured regions, occlusions, and small-scale image structures. Based on the Gestalt principles of grouping, we propose to incorporate a low-level image segmentation process in order to tackle these problems. Our new motion estimation algorithm is based on non-local total variation regularization, which allows us to integrate the low-level image segmentation process into a unified variational framework. Numerical results on the Middlebury optical flow benchmark data set demonstrate that we can cope with the aforementioned problems.
ISBN: 9781424469840 (print)
Two challenges in computer vision are to accommodate noisy data and missing data. Many problems in computer vision, such as segmentation, filtering, stereo, reconstruction, inpainting, and optical flow, seek solutions that match the data while satisfying an additional regularization, such as total variation or boundary length. A regularization that has received less attention is minimizing the curvature of the solution. One reason is the difficulty of finding an optimal solution to this image model: many existing methods are complicated, slow, and/or provide a suboptimal solution. Following the recent progress of Schoenemann et al. [28], we provide a simple formulation of curvature regularization that admits a fast optimization and gives globally optimal solutions in practice. We demonstrate the effectiveness of this method by applying this curvature regularization to image segmentation.
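The "match the data while satisfying a regularization" structure described above can be written in a generic form. The notation below is assumed for illustration and is not taken from the paper: $u$ is the sought solution on the image domain $\Omega$, $f$ is the observed data, $D$ is a data-fidelity term, $\lambda > 0$ weights the regularizer $R$, and $\kappa$ is the curvature along the boundary $\partial S$ of a segmented region $S$.

```latex
% Generic regularized energy (assumed notation)
E(u) = D(u, f) + \lambda \, R(u)

% Two regularizers named in the abstract, in their standard continuous forms
R_{\mathrm{TV}}(u) = \int_{\Omega} \lvert \nabla u \rvert \, dx
\qquad
R_{\mathrm{curv}}(S) = \int_{\partial S} \kappa^{2} \, ds
```

Total variation penalizes the amount of variation in $u$ (for a binary segmentation it reduces to boundary length), whereas the curvature term penalizes how sharply the boundary bends, which is why it tends to favor smooth, elongated contours over short ones.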
ISBN: 9781665445092 (print)
Unsupervised image clustering methods often introduce alternative objectives to train the model indirectly, and are subject to faulty predictions and overconfident results. To overcome these challenges, the current research proposes an innovative model, RUC, that is inspired by robust learning. RUC's novelty lies in treating the pseudo-labels of existing image clustering models as a noisy dataset that may include misclassified samples. Its retraining process can revise misaligned knowledge and alleviate the overconfidence problem in predictions. The model's flexible structure allows it to be used as an add-on module to other clustering methods, helping them achieve better performance on multiple datasets. Extensive experiments show that the proposed model can adjust the model confidence with better calibration and gain additional robustness against adversarial noise.
ISBN: 0818672587 (print)
Inspired by the properties of the human visual system, a new active vision system called ESCHeR (Etl Stereo Compact Head For Robot vision) has recently been implemented with foveated wide-angle lenses. The lenses exhibit a wide field of view along with a space-varying resolution, facilitating both detection and close observation. However, to handle such optical properties and achieve basic eye movement functions, new calibration methods are needed. Therefore, two novel online techniques are presented: one performs a global identification of the optical process through artificial neural techniques, and the other computes the physical parameters by using environmental feature tracking and controlled rotations of the cameras. Self-alignment of the cameras is also achieved using a similar technique.
ISBN: 9781467388511 (print)
Most convolutional neural networks (CNNs) lack mid-level layers that model semantic parts of objects. This limits CNN-based methods from reaching their full potential in detecting and utilizing small semantic parts in recognition. Introducing such mid-level layers can facilitate the extraction of part-specific features, which can be utilized for better recognition performance. This is particularly important in the domain of fine-grained recognition. In this paper, we propose a new CNN architecture that integrates semantic part detection and abstraction (SPDA-CNN) for fine-grained classification. The proposed network has two sub-networks: one for detection and one for recognition. The detection sub-network has a novel top-down proposal method to generate small semantic part candidates for detection. The classification sub-network introduces novel part layers that extract features from parts detected by the detection sub-network and combine them for recognition. As a result, the proposed architecture provides an end-to-end network that performs detection, localization of multiple semantic parts, and whole-object recognition within one framework that shares the computation of convolutional filters. Our method outperforms state-of-the-art methods by a large margin for small-part detection (e.g. our precision of 93.40% vs the best previous precision of 74.00% for detecting the head on CUB-2011). It also compares favorably to the existing state of the art on fine-grained classification, e.g. it achieves 85.14% accuracy on CUB-2011.
ISBN: 9781665445092 (print)
Effective regularization techniques are highly desired in deep learning for alleviating overfitting and improving generalization. This work proposes a new regularization scheme, based on the understanding that flat local minima of the empirical risk cause the model to generalize better. The scheme is referred to as adversarial model perturbation (AMP): instead of directly minimizing the empirical risk, an alternative "AMP loss" is minimized via SGD. Specifically, the AMP loss is obtained from the empirical risk by applying the "worst" norm-bounded perturbation at each point in the parameter space. Compared with most existing regularization schemes, AMP has strong theoretical justification, in that minimizing the AMP loss can be shown theoretically to favour flat local minima of the empirical risk. Extensive experiments on various modern deep architectures establish AMP as a new state of the art among regularization schemes.
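The definition of the AMP loss as the empirical risk under the worst norm-bounded parameter perturbation can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the authors' implementation: the model is a hypothetical two-parameter linear regressor, and the inner maximization is approximated by a single normalized gradient-ascent step of length `epsilon`, which is a common first-order approximation rather than the exact maximum.

```python
import math


def empirical_risk(theta, data):
    """Mean squared error of the toy model y ~ theta[0] * x + theta[1]."""
    return sum((theta[0] * x + theta[1] - y) ** 2 for x, y in data) / len(data)


def risk_gradient(theta, data):
    """Analytic gradient of empirical_risk with respect to theta."""
    n = len(data)
    g0 = sum(2.0 * (theta[0] * x + theta[1] - y) * x for x, y in data) / n
    g1 = sum(2.0 * (theta[0] * x + theta[1] - y) for x, y in data) / n
    return [g0, g1]


def amp_loss(theta, data, epsilon):
    """Approximate max over ||delta|| <= epsilon of empirical_risk(theta + delta).

    Assumption: the maximizer is approximated by stepping a distance epsilon
    along the normalized gradient. If the gradient is exactly zero, this
    approximation falls back to the unperturbed risk.
    """
    g = risk_gradient(theta, data)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return empirical_risk(theta, data)
    perturbed = [t + epsilon * gi / norm for t, gi in zip(theta, g)]
    return empirical_risk(perturbed, data)
```

Because the perturbed risk grows faster around sharp minima than around flat ones, minimizing this quantity instead of the plain risk penalizes sharp minima, which matches the intuition stated in the abstract.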
ISBN: 0818672587 (print)
In order to reduce false alarms and to improve the target detection performance of an automatic target detection and recognition system operating in a cluttered environment, it is important to develop models not only of man-made targets but also of natural background clutter. Because of the high complexity of natural clutter, such a clutter model can only be reliably built by learning from real examples. If available, contextual information that characterizes each training example can be used to further improve the learned clutter model. In this paper, we present such a clutter-model-aided target detection system. Emphasis is placed on two topics: (1) learning the background clutter model from sensory data through a self-organizing process, and (2) reinforcing the learned clutter model using contextual information.
ISBN: 9781665445092 (print)
Recognition and reconstruction of residential floor plan drawings are important and challenging in the fields of design, decoration, and architectural remodeling. An automatic framework is presented that accurately recognizes the structure, type, and size of each room, and outputs vectorized 3D reconstruction results. Deep segmentation and detection neural networks are used to extract room structural information. A key-point detection network and cluster analysis are used to calculate the scales of rooms. The vectorization of room information is performed by an iterative optimization-based method. The system significantly increases accuracy and generalization ability compared with existing methods. It outperforms other systems in floor plan segmentation and vectorization, especially in inclined-wall detection.