ISBN: 9781665445092 (print)
We present Meta Pseudo Labels, a semi-supervised learning method that achieves a new state-of-the-art top-1 accuracy of 90.2% on ImageNet, which is 1.6% better than the existing state-of-the-art [16]. Like Pseudo Labels, Meta Pseudo Labels has a teacher network that generates pseudo labels on unlabeled data to teach a student network. However, unlike Pseudo Labels, where the teacher is fixed, the teacher in Meta Pseudo Labels is constantly adapted using feedback from the student's performance on the labeled dataset. As a result, the teacher generates better pseudo labels to teach the student.
ISBN: 9781424439942 (print)
This paper introduces a new method for object recognition based on a recurrent neural network (RNN) trained in a supervised mode. The RNN takes 3-dimensional laser scanner data as sequential input, in the natural temporal order in which the laser returns arrive at the scanner. The method is illustrated on a two-class problem with real data.
ISBN: 9780769549897 (print)
In this paper, we tackle the problem of performing inference in graphical models whose energy is a polynomial function of continuous variables. Our energy minimization method follows a dual decomposition approach, where the global problem is split into subproblems defined over the graph cliques. The optimal solution to these subproblems is obtained by making use of a polynomial system solver. Our algorithm inherits the convergence guarantees of dual decomposition. To speed up optimization, we also introduce a variant of this algorithm based on the augmented Lagrangian method. Our experiments illustrate the diversity of computer vision problems that can be expressed with polynomial energies, and demonstrate the benefits of our approach over existing continuous inference methods.
ISBN: 0818672587 (print)
Feature indexing techniques are promising for object recognition since they can quickly reduce the set of possible matches for a set of image features. This work exploits another property of such techniques: they have an inherently parallel structure, and connectionist network formulations are easy to develop. Once indexing has been performed, a voting scheme such as geometric hashing can be used to generate object hypotheses in parallel. We describe a framework for the connectionist implementation of such indexing and recognition techniques. With sufficient processing elements, recognition can be performed in a small number of time steps. The number of processing elements necessary to achieve peak performance and the fan-in/fan-out required for the processing elements are examined. These techniques have been simulated on a conventional architecture with good results.
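The indexing-then-voting idea can be illustrated with a minimal serial sketch. This is plain quantized feature indexing with vote counting, not full geometric hashing and not the connectionist implementation the abstract describes. The names `build_index`, `quantize`, and `vote`, the `bin_size` value, and the toy 2-D features are all hypothetical.

```python
from collections import Counter, defaultdict


def quantize(feature, bin_size):
    """Map a continuous feature vector to a discrete index key (assumed scheme)."""
    return tuple(int(v // bin_size) for v in feature)


def build_index(models, bin_size=0.25):
    """Index table: quantized feature -> set of object names containing it."""
    index = defaultdict(set)
    for name, features in models.items():
        for f in features:
            index[quantize(f, bin_size)].add(name)
    return index


def vote(index, image_features, bin_size=0.25):
    """Each image feature votes for every object stored under its index key."""
    votes = Counter()
    for f in image_features:
        for name in index.get(quantize(f, bin_size), ()):
            votes[name] += 1
    return votes


# Toy model base with made-up 2-D features.
models = {
    "wrench": [(0.1, 0.9), (1.3, 0.4), (2.2, 2.0)],
    "hammer": [(0.1, 0.9), (3.0, 1.1), (4.6, 0.2)],
}
index = build_index(models)
votes = vote(index, [(0.12, 0.95), (1.35, 0.45), (2.1, 2.1)])
best = votes.most_common(1)[0][0]
```

Each lookup and each vote is independent of the others, which is the property that makes a parallel formulation natural.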
ISBN: 9781424469840 (print)
We propose a face recognition approach based on hashing. The approach yields recognition rates comparable to those of the random ℓ1 approach [18], which is considered the state-of-the-art. But our method is much faster: it is up to 150 times faster than [18] on the YaleB dataset. We show that with hashing, the sparse representation can be recovered with a high probability because hashing preserves the restricted isometry property. Moreover, we present a theoretical analysis of the recognition rate of the proposed hashing approach. Experiments show a very competitive recognition rate and significant speedup compared with the state-of-the-art.
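As a rough illustration of why a hash-based projection can keep sparse vectors approximately norm-preserving, the sketch below applies a signed feature-hashing projection to a sparse vector and compares norms. Whether this is the hash construction used in the paper is an assumption; `hash_project`, the dimensions, and the seeds are hypothetical.

```python
import numpy as np


def hash_project(x, n_buckets, seed=0):
    """Signed feature hashing: each coordinate is added, with a random sign,
    to one randomly chosen bucket (an assumed construction, for illustration)."""
    rng = np.random.default_rng(seed)
    buckets = rng.integers(0, n_buckets, size=x.shape[0])
    signs = rng.choice([-1.0, 1.0], size=x.shape[0])
    out = np.zeros(n_buckets)
    np.add.at(out, buckets, signs * x)  # unbuffered add handles bucket collisions
    return out


# A sparse vector: 20 nonzero entries in 10,000 dimensions.
rng = np.random.default_rng(1)
x = np.zeros(10_000)
support = rng.choice(10_000, size=20, replace=False)
x[support] = rng.normal(size=20)

h = hash_project(x, n_buckets=2_000)
ratio = np.linalg.norm(h) / np.linalg.norm(x)  # close to 1 when few nonzeros collide
```

The projection costs one pass over the input and needs no dense random matrix, which is consistent with the speed advantage the abstract reports.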
ISBN: 9781728132938 (print)
We present a method to separate a single image captured under two illuminants with different spectra into the two images corresponding to the appearance of the scene under each individual illuminant. We do this by training a deep neural network to predict the per-pixel reflectance chromaticity of the scene, which we use in a physics-based image separation framework to produce the desired two output images. We design our reflectance chromaticity network and loss functions by incorporating intuitions from the physics of image formation. We show that this leads to significantly better performance than other single-image techniques and even approaches the quality of prior work that requires additional images.
ISBN: 9781467312288 (print)
Real-time recognition may be limited by scarce memory and computing resources for performing classification. Although prior research has addressed the problem of training classifiers with limited data and computation, few efforts have tackled the problem of memory constraints on recognition. We explore methods that can guide the allocation of limited storage resources for classifying streaming data so as to maximize discriminatory power. We focus on computation of the expected value of information with nearest neighbor classifiers for online face recognition. Experiments on real-world datasets show the effectiveness and power of the approach. The methods provide a principled approach to vision under bounded resources, and have immediate application to enhancing recognition capabilities in consumer devices with limited memory.
ISBN: 0769506623 (print)
Towards the goal of realizing a generic automatic human activity recognition system, a new formalism is proposed. Activities are described by a chained hierarchical representation using three types of entities: image features, mobile object properties, and scenarios. Taking image features of tracked moving regions from an image sequence as input, mobile object properties are first computed by specific methods while noise is suppressed by statistical methods. Scenarios are recognized from mobile object properties based on Bayesian analysis. A sequential occurrence of several scenarios is recognized by an algorithm using a probabilistic finite-state automaton (a variant of structured HMM). The demonstration of the optimality of these recognition methods is discussed. Finally, the validity and the effectiveness of our approach are demonstrated on both real-world and perturbed data.
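To make the role of a probabilistic finite-state automaton concrete, here is a minimal sketch of the standard HMM forward algorithm, which scores how likely an observed sequence is under a given automaton. It is a generic textbook procedure, not the paper's structured variant; the function name `sequence_likelihood` and the two-state left-to-right model are hypothetical.

```python
import numpy as np


def sequence_likelihood(start, trans, emit, observations):
    """Forward algorithm: probability of an observation sequence.

    start[i]    : probability of starting in state i
    trans[i, j] : probability of moving from state i to state j
    emit[i, o]  : probability of observing symbol o in state i
    """
    alpha = start * emit[:, observations[0]]
    for o in observations[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return float(alpha.sum())


# Hypothetical left-to-right automaton: scenario A (state 0) then scenario B (state 1).
start = np.array([1.0, 0.0])
trans = np.array([[0.5, 0.5],
                  [0.0, 1.0]])
emit = np.array([[0.9, 0.1],   # state 0 mostly emits symbol 0
                 [0.2, 0.8]])  # state 1 mostly emits symbol 1

p_in_order = sequence_likelihood(start, trans, emit, [0, 1])
p_reversed = sequence_likelihood(start, trans, emit, [1, 0])
```

Observations that follow the automaton's expected order receive a higher likelihood than the same observations in reverse order.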
ISBN: 0818672587 (print)
A fundamental problem in depth from defocus is the measurement of relative defocus between images. We propose a class of broadband operators that, when used together, provide invariance to scene texture and produce accurate and dense depth maps. Since the operators are broadband, a small number of them is sufficient for depth estimation of scenes with complex textural properties. Experiments are conducted on both synthetic and real scenes to evaluate the performance of the proposed operators. The depth detection gain error is less than 1%, irrespective of texture frequency. Depth accuracy is found to be 0.5% to 1.2% of the distance of the object from the imaging optics.
ISBN: 0818672587 (print)
We represent local spatial structure in a color image using feature matrices that are computed from an image region. Feature matrices contain significantly more information about local image structure than previous representations. Although feature matrices are useful for surface recognition, this representation depends on the spectral properties of the scene illumination. Using a finite-dimensional linear model for surface spectral reflectance with the same number of parameters as the number of color bands, we show that illumination changes correspond to linear transformations of the feature matrices and that surface rotations correspond to circular shifts of the matrices. From these relationships we derive an algorithm for illumination- and geometry-invariant recognition of local surface structure. We demonstrate the algorithm with a series of experiments on images of real objects.
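The statement that surface rotations correspond to circular shifts suggests a simple way to compare two feature matrices up to rotation: try every circular shift and keep the best match. The sketch below does this by brute force. It covers only the rotation part, not the illumination transformation, and it assumes the shift acts along the column axis; the abstract does not specify the matrix layout, and `best_circular_shift` is a hypothetical name.

```python
import numpy as np


def best_circular_shift(reference, observed):
    """Return the column shift that best aligns `observed` with `reference`,
    together with the residual error at that shift (brute-force search)."""
    n = reference.shape[1]
    errors = [
        np.linalg.norm(np.roll(observed, -k, axis=1) - reference)
        for k in range(n)
    ]
    k = int(np.argmin(errors))
    return k, float(errors[k])


# Toy data: a random 3x8 matrix and a copy shifted by 3 columns.
rng = np.random.default_rng(0)
reference = rng.normal(size=(3, 8))
observed = np.roll(reference, 3, axis=1)

shift, err = best_circular_shift(reference, observed)
```

Taking the minimum error over all shifts gives a score that does not change when the observed matrix is circularly shifted.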