ISBN (print): 9781665448994
In this paper we analyze the classification performance of neural network structures without parametric inference. Making use of neural architecture search, we empirically demonstrate that it is possible to find random-weight architectures, a deep prior, that enable a linear classifier to perform on par with fully trained deep counterparts. Through ablation experiments, we exclude the possibility of winning a weight initialization lottery and confirm that suitable deep priors do not require additional inference. In an extension to continual learning, we investigate the possibility of incremental learning free of catastrophic interference. Under the assumption of classes originating from the same data distribution, a deep prior found on only a subset of classes is shown to allow discrimination of further classes through training of a simple linear classifier.
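The core idea of a fixed random network with a trained linear readout can be illustrated with a minimal sketch. This is not the authors' method: there is no architecture search, the single random ReLU layer, the toy XOR-style data, and all names (`random_features`, `fit_linear_readout`, `ridge`) are assumptions chosen for illustration only.

```python
import numpy as np

def random_features(x, w, b):
    """Fixed random layer followed by ReLU; w and b are never trained."""
    return np.maximum(x @ w + b, 0.0)

def fit_linear_readout(feats, labels, n_classes, ridge=1e-3):
    """Closed-form ridge regression onto one-hot targets (the only trained part)."""
    targets = np.eye(n_classes)[labels]
    gram = feats.T @ feats + ridge * np.eye(feats.shape[1])
    return np.linalg.solve(gram, feats.T @ targets)

def predict(feats, readout):
    """Pick the class with the largest linear score."""
    return np.argmax(feats @ readout, axis=1)

rng = np.random.default_rng(0)

# Toy XOR-style data (an assumption): not linearly separable in the raw 2-D input.
x = rng.uniform(-1.0, 1.0, size=(2000, 2))
y = (x[:, 0] * x[:, 1] > 0).astype(int)
x_train, y_train = x[:1500], y[:1500]
x_test, y_test = x[1500:], y[1500:]

# Random, untrained weights play the role of the "prior" in this sketch.
w = rng.normal(size=(2, 256))
b = rng.normal(size=256)

readout = fit_linear_readout(random_features(x_train, w, b), y_train, n_classes=2)
accuracy = float(np.mean(predict(random_features(x_test, w, b), readout) == y_test))

# Baseline: the same linear readout applied directly to the raw inputs.
raw_readout = fit_linear_readout(x_train, y_train, n_classes=2)
raw_accuracy = float(np.mean(predict(x_test, raw_readout) == y_test))

print(f"random features + linear readout: {accuracy:.3f}")
print(f"raw inputs + linear readout:      {raw_accuracy:.3f}")
```

Only `readout` is fitted here; the random layer stays untouched, which is the property the abstract attributes to a suitable deep prior.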
ISBN (print): 9781424423392
Tensor Voting is a robust technique to extract low-level features in noisy images. The approach achieves its robustness by exploiting coherent orientations in local neighborhoods. In this paper we propose an efficient algorithm for dense Tensor Voting in 3D which makes use of steerable filters. To this end, we propose steerable expansions of spherical tensor fields in terms of tensorial harmonics, which are their canonical representation. In this way, Tensor Voting of arbitrary rank can be performed efficiently by linear combinations of convolutions.
ISBN (print): 9781467367592
In this paper, we study the problem of reproducing the light from a single image of an object covered with random specular microfacets on the surface. We show that such reflectors can be interpreted as a randomized mapping from the lighting to the image. Such specular objects have very different optical properties from both diffuse surfaces and smooth specular objects like metals, so we design a special imaging system to photograph them robustly and effectively. We present simple yet reliable algorithms to calibrate the proposed system and perform the inference. We conduct experiments to verify the correctness of our model assumptions and demonstrate the effectiveness of our pipeline.
ISBN (print): 0780342364
We have designed and implemented a real-time binocular tracking system which uses two independent cues commonly found in the primary functions of biological visual systems to robustly track moving targets in complex environments, without a priori knowledge of the target shape or texture. A fast optical flow segmentation algorithm quickly locates independently moving objects for target acquisition and provides a reliable velocity estimate for smooth tracking. In parallel, target position is generated from the output of a zero-disparity filter, where a phase-based disparity estimation technique allows dynamic control of the camera vergence to adapt the horopter geometry to the target location. The system takes advantage of the optical properties of our custom-designed foveated wide-angle lenses, which exhibit a wide field of view along with a high-resolution fovea. Methods to cope with the distortions introduced by the space-variant resolution, and a robust real-time implementation on a high-performance active vision head, are presented.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Despite the rapid progress in deep visual recognition, modern computer vision datasets significantly overrepresent the developed world, and models trained on such datasets underperform on images from unseen geographies. We investigate the effectiveness of unsupervised domain adaptation (UDA) of such models across geographies at closing this performance gap. To do so, we first curate two shifts from existing datasets to study the Geographical DA problem, and discover new challenges beyond data distribution shift: context shift, wherein object surroundings may change significantly across geographies, and subpopulation shift, wherein the intra-category distributions may shift. We demonstrate the inefficacy of standard DA methods at Geographical DA, highlighting the need for specialized geographical adaptation solutions to address the challenge of making object recognition work for everyone.
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
In this paper, we study deep transfer learning as a way of overcoming object recognition challenges encountered in the field of digital pathology. Through several experiments, we investigate various uses of pre-trained neural network architectures and different combination schemes with random forests for feature selection. Our experiments on eight classification datasets show that densely connected and residual networks consistently yield the best performance across strategies. It also appears that network fine-tuning and using inner-layer features are the best-performing strategies, with the former yielding slightly superior results.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
We present a key point-based activity recognition framework, built upon pre-trained human pose estimation and facial feature detection models. Our method extracts complex static and movement-based features from key frames in videos, which are used to predict a sequence of key-frame activities. Finally, a merge procedure is employed to identify robust activity segments while ignoring outlier frame activity predictions. We analyze the different components of our framework via a wide array of experiments and draw conclusions with regard to the utility of the model and ways it can be improved. Results show our model is competitive, placing 11th out of 27 teams submitting to Track 3 of the 2022 AI City Challenge.
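The abstract does not specify how the merge procedure works. As a rough illustration of the general idea (turning noisy per-frame predictions into segments while suppressing outliers), the sketch below applies a sliding majority vote and then groups consecutive equal labels. The function names, the `window` and `min_length` parameters, and the toy labels are all hypothetical.

```python
from collections import Counter
from itertools import groupby

def smooth_labels(labels, window=3):
    """Replace each label with the majority label in a centered window."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighborhood).most_common(1)[0][0])
    return smoothed

def merge_segments(labels, min_length=2):
    """Group consecutive equal labels into (label, start, end) tuples.

    `end` is exclusive; runs shorter than `min_length` are dropped.
    """
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        if length >= min_length:
            segments.append((label, start, start + length))
        start += length
    return segments

# Toy key-frame predictions with two isolated outliers (index 2 and index 8).
frame_predictions = [
    "idle", "idle", "phone", "idle", "idle",
    "drink", "drink", "drink", "idle", "drink", "drink",
]

segments = merge_segments(smooth_labels(frame_predictions))
print(segments)  # [('idle', 0, 5), ('drink', 5, 11)]
```

A wider `window` suppresses longer bursts of outliers at the cost of blurring short genuine activities.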
ISBN (print): 0769506623
The analysis of human action captured in video sequences has been a topic of considerable interest in computer vision. Much of the previous work has focused on the problem of action or activity recognition, but ignored the problem of detecting action boundaries in a video sequence containing unfamiliar and arbitrary visual actions. This paper presents an approach to this problem based on detecting temporal discontinuities of the spatial pattern of image motion that captures the action. We represent frame-to-frame optical flow in terms of the coefficients of the most significant principal components computed from all the flow fields within a given video sequence. We then detect the discontinuities in the temporal trajectories of these coefficients based on three different measures. We compare our segment boundaries against those detected by human observers on the same sequences in a recent independent psychological study of human perception of visual events. We show experimental results on the two sequences that were used in this study. Our experimental results are promising both from visual evaluation and when compared against the results of the psychological study.
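The representation described above (flow fields projected onto their leading principal components, followed by a search for jumps in the coefficient trajectories) can be sketched as follows. The abstract does not name its three discontinuity measures, so this sketch substitutes one simple assumed measure, the Euclidean norm of the frame-to-frame coefficient change. The synthetic flow fields and all function names are illustrative assumptions.

```python
import numpy as np

def flow_pca_coefficients(flows, n_components=3):
    """Project each flattened flow field onto the top principal components
    computed from all flow fields in the sequence."""
    flat = flows.reshape(len(flows), -1)
    centered = flat - flat.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def coefficient_jumps(coeffs):
    """Size of the change in PCA coefficients between consecutive frames."""
    return np.linalg.norm(np.diff(coeffs, axis=0), axis=1)

rng = np.random.default_rng(0)
height, width = 8, 8

# Synthetic sequence (an assumption): 20 frames of rightward flow,
# then 20 frames of downward flow, plus small noise.
rightward = np.stack([np.ones((height, width)), np.zeros((height, width))], axis=-1)
downward = np.stack([np.zeros((height, width)), np.ones((height, width))], axis=-1)
flows = np.concatenate([
    np.repeat(rightward[None], 20, axis=0),
    np.repeat(downward[None], 20, axis=0),
])
flows = flows + 0.05 * rng.normal(size=flows.shape)

coeffs = flow_pca_coefficients(flows)
jumps = coefficient_jumps(coeffs)

# jumps[i] compares frame i with frame i + 1, so the first frame of the
# new action is one past the index of the largest jump.
boundary = int(np.argmax(jumps)) + 1
print("detected action boundary at frame", boundary)
```

On real video the largest single jump is rarely enough; thresholding `jumps` or looking for local maxima would be a natural next step, though neither is claimed to match the paper's measures.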
ISBN (print): 0769521584
Subspace representations have been a popular way to model appearance in computer vision. In Jepson and Black's influential paper on EigenTracking, they were successfully applied in tracking. For noisy targets, optimization-based algorithms (including EigenTracking) often fail catastrophically after losing track. Particle filters have recently emerged as a robust method for tracking in the presence of multi-modal distributions. To use subspace representations in a particle filter, the number of samples increases exponentially as the state vector includes the subspace coefficients. We introduce an efficient method for using subspace representations in a particle filter by applying Rao-Blackwellization to integrate out the subspace coefficients in the state vector. Fewer samples are needed since part of the posterior over the state vector is analytically calculated. We use probabilistic principal component analysis to obtain analytically tractable integrals. We show experimental results in a scenario in which we track a target in clutter.
ISBN (print): 9780769549903
In this paper we present and start analyzing the iCub World data-set, an object recognition data-set that we acquired using a Human-Robot Interaction (HRI) scheme and the iCub humanoid robot platform. Our setup allows for rapid acquisition and annotation of data with corresponding ground truth. While more constrained in its scope - the iCub world is essentially a robotics research lab - we demonstrate how the proposed data-set poses challenges to current recognition systems. The iCubWorld data-set is publicly available (1).