ISBN: 0818672587 (print)
A local parallel method is described for computing the stochastic completion field introduced in an earlier report. The local parallel method can be interpreted as a stable finite difference scheme for solving the underlying Fokker-Planck equation identified by Mumford. The new method is more plausible as a neural model since (1) unlike the previous method, it can be computed in a sparse, locally connected network; and (2) the network dynamics are consistent with psychophysical measurements of the time course of illusory contour formation.
ISBN: 9781467388511 (print)
Recently, deep learning approaches have demonstrated remarkable progress in action recognition in videos. Most existing deep frameworks treat every volume (i.e., spatial-temporal video clip) equally and directly assign a video label to all volumes sampled from it. However, within a video, discriminative actions may occur sparsely in a few key volumes, and most other volumes are irrelevant to the labeled action category. Training with a large proportion of irrelevant volumes will hurt performance. To address this issue, we propose a key volume mining deep framework to identify key volumes and conduct classification simultaneously. Specifically, our framework is optimized in an alternating way that is integrated into the forward and backward stages of Stochastic Gradient Descent (SGD). In the forward pass, our network mines key volumes for each action class. In the backward pass, it updates network parameters with the help of these mined key volumes. In addition, we propose "Stochastic out" to model key volumes from multi-modalities, and an effective yet simple "unsupervised key volume proposal" method for high quality volume sampling. Our experiments show that action recognition performance can be significantly improved by mining key volumes, and we achieve state-of-the-art performance on HMDB51 and UCF101 (93.1%).
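The forward-pass mining step can be illustrated with a minimal sketch. It assumes that mining reduces to selecting the volumes with the highest score for the labeled class; the abstract does not specify the selection rule, and the function name `mine_key_volumes`, the parameter `k`, and the score values are hypothetical.

```python
from typing import List, Sequence


def mine_key_volumes(
    scores: Sequence[Sequence[float]], label: int, k: int = 1
) -> List[int]:
    """Return indices of the k volumes that score highest for class `label`.

    Assumption: "key" volumes are simply the top-scoring ones. The paper's
    actual mining rule may differ.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Rank volume indices by their score for the labeled class, best first.
    ranked = sorted(
        range(len(scores)), key=lambda i: scores[i][label], reverse=True
    )
    return ranked[:k]


# Hypothetical scores for one video: rows are volumes, columns are classes.
video_scores = [
    [0.1, 0.2, 0.7],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]

# Only these volumes would contribute gradients in the backward pass.
key_volumes = mine_key_volumes(video_scores, label=0, k=2)
print(key_volumes)
```

Under this assumption, volumes that score poorly for the labeled class are left out of the parameter update, which is one way to limit the influence of irrelevant volumes.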
ISBN: 0780342364 (print)
Current methods for registering image regions perform well for simple transformations or large image regions. In this paper, we present a new method that is better able to handle small image regions as they deform with non-linear transformations. We introduce difference decomposition, a novel approach to solving the registration problem. The method is a generalization of previous methods and can better handle non-linear transforms. Although the methods are general, we focus on projective transformations and introduce piecewise-projective transformations for modeling the motions of non-planar objects. We conclude with examples from our prototype implementation.
ISBN: 9781538664209 (print)
Deep learning with 3D data such as reconstructed point clouds and CAD models has received great research interest recently. However, the capability of using point clouds with convolutional neural networks has so far not been fully explored. In this paper, we present a convolutional neural network for semantic segmentation and object recognition with 3D point clouds. At the core of our network is pointwise convolution, a new convolution operator that can be applied at each point of a point cloud. Our fully convolutional network design, while being surprisingly simple to implement, can yield competitive accuracy in both semantic segmentation and object recognition tasks.
ISBN: 9781509014378 (print)
We propose a two-level system for apparent age estimation from facial images. Our system first classifies samples into overlapping age groups. Within each group, the apparent age is estimated with local regressors, whose outputs are then fused for the final estimate. We use a deformable parts model based face detector, and features from a pre-trained deep convolutional network. Kernel extreme learning machines are used for classification. We evaluate our system on the ChaLearn Looking at People 2016 - Apparent Age Estimation challenge dataset, and report a normal score of 0.3740 on the sequestered test set.
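The two-level structure can be sketched as follows. This is a minimal illustration, not the system's implementation: the group boundaries in `AGE_GROUPS`, the helper names, and the probability-weighted average used for fusion are all assumptions, since the abstract does not state how the groups are defined or how the local estimates are fused.

```python
from typing import List, Sequence, Tuple

# Hypothetical overlapping age groups as inclusive (low, high) ranges.
AGE_GROUPS: List[Tuple[int, int]] = [(0, 20), (15, 40), (35, 100)]


def groups_for_age(age: float) -> List[int]:
    """Return the indices of every group whose range contains `age`."""
    return [i for i, (lo, hi) in enumerate(AGE_GROUPS) if lo <= age <= hi]


def fuse_age_estimates(
    group_probs: Sequence[float], local_estimates: Sequence[float]
) -> float:
    """Fuse per-group regressor outputs, weighted by group probability.

    Assumption: fusion is a weighted average. The actual fusion rule is
    not given in the abstract.
    """
    total = sum(group_probs)
    if total <= 0:
        raise ValueError("group probabilities must sum to a positive value")
    weighted = sum(p * e for p, e in zip(group_probs, local_estimates))
    return weighted / total


# An 18-year-old falls in two groups because the ranges overlap.
print(groups_for_age(18))

# Level 1 outputs (hypothetical) group probabilities; level 2 outputs
# (hypothetical) one age estimate per group regressor.
print(fuse_age_estimates([0.5, 0.5, 0.0], [20.0, 30.0, 60.0]))
```

Overlapping ranges mean a sample near a group boundary is covered by more than one local regressor, so a borderline classification at the first level does not force a single regressor to handle it alone.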
ISBN: 0780342364 (print)
We develop a simple and very fast method for object tracking based exclusively on color information in digitized video images. Running on a Silicon Graphics R4600 Indy system with an IndyCam, our algorithm is capable of simultaneously tracking objects at full frame size (640 x 480 pixels) and video frame rate (30 fps). Robustness with respect to occlusion is achieved via an explicit hypothesis-tree model of the occlusion process. We demonstrate the efficacy of our technique in the challenging task of tracking people, especially tracking human heads and hands.
ISBN: 9781538664209 (print)
Zero-shot learning (ZSL) aims to recognize unseen image categories by learning an embedding space between image and semantic representations. For years, the central task in existing work has been to learn the proper mapping matrices aligning the visual and semantic spaces, whilst the importance of learning discriminative representations for ZSL has been ignored. In this work, we revisit existing methods and demonstrate the necessity of learning discriminative representations for both visual and semantic instances of ZSL. We propose an end-to-end network that is capable of 1) automatically discovering discriminative regions by a zoom network; and 2) learning discriminative semantic representations in an augmented space introduced for both user-defined and latent attributes. Our proposed method is tested extensively on two challenging ZSL datasets, and the experimental results show that the proposed method significantly outperforms state-of-the-art methods.
ISBN: 0780342364 (print)
In the depth from defocus (DFD) method, two defocused images of a scene are obtained by capturing the scene with different sets of camera parameters. An arbitrary selection of the camera settings can result in observed images whose relative blurring is insufficient to yield a good estimate of the depth. In this paper, we study the effect of the degree of relative blurring on the accuracy of the estimate of the depth by addressing the DFD problem in a maximum likelihood-based framework. We propose a criterion for optimal selection of camera parameters to obtain an improved estimate of the depth. The optimality criterion is based on the Cramér-Rao bound of the variance of the error in the estimate of blur. Simulations as well as experimental results on real images are presented for validation.
ISBN: 9781665469463 (print)
We propose an iterative method for estimating rigid transformations from point sets using adiabatic quantum computation. Compared to existing quantum approaches, our method relies on an adaptive scheme to solve the problem to high precision, and does not suffer from inconsistent rotation matrices. Experimentally, our method performs robustly on several 2D and 3D datasets even with a high outlier ratio.
ISBN: 0780342364 (print)
In this paper, we present a new approach to extract characters on a license plate of a moving vehicle given a sequence of perspective distortion corrected license plate images. We model the extraction of characters as a Markov random field (MRF). With the MRF modeling, the extraction of characters is formulated as the problem of maximizing the a posteriori probability based on given prior and observations. A genetic algorithm with a local greedy mutation operator is employed to optimize the objective function. Experiments and a comparison study were conducted. It is shown that our approach provides better performance than other single frame methods.