ISBN: 9781467369640 (print)
Can humans fly? Emphatically no. Can cars eat? Again, absolutely not. Yet these absurd inferences result from the current disregard for particular types of actors in action understanding. There is no work we know of on simultaneously inferring actors and actions in video, not to mention a dataset to experiment with. Our paper hence marks the first effort in the computer vision community to jointly consider various types of actors undergoing various actions. To start on the problem, we collect a dataset of 3782 videos from YouTube and label both pixel-level actors and actions in each video. We formulate the general actor-action understanding problem and instantiate it at various granularities: both video-level single- and multiple-label actor-action recognition and pixel-level actor-action semantic segmentation. Our experiments demonstrate that inference jointly over actors and actions outperforms inference independently over them, and hence support our argument for the value of explicit consideration of various actors in comprehensive action understanding.
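The contrast between independent and joint inference in the abstract above can be illustrated with a toy sketch. It is not the paper's model: the actor and action names, the scores, and the list of valid pairs are all hypothetical, and the product-of-scores rule is an assumption chosen for brevity.

```python
def independent_inference(actor_scores, action_scores):
    """Pick the best actor and the best action separately."""
    actor = max(actor_scores, key=actor_scores.get)
    action = max(action_scores, key=action_scores.get)
    return actor, action


def joint_inference(actor_scores, action_scores, valid_pairs):
    """Pick the best (actor, action) pair among the pairs allowed to co-occur.

    Scoring a pair by the product of its two scores is an assumption
    made for this sketch, not the formulation used in the paper.
    """
    return max(
        valid_pairs,
        key=lambda pair: actor_scores[pair[0]] * action_scores[pair[1]],
    )


# Hypothetical classifier outputs for one video.
actor_scores = {"adult": 0.45, "car": 0.55}
action_scores = {"eat": 0.60, "roll": 0.40}
# Hypothetical list of actor-action pairs that can occur together.
valid_pairs = [("adult", "eat"), ("adult", "roll"), ("car", "roll")]

print(independent_inference(actor_scores, action_scores))
print(joint_inference(actor_scores, action_scores, valid_pairs))
```

With these made-up scores, the independent choice pairs the car with eating, while the joint choice is restricted to pairs that can actually occur.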
ISBN: 0780342364 (print)
This paper introduces a unified approach to the problem of verifying alignment hypotheses in the presence of substantial uncertainty in the predicted locations of projected model features. Our approach is independent of whether the uncertainty is distributed or bounded and, moreover, incorporates information about the domain in a formally correct manner. Information that can be incorporated includes the error model, the distribution of background features, and the positions of the data features near each predicted model feature. Experiments are described that demonstrate the improvement over previously used methods. Furthermore, our method is efficient in that the number of operations is on the order of the number of image features that lie near the predicted model features.
ISBN: 0780342364 (print)
We present a new approach for resolving occlusions in augmented reality. Its main interest is that it does not require 3D reconstruction of the considered scene. Our idea is to use a contour-based approach and to label each contour point as being "in front of" or "behind", depending on whether it is in front of or behind the virtual object. This labeling step only requires that the contours can be tracked from frame to frame. A proximity graph is then built in order to group the contours that belong to the same occluding object. Finally, we use a form of active contours to accurately recover the mask of the occluding object.
ISBN: 9781467369640 (print)
The fact that image data samples lie on a manifold has been successfully exploited in many learning and inference problems. In this paper we leverage the specific structure of data in order to improve recognition accuracies in general recognition tasks. In particular, we propose a novel framework that allows manifold priors to be embedded into sparse representation-based classification (SRC) approaches. We also show that manifold constraints can be transferred from the data to the optimized variables if these are linearly correlated. Using this new insight, we define an efficient alternating direction method of multipliers (ADMM) that can consistently integrate the manifold constraints during the optimization process. This is based on the property that we can recast the problem as the projection over the manifold via a linear embedding method based on the geodesic distance. The proposed approach is successfully applied to face, digit, action and object recognition, showing a consistent increase in performance when compared to the state of the art.
ISBN: 9780769549897 (print)
Recently, active learning has attracted a lot of attention in the computer vision field, as it is time-consuming and costly to prepare a good set of labeled images for vision data analysis. Most existing active learning approaches employed in computer vision adopt most-uncertainty measures as instance selection criteria. Although most-uncertainty query selection strategies are very effective in many circumstances, they fail to take information in the large amount of unlabeled instances into account and are prone to querying outliers. In this paper we present a novel adaptive active learning approach that combines an information density measure and a most-uncertainty measure to select critical instances to label for image classification. Our experiments on two essential tasks of computer vision, object recognition and scene recognition, demonstrate the efficacy of the proposed approach.
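As a rough illustration of combining an uncertainty measure with an information density measure, the following is a minimal sketch under stated assumptions. The abstract does not specify the measures or how they are combined, so prediction entropy, mean cosine similarity, the product form with exponent `beta`, and all function names here are hypothetical choices, not the paper's method.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def density(i, features):
    """Mean similarity of instance i to the other unlabeled instances.

    Clamped at zero so it can safely be raised to a fractional power.
    """
    sims = [cosine(features[i], f) for j, f in enumerate(features) if j != i]
    return max(0.0, sum(sims) / len(sims)) if sims else 0.0


def select_query(features, probs, beta=1.0):
    """Index of the unlabeled instance maximizing entropy * density**beta.

    beta is an assumed trade-off weight; beta=0 reduces to pure
    uncertainty sampling.
    """
    scores = [
        entropy(p) * density(i, features) ** beta for i, p in enumerate(probs)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical data: three clustered instances and one outlier (index 3).
features = [(1.0, 0.0), (0.9, 0.1), (0.95, 0.05), (0.0, 1.0)]
probs = [(0.9, 0.1), (0.6, 0.4), (0.8, 0.2), (0.5, 0.5)]

print(select_query(features, probs, beta=0.0))  # uncertainty only
print(select_query(features, probs, beta=1.0))  # uncertainty and density
```

In this made-up example the outlier is the most uncertain instance, so pure uncertainty sampling queries it, while the density-weighted score prefers an uncertain instance inside the cluster.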
ISBN: 0818672587 (print)
This paper addresses the problem of estimating the epipolar geometry from point correspondences between two images taken by uncalibrated perspective cameras. It is shown that Jepson's and Heeger's linear subspace technique for infinitesimal motion estimation can be generalized to the finite motion case by choosing an appropriate basis for projective space. This yields a linear method for weak calibration. The proposed algorithm has been implemented and tested on both real and synthetic images, and it is compared to other linear and non-linear approaches to weak calibration.
ISBN: 9781728132938 (print)
We propose a novel grayness index for finding gray pixels and demonstrate its effectiveness and efficiency in illumination estimation. The grayness index, GI for short, is derived using the Dichromatic Reflection Model and is learning-free. GI can estimate one or multiple illumination sources in color-biased images. On standard single-illumination and multiple-illumination estimation benchmarks, GI outperforms state-of-the-art statistical methods and many recent deep methods. GI is simple and fast, written in a few dozen lines of code, processing a 1080p image in about 0.4 seconds with non-optimized Matlab code.
ISBN: 0818672587 (print)
The Perseus system is a purposive visual architecture that has been used to recognize the pointing gesture. Recognition of this gesture is an important part of natural human-machine interfaces. Perseus is modularized into six types of components: feature maps, object representations, markers, visual routines, a segmentation map, and a long-term visual memory. This structure not only allows Perseus to use knowledge about the task and environment at every stage of processing to solve the pointing task more efficiently and accurately, but also allows it to be extended to tasks other than recognizing pointing.
ISBN: 0818672587 (print)
Recognition ambiguity, due to noisy measurements and uncertain object models, can be quantified and actively used by an autonomous agent to efficiently gather new data and improve its information about the environment. In this work, an information-based utility measure is used to derive, from a learned classification of shape models, an efficient data collection strategy specifically aimed at increasing classification confidence when recognizing uncertain shapes. Promising simulation results are presented and discussed.
ISBN: 0780342364 (print)
We address the problem of locating a gray-level pattern in a gray-level image. The pattern may have been transformed by an affine transformation, and may have undergone some additional changes. We define a difference function based on comparing each pixel of the pattern with a window in the image, and search efficiently for transformations that minimise the difference function. The search is guaranteed: it will always find the transformation minimising the difference function, and will not be fooled by a local minimum; it is also efficient, in that it does not need to examine every transformation in order to achieve this guarantee. This technique can be applied to object location, motion tracking, optical flow, or block-based motion compensation in video image sequence compression (e.g., MPEG).