Understanding and interpreting dynamic scenes and activities is a very challenging problem. In this paper, we present a system capable of learning robot tasks from demonstration. Classical robot task programming requires an experienced programmer and a lot of tedious work. In contrast, programming by demonstration is a flexible framework that reduces the complexity of programming robot tasks and allows end-users to demonstrate the tasks instead of writing code. We present our recent steps towards this goal: a system for learning pick-and-place tasks from manual demonstrations. Each demonstrated task is described by an abstract model involving a set of simple tasks, such as what object is moved, where it is moved, and which grasp type was used to move it.
Effective methods for recognising objects or spatio-temporal events can be constructed based on receptive field responses summarised into histograms or other histogram-like image descriptors. This work presents a set of composed histogram features of higher dimensionality, which give significantly better recognition performance compared to the histogram descriptors of lower dimensionality that were used in the original papers by Swain & Ballard (1991) or Schiele & Crowley (2000). The use of histograms of higher dimensionality is made possible by a sparse representation for efficient computation and handling of higher-dimensional histograms. Results of extensive experiments are reported, showing how the performance of histogram-based recognition schemes depends upon different combinations of cues, in terms of Gaussian derivatives or differential invariants applied to either intensity information, chromatic information or both. It is shown that there exist composed higher-dimensional histogram descriptors with much better performance for recognising known objects than previously used histogram features. Experiments on classifying unknown objects into visual categories are also reported.
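The idea of a sparse representation for higher-dimensional histograms can be illustrated with a minimal sketch. It assumes that only non-empty bins are stored, keyed by their tuple of bin indices, and that two histograms are compared by histogram intersection, the measure associated with Swain & Ballard. The function names, the uniform `bin_width` and the choice of comparison measure are illustrative assumptions, not details taken from the abstract.

```python
from collections import Counter


def sparse_histogram(features, bin_width):
    """Build a normalised sparse histogram from feature vectors.

    Each vector is quantised to a tuple of integer bin indices; only bins
    that actually receive samples are stored, so memory grows with the
    number of samples rather than with the number of possible bins.
    """
    counts = Counter()
    for vec in features:
        key = tuple(int(v // bin_width) for v in vec)
        counts[key] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: c / total for key, c in counts.items()}


def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima over shared bins."""
    # Iterate over the smaller histogram; bins missing from either side
    # contribute zero and are skipped.
    if len(h1) > len(h2):
        h1, h2 = h2, h1
    return sum(min(v, h2[k]) for k, v in h1.items() if k in h2)


# Hypothetical 3-D receptive-field responses for two image patches.
model = sparse_histogram([(0.1, 0.2, 0.3), (0.1, 0.2, 0.3), (5.0, 5.0, 5.0)], 1.0)
query = sparse_histogram([(0.1, 0.2, 0.3), (9.0, 9.0, 9.0)], 1.0)
score = intersection(model, query)
```

With this storage scheme, adding more cues only lengthens the bin keys; the cost of a comparison stays proportional to the number of occupied bins.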
Object recognition systems aiming to work in real world settings should use multiple cues in order to achieve robustness. We present a new cue integration scheme, which extends the idea of cue accumulation to discriminative classifiers. We derive and test the scheme for support vector machines (SVMs), but we also show that it is easily extensible to any large margin classifier. In the case of one-class SVMs the scheme can be interpreted as a new class of Mercer kernels for multiple cues. Experimental comparison with a probabilistic accumulation scheme is favorable to our method. Comparison with a voting scheme shows that our method may suffer as the number of object classes increases. Based on these results, we propose a recognition algorithm consisting of a decision tree where decisions at each node are taken using our accumulation scheme. Results obtained using this new algorithm compare very favorably to accumulation (both probabilistic and discriminative) and to the voting scheme.
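One plausible reading of accumulation over discriminative classifiers is a weighted sum of per-cue decision values, followed by an arg-max over classes. The sketch below assumes that each cue's classifier already yields one real-valued score per class (for example a signed margin distance); the weighting, the names and the example scores are hypothetical and are not taken from the abstract.

```python
def accumulate(cue_scores, weights):
    """Sum per-class decision values over cues, one weight per cue.

    cue_scores: list with one dict per cue, mapping class label -> score.
    weights:    list of non-negative floats, one per cue.
    """
    if len(cue_scores) != len(weights):
        raise ValueError("need exactly one weight per cue")
    total = {}
    for scores, w in zip(cue_scores, weights):
        for label, value in scores.items():
            total[label] = total.get(label, 0.0) + w * value
    return total


def classify(cue_scores, weights):
    """Return the class with the largest accumulated score."""
    total = accumulate(cue_scores, weights)
    return max(total, key=total.get)


# Hypothetical decision values from a shape cue and a colour cue.
shape = {"cup": 0.2, "car": -0.1}
colour = {"cup": -0.5, "car": 0.9}
both_cues = classify([shape, colour], [0.5, 0.5])
shape_only = classify([shape, colour], [1.0, 0.0])
```

In this example the shape cue alone weakly prefers "cup", while the accumulated scores prefer "car" because the colour cue is more confident.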
The notion of local features in space-time has recently been proposed to capture and describe local events in video. When computing space-time descriptors, however, the result may strongly depend on the relative motion between the object and the camera. To compensate for this variation, we present a method that automatically adapts the features to the local velocity of the image pattern and, hence, results in a video representation that is stable with respect to different amounts of camera motion. Experimentally we show that the use of velocity adaptation substantially increases the repeatability of interest points as well as the stability of their associated descriptors. Moreover, for an application to human action recognition we demonstrate how velocity-adapted features enable recognition of human actions in situations with unknown camera motion and complex, non-stationary backgrounds.
Local space-time features capture local events in video and can be adapted to the size, the frequency and the velocity of moving patterns. In this paper, we demonstrate how such features can be used for recognizing complex motion patterns. We construct video representations in terms of local space-time features and integrate such representations with SVM classification schemes for recognition. For the purpose of evaluation we introduce a new video database containing 2391 sequences of six human actions performed by 25 people in four different scenarios. The presented results of action recognition justify the proposed method and demonstrate its advantage compared to other related approaches for action recognition.
This paper presents a set of image operators for detecting regions in space-time where interesting events occur. To define such regions of interest, we compute a spatio-temporal second-moment matrix from a spatio-temporal scale-space representation, and diagonalize this matrix locally, using a local Galilean transformation in space-time, optionally combined with a spatial rotation, so as to make the Galilean invariant degrees of freedom explicit. From the Galilean-diagonalized descriptor so obtained, we then formulate different types of space-time interest operators, and illustrate their properties on different types of image sequences.
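The Galilean step of the diagonalization can be sketched as follows. The sketch assumes a symmetric 3x3 second-moment matrix with axes ordered (x, y, t) and the convention x' = x - v t, t' = t, under which the space-time gradient maps to (f_x, f_y, f_t + v_x f_x + v_y f_y). The velocity is chosen so that the mixed space-time entries of the transformed matrix vanish. The function names and the sign convention are assumptions, and the optional spatial rotation is omitted.

```python
def galilean_velocity(mu):
    """Velocity (vx, vy) that cancels the mixed x-t and y-t entries of mu.

    Solves [[mxx, mxy], [mxy, myy]] @ (vx, vy) = -(mxt, myt) by Cramer's rule.
    """
    mxx, mxy, mxt = mu[0][0], mu[0][1], mu[0][2]
    myy, myt = mu[1][1], mu[1][2]
    det = mxx * myy - mxy * mxy
    if abs(det) < 1e-12:
        raise ValueError("spatial block is singular; velocity is not unique")
    vx = (-mxt * myy + mxy * myt) / det
    vy = (-mxx * myt + mxy * mxt) / det
    return vx, vy


def galilean_transform(mu, vx, vy):
    """Return A @ mu @ A^T, where A maps gradients into the moving frame."""
    a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [vx, vy, 1.0]]
    am = [[sum(a[i][k] * mu[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return [[sum(am[i][k] * a[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Second-moment matrix summed from the gradients (1, 0, -2), (0, 1, 1) and
# (1, 1, -1) of a hypothetical pattern translating with velocity (2, -1).
mu = [[2.0, 1.0, -3.0],
      [1.0, 2.0, 0.0],
      [-3.0, 0.0, 6.0]]
vx, vy = galilean_velocity(mu)
mu_prime = galilean_transform(mu, vx, vy)
```

After this step the matrix is block-diagonal: a 2x2 spatial block plus a purely temporal entry. A spatial rotation that diagonalizes the 2x2 block would complete the diagonalization.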
Abstract. In recent work, we presented a framework for many-to-many matching of multi-scale feature hierarchies, in which features and their relations were captured in a vertex-labeled, edge-weighted directed graph. T...
Scale-space feature hierarchies can be conveniently represented as graphs, in which edges are directed from coarser features to finer features. Consequently, feature matching (or view-based object matching) can be for...
A system using visual information to interact with its environment, e.g. a robot, needs to process an enormous amount of data. To ensure that the visual process has tractable complexity visual attention plays an impor...
A robust, iterative approach is introduced for finding the dominant plane in a scene using binocular vision. Neither camera calibration nor stereo correspondence is required. Recently L. Cohen (1996) formalized a framework guaranteeing (local) convergence of iterative two-step methods. In this paper, the framework is adopted, with a global step using tentative matches to estimate the planar projectivity, and a local step attempting to solve the stereo correspondence. A detected point in the first image is matched to an auxiliary point in the second image, on the line joining the transformed first image point, and its closest detected second image point. Convergence is assured, while achieving robustness to both mismatching and non-coplanar points.