ISBN: (print) 3540444122
Recognizing categories of articulated objects in real-world scenarios is a challenging problem for today's vision algorithms. Due to the large appearance changes and intra-class variability of these objects, it is hard to define a model that is both general and discriminative enough to capture the properties of the category. In this work, we propose an approach that aims for a suitable trade-off for this problem. On the one hand, the approach is made more discriminative by explicitly distinguishing typical object shapes. On the other hand, the method generalizes well and requires relatively few training samples thanks to cross-articulation learning. The effectiveness of the approach is shown and compared to previous approaches on two datasets containing pedestrians with different articulations.
We present a novel model for object recognition and detection that follows the widely adopted assumption that objects in images can be represented as a set of loosely coupled parts. In contrast to former models, the presented method can cope with an arbitrary number of object parts. Here, the object parts are modelled by image patches that are extracted at each position and then efficiently stored in a histogram. In addition to the patch appearance, the positions of the extracted patches are considered, which provides a significant increase in recognition performance. Additionally, a new and efficient histogram comparison method that takes inter-bin similarities into account is proposed. The presented method is evaluated on the task of radiograph recognition, where it achieves the best result published so far. Furthermore, it yields very competitive results on the commonly used Caltech object detection tasks.
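To illustrate what comparing histograms with inter-bin similarities can mean, here is a minimal sketch of the classical quadratic-form distance. It is a stand-in chosen for illustration, not the comparison method proposed in the abstract, which does not spell out its formula. The function names and the Gaussian bin-similarity matrix are assumptions.

```python
import math


def gaussian_bin_similarity(n_bins, width=1.0):
    """Hypothetical similarity matrix: bins with close indices count as similar."""
    return [
        [math.exp(-((i - j) ** 2) / (2.0 * width ** 2)) for j in range(n_bins)]
        for i in range(n_bins)
    ]


def quadratic_form_distance(h, g, similarity):
    """Quadratic-form distance sqrt((h - g)^T A (h - g)) between two histograms.

    With A equal to the identity matrix this reduces to the Euclidean
    distance, i.e. a plain bin-by-bin comparison.
    """
    d = [a - b for a, b in zip(h, g)]
    n = len(d)
    total = sum(d[i] * similarity[i][j] * d[j] for i in range(n) for j in range(n))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))


if __name__ == "__main__":
    A = gaussian_bin_similarity(4)
    h = [1.0, 0.0, 0.0, 0.0]
    near = [0.0, 1.0, 0.0, 0.0]  # mass moved to a neighbouring bin
    far = [0.0, 0.0, 0.0, 1.0]   # mass moved to a distant bin
    print(quadratic_form_distance(h, near, A))
    print(quadratic_form_distance(h, far, A))
```

Under this assumed similarity matrix, moving histogram mass to a neighbouring bin is penalized less than moving it to a distant bin, which a purely bin-by-bin distance cannot express.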
Patch-based approaches have recently shown promising results for the recognition of visual object classes. This paper investigates the role of different properties of patches. In particular, we explore how the size, location and nature of interest points influence recognition performance. Different feature types are also evaluated. For our experiments, we use three common databases at different levels of difficulty to make our statements more general. The insights given in the conclusion can serve as guidelines for developers of algorithms that use image patches.
We focus on learning graphical models of object classes from arbitrary instances of objects. Large intra-class variability of object appearance is dealt with by combining statistical local part detection with relations between object parts in a probabilistic network. Inference for view-based object recognition is done either with A*-search employing a novel and dedicated admissible heuristic, or with Belief Propagation, depending on the network size. Our approach is applicable to arbitrary object classes. We validate this for "faces" and for "articulated humans". In the former case, our approach shows performance equal or superior to dedicated face recognition approaches. In the latter case, widely different poses and object appearances in front of cluttered backgrounds can be recognized.
A method for exploiting the information in low-level image segmentations for the purpose of object recognition is presented. The key idea is to use a whole ensemble of segmentations per image, computed on different random samples of image sites. Along the boundaries of those segmentations that are stable under the sampling process, we extract strings of vectors that contain local image descriptors such as shape, texture and intensities. Pairs of such strings are aligned, and based on the alignment scores a mixture model is trained which divides the segments in an image into foreground and background. Given such candidate foreground segments, we show that it is possible to build a state-of-the-art object recognition system that exhibits excellent performance on a standard benchmark database. This result shows that despite the inherent problems of low-level image segmentation in poor data conditions, segmentation can indeed be a valuable tool for object recognition in real-world images.
We present a new method for planning the optimal next view for a probabilistic visual object tracking task. Our method uses a variable number of cameras, can plan an action sequence several time steps into the future, and allows for real-time usage due to a computation time which is linear both in the number of cameras and in the number of time steps. The algorithm can also handle object loss in one, several or all cameras, interdependencies in the cameras' information contributions, and variable action costs. We evaluate our method by comparing it to previous approaches on a prerecorded sequence of real-world images.
We present an approach to non-rigid object tracking designed to handle textured objects in crowded scenes captured by non-static cameras. For this purpose, groups of low-level features are combined into a model describing both the shape and the appearance of the object. This results in remarkable robustness to severe partial occlusions, since overlapping objects are unlikely to be indistinguishable in appearance, configuration and velocity all at the same time. The model is learnt incrementally and adapts to varying illumination conditions and to changes in target shape and appearance, and is thus applicable to any kind of object. Results on real-world sequences demonstrate the performance of the proposed tracker. The algorithm is implemented with the aim of achieving near real-time performance.
Unlike many gesture-based human-robot interaction applications, which focus on the recognition of interactional or pointing gestures, this paper proposes a vision-based method for manipulative gesture recognition that aims to achieve natural, proactive, and non-intrusive interaction between humans and robots. The main contributions of the paper are an object-centered scheme for the segmentation and characterization of hand trajectory information, the use of particle filtering methods for action primitive spotting, and the tight coupling of bottom-up and top-down processing that realizes a task-driven attention filter for low-level recognition steps. In contrast to purely trajectory-based techniques, the presented approach is called object-oriented with respect to two different aspects: it is object-centered in terms of trajectory features that are defined relative to an object, and it uses object-specific models for action primitives. The system has a two-layer structure that recognizes both the HMM-modeled manipulative primitives and the underlying task characterized by the manipulative primitive sequence. The proposed top-down and bottom-up mechanism between the two layers decreases the image processing load and improves the recognition rate.
We present a model-based method for hand posture recognition in monocular image sequences that measures joint angles, viewing angle, and position in space. Visual markers in the form of a colored cotton glove are used to extract descriptive and stable 2D features. Searching a synthetically generated database of 2.6 million entries, each consisting of 3D hand posture parameters and the corresponding 2D features, yields several candidate postures per frame. This ambiguity is resolved by exploiting temporal continuity between successive frames. The method is robust to noise, can be used from any viewing angle, and places no constraints on the hand posture. Self-occlusion of any number of markers is handled. It requires no initialization and retrospectively corrects posture errors when accordant information becomes available. Besides a qualitative evaluation on real images, a quantitative performance measurement using a large amount of synthetic input data featuring various degrees of noise shows the effectiveness of the approach.
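One common way to resolve per-frame ambiguity through temporal continuity is a Viterbi-style dynamic program that picks one candidate per frame so that the total frame-to-frame change is minimal. The sketch below shows that generic idea only. The abstract does not state which procedure the authors use, and the function name, the Euclidean cost and the toy data are assumptions.

```python
import math


def smoothest_path(candidates_per_frame):
    """Choose one candidate per frame, minimizing summed frame-to-frame change.

    candidates_per_frame: list of frames, each a non-empty list of posture
    parameter vectors (tuples of floats; a hypothetical representation).
    Returns the list of chosen candidate indices, one per frame.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # cost[k]: lowest accumulated cost of any path ending in candidate k.
    cost = [0.0] * len(candidates_per_frame[0])
    back = []  # back[t][k]: best predecessor of candidate k in frame t + 1
    for prev, cur in zip(candidates_per_frame, candidates_per_frame[1:]):
        new_cost, pointers = [], []
        for c in cur:
            options = [cost[i] + dist(p, c) for i, p in enumerate(prev)]
            best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final candidate. Because the whole path
    # is re-evaluated, an early choice can change once later frames arrive.
    k = min(range(len(cost)), key=cost.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy data: one-dimensional "postures", two candidates per frame.
    frames = [[(0.0,), (10.0,)], [(9.0,), (1.0,)], [(2.0,), (11.0,)]]
    print(smoothest_path(frames))
```

In this toy example the selected sequence follows the candidates near 0, 1 and 2 rather than jumping between distant values.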
This paper presents a practical system for vision-based traffic scene analysis from a moving vehicle, based on a cognitive feedback loop which integrates real-time geometry estimation with appearance-based object detection. We demonstrate how those two components can benefit from each other's continuous input and how the transferred knowledge can be used to improve scene analysis. Thus, scene interpretation is not left as a matter of logical reasoning, but is instead addressed by repeated interaction and consistency checks between different levels and modes of visual processing. As our results show, the proposed tight integration significantly increases recognition performance as well as overall system robustness. In addition, it enables novel capabilities such as the accurate 3D estimation of object locations and orientations and their temporal integration in a world coordinate frame. The system is evaluated on a challenging real-world car detection task in an urban scenario.