ISBN: 9781424439942 (print)
Temporal segmentation of human motion into actions is central to understanding and building computational models of human motion and activity recognition. Several issues contribute to the challenge of temporal segmentation and classification of human motion. These include the large variability in the temporal scale and periodicity of human actions, the complexity of representing articulated motion, and the exponential nature of all possible movement combinations. We provide initial results from investigating two distinct problems: classification of the overall task being performed, and the more difficult problem of classifying individual frames over time into specific actions. We explore first-person sensing through a wearable camera and Inertial Measurement Units (IMUs) for temporally segmenting human motion into actions and performing activity classification in the context of cooking and recipe preparation in a natural environment. We present baseline results for supervised and unsupervised temporal segmentation and for recipe recognition in the CMU-Multimodal activity database (CMU-MMAC).
ISBN: 9781424439942 (print)
We demonstrate that it is possible to automatically find representative example images of a specified object category. These canonical examples are perhaps the kind of images that one would show a child to teach them what, for example, a horse is: images with a large object clearly separated from the background. Given a large collection of images returned by a web search for an object category, our approach proceeds without any user-supplied training data for the category. First, images are ranked according to a category-independent composition model that predicts whether they contain a large, clearly depicted object and outputs an estimated location of that object. Then local features calculated on the proposed object regions are used to eliminate images not distinctive to the category and to cluster images by similarity of object appearance. We present results and a user evaluation on a variety of object categories, demonstrating the effectiveness of the approach.
ISBN: 9781424439942 (print)
The number of digital images that need to be acquired, analyzed, classified, stored and retrieved in medical centers is growing exponentially with the advances in medical imaging technologies. Accordingly, medical image classification and retrieval has become a popular topic in recent years. Despite many projects focusing on this problem, proposed solutions are still far from being sufficiently accurate for real-life implementations. Interpreting medical image classification and retrieval as a multi-class classification task, in this work we investigate the performance of five different feature types in an SVM-based learning framework for classification of human body X-ray images into classes corresponding to body parts. Our comprehensive experiments show that four conventional feature types provide performances comparable to the literature with low per-class accuracies, whereas local binary patterns produce not only very good global accuracy but also good class-specific accuracies compared with the features used in the literature.
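For readers unfamiliar with local binary patterns, the following is a minimal sketch of the basic 8-neighbour operator and the normalised histogram that would typically be fed to a classifier such as an SVM. It is an illustration only: the abstract does not state which LBP variant, neighbourhood, or bit ordering the authors used, and the function names `lbp_code` and `lbp_histogram` are hypothetical.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 patch given as a list of 3 rows.

    Assumption: basic LBP, where a neighbour contributes a 1 bit when its
    value is greater than or equal to the centre pixel.
    """
    center = patch[1][1]
    # Clockwise neighbour order starting at the top-left pixel.
    # The ordering is an arbitrary but fixed convention chosen for this sketch.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a normalised 256-bin histogram of LBP codes over interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    total = sum(hist)
    # Leave the histogram as all zeros if the image has no interior pixels.
    return [h / total for h in hist] if total else hist
```

The resulting 256-dimensional vector is one plausible feature representation of an image; practical systems often compute such histograms per image block and concatenate them.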
ISBN: 9781424439942 (print)
An algorithm is proposed for the 3D modeling of static scenes based solely on the range and intensity data acquired by a Time-of-Flight camera during an arbitrary movement. No additional scene acquisition devices, such as inertial sensors, positioning robots or intensity-based cameras, are incorporated. The current pose is estimated by maximizing the uncentered correlation coefficient between edges detected in the current and a preceding frame, at a minimum frame rate of four fps and an average accuracy of 45 mm. The paper also describes several extensions for robust registration, such as multiresolution hierarchies and the projection Iterative Closest Point algorithm. The basic registration algorithm and its extensions were intensively evaluated against ground truth data to validate the accuracy, robustness and real-time capability.
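The registration criterion named above, the uncentered correlation coefficient, can be illustrated with a deliberately simplified toy. The sketch below searches over integer shifts of a 1-D edge signal rather than over a full camera pose, which is an assumption made purely for brevity; the function names are hypothetical and nothing here reproduces the authors' implementation.

```python
import math


def uncentered_correlation(a, b):
    """Uncentered correlation: sum(a*b) / sqrt(sum(a^2) * sum(b^2)).

    Unlike the Pearson coefficient, the means are not subtracted.
    Returns 0.0 when either signal is all zeros.
    """
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def best_shift(current, previous, max_shift):
    """Return (shift, score) maximising the correlation of current[i + shift]
    with previous[i] over the overlapping samples.

    Toy stand-in for pose estimation: a real system would optimise over
    rotation and translation instead of a single integer shift.
    """
    best = (0, -1.0)
    n = len(previous)
    for s in range(-max_shift, max_shift + 1):
        lo, hi = max(0, -s), min(n, len(current) - s)
        if hi <= lo:
            continue  # no overlap for this shift
        score = uncentered_correlation(current[lo + s:hi + s], previous[lo:hi])
        if score > best[1]:
            best = (s, score)
    return best
```

For example, if `current` is `previous` moved two samples to the right, `best_shift(current, previous, 3)` recovers a shift of 2.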
ISBN: 9781424439942 (print)
We propose an adaptive and effective multimodal peripheral-fovea sensor design for real-time target tracking. This design is inspired by biological vision systems and achieves real-time target detection and recognition with a hyperspectral/range fovea and a panoramic peripheral view. A realistic scene simulation approach is used to evaluate our sensor design and the related data exploitation algorithms before a real sensor is made. The goal is to reduce development time and system cost while achieving optimal results through an iterative process that incorporates simulation, sensing, processing and evaluation. Important issues such as multimodal sensory component integration, region of interest extraction, target tracking, hyperspectral image analysis and target signature identification are discussed.
ISBN: 9781424439942 (print)
In this paper we address the problem of localisation and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity, which relies on the spatiotemporal localisation of characteristic, sparse 'visual words' and 'visual verbs'. Evidence for the spatiotemporal localisation of the activity is accumulated in a probabilistic spatiotemporal voting scheme. The local nature of our voting framework allows us to recover multiple activities that take place in the same scene, as well as activities in the presence of clutter and occlusions. We construct class-specific codebooks using the descriptors in the training set, where we take the spatial co-occurrences of pairs of codewords into account. The positions of the codeword pairs with respect to the object centre, as well as the frame in the training set in which they occur, are subsequently stored in order to create a spatiotemporal model of codeword co-occurrences. During the testing phase, we use Mean Shift mode estimation to spatially segment the subject that performs the activities in every frame, and the Radon transform to extract the most probable hypotheses concerning the temporal segmentation of the activities within the continuous stream.
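The mode-finding step mentioned above can be sketched in one dimension. This is a generic mean shift iteration with a flat (uniform) kernel over scalar votes, given here only to show the idea; the kernel, the bandwidth, the dimensionality of the vote space, and the name `mean_shift_mode` are assumptions, not details taken from the paper.

```python
def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=100):
    """Climb from `start` to a nearby density mode of `points`.

    Each iteration moves the estimate to the mean of all points lying
    within `bandwidth` of the current estimate (flat kernel).
    """
    x = start
    for _ in range(max_iter):
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            break  # no support near x; return the current estimate
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x
```

With votes clustered around 1.0 and 5.0, a start at 4.0 with a bandwidth of 1.5 converges to the cluster near 5.0, while a start at 1.5 converges to the cluster near 1.0. Running the iteration from several starting points is one simple way to recover multiple modes.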
ISBN: 9781424439928 (print)
Images of an object undergoing ego- or camera-motion often appear to be scaled, rotated, and deformed versions of each other. Detecting and matching such distorted patterns to a single sample view of the object requires solving a hard computational problem that has eluded most object matching methods. We propose a linear formulation that simultaneously finds feature point correspondences and global geometrical transformations in a constrained solution space. Because the search space is further reduced based on the lower convex hull property of the formulation, our method scales well with the number of candidate features. Our results on a variety of images and videos demonstrate that our method is accurate, efficient, and robust to local deformation, occlusion, clutter and large geometrical transformations.
ISBN: 9781424439928 (print)
We present a general technique for rectification of a stereo pair acquired by a calibrated omnidirectional camera. Using this technique we formulate a new stereographic rectification method. Our rectification does not map epipolar curves onto lines, as common rectification methods do, but rather maps epipolar curves onto circles. We show that this rectification in a certain sense minimizes the distortion of the original omnidirectional images. We formulate the rectification for multiple images and show that the choice of the optimal projection center of the rectification is under certain circumstances equivalent to the classical problem of spherical minimax location. We demonstrate the behaviour and the quality of the rectification in real experiments with images from fisheye lenses with a 180-degree field of view.
ISBN: 9781424439928 (print)
We present a fast graph cut algorithm for planar graphs. It is based on the graph theoretical work [2] and leads to an efficient method that we apply to shape matching and image segmentation. In contrast to currently used methods in computer vision, the presented approach provides an upper bound for its runtime behavior that is almost linear. In particular, we are able to match two different planar shapes of N points in O(N^2 log N) and segment a given image of N pixels in O(N log N). We present two experimental benchmark studies which demonstrate that the presented method is also faster in practice than previously proposed graph cut methods: on planar shape matching and image segmentation we observe a speed-up of an order of magnitude, depending on resolution.
ISBN: 9781424439928 (print)
Maximum likelihood (ML) estimation is widely used in many computer vision problems involving the estimation of geometric parameters, from conic fitting to bundle adjustment for structure and motion. This paper presents a detailed discussion on the bias of ML estimates derived for these problems. Statistical theory states that although ML estimates attain maximum accuracy in the limit as the sample size goes to infinity, they can have non-negligible bias with small sample sizes. In the case of computer vision problems, the ML optimality holds when regarding the variance in observation errors as the sample size. A natural question is how large the bias will be for a given strength of observation errors. To answer this for a general class of problems, we analyze the mechanism of how the bias of ML estimates emerges, and show that the differential geometric properties of the geometric constraints used in the problems determine the magnitude of the bias. Based on this result, we present a numerical method of computing bias-corrected estimates.
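A standard textbook illustration of small-sample ML bias, which is not taken from the paper and is unrelated to its geometric setting, is the ML estimate of the variance of a Gaussian from n independent samples with unknown mean. The symbols below are introduced only for this example.

```latex
\hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2,
\qquad
\mathbb{E}\!\left[\hat{\sigma}^2_{\mathrm{ML}}\right] = \frac{n-1}{n}\,\sigma^2,
\qquad
\mathrm{bias} = -\frac{\sigma^2}{n}.
```

The bias vanishes as n grows, and in this simple case it can be removed exactly by rescaling the estimate by n/(n-1). The paper addresses the analogous correction for geometric estimation problems, where no such closed form is generally available.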