Both example-based and model-based approaches for classifying contour shapes can encounter difficulties when dealing with classes that have large nonlinear variability, especially when the variability is structural or due to articulation. This paper proposes a part-based approach to address this problem. Bayesian classification is performed within a three-level framework, which consists of models for contour segments, for classes, and for the entire database of training examples. The class model enables different parts of different exemplars of a class to contribute to the recognition of an input shape. The method is robust to occlusion and is invariant to planar rotation, translation, and scaling. Furthermore, the method is completely automated. It achieves 98% classification accuracy on a large database with many classes.
A novel framework for anomaly detection in crowded scenes is presented. Three properties are identified as important for the design of a localized video representation suitable for anomaly detection in such scenes: (1) joint modeling of the appearance and dynamics of the scene, and the ability to detect (2) temporal and (3) spatial abnormalities. The model for normal crowd behavior is based on mixtures of dynamic textures, and outliers under this model are labeled as anomalies. Temporal anomalies are equated to low-probability events, while spatial anomalies are handled using discriminant saliency. An experimental evaluation is conducted with a new dataset of crowded scenes, composed of 100 video sequences and five well-defined abnormality categories. The proposed representation is shown to outperform various state-of-the-art anomaly detection techniques.
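The idea of labeling low-probability observations as temporal anomalies can be illustrated with a minimal sketch. It does not implement mixtures of dynamic textures: a single multivariate Gaussian stands in for the model of normal behavior, the feature vectors are synthetic, and the function names (`fit_gaussian`, `log_likelihood`, `flag_anomalies`) and the 1% quantile threshold are assumptions made for illustration only.

```python
import numpy as np

def fit_gaussian(X):
    """Fit one multivariate Gaussian to the rows of X (stand-in for the normality model)."""
    mean = X.mean(axis=0)
    # A small ridge keeps the covariance invertible.
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return mean, cov

def log_likelihood(X, mean, cov):
    """Per-row Gaussian log-density."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2 * np.pi))

def flag_anomalies(train, test, quantile=0.01):
    """Flag test rows whose log-likelihood falls below a low quantile of the training scores."""
    mean, cov = fit_gaussian(train)
    threshold = np.quantile(log_likelihood(train, mean, cov), quantile)
    return log_likelihood(test, mean, cov) < threshold

# Synthetic "normal" features (assumption: 2-D, standard normal).
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(500, 2))
queries = np.array([[0.1, -0.2], [8.0, 8.0]])
flags = flag_anomalies(normal, queries)
```

Here the first query lies near the bulk of the training data and the second lies far from it, so only the second should be flagged. A richer density model would change the scoring function but not the thresholding step.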
We propose a data-driven, hierarchical approach for the analysis of human actions in visual scenes. In particular, we focus on the task of in-house assisted living. In such scenarios, the environment and the setting may vary considerably, which limits the performance of methods with pre-trained models. Therefore, our model of normality is established in a completely unsupervised manner and is updated automatically for scene-specific adaptation. The hierarchical representation on both an appearance and an action level paves the way for semantic interpretation. Furthermore, we show that the model is suitable for coupled tracking and abnormality detection on different hierarchical stages. As the experiments show, our simple yet effective approach yields stable results, e.g. the detection of a fall, without any human interaction.
We introduce a novel open-source framework for analyzing and exploring point cloud datasets and algorithms. This is done by integrating the Point Cloud Library (PCL) within ParaView, a parallel scientific visualization tool. In particular, we demonstrate that by wrapping PCL algorithms as VTK filters, we can leverage PCL's functionality in an interactive, easy-to-use manner within ParaView. The proposed approach enables rapid algorithm development in a coherent framework without the need to write custom visualization code. We illustrate the advantages of the framework with usage examples such as segmentation, data annotation and Python integration. Additionally, we build upon ParaView's inherent parallelization capabilities and present two strong scaling experiments that demonstrate near-linear scaling performance gains in a multi-processor setup.
Summary form only given. Learning hierarchical representations of object structure in a bottom-up manner faces several difficult issues. First, we are dealing with a very large number of potential feature aggregations. Furthermore, the set of features the algorithm learns at each layer directly influences the expressiveness of the compositional layers that work on top of them. However, we cannot ensure the usefulness of a particular local feature for object class representation based solely on the local statistics. This can only be done when more global, object-wise information is taken into account. We build on the hierarchical compositional approach (Fidler and Leonardis, 2007) that learns a hierarchy of contour compositions of increasing complexity and specificity. Each composition models spatial relations between its constituent parts.
Direct use of the hand as an input device is an attractive method for providing natural human-computer interaction (HCI). Currently, the only technology that satisfies the advanced requirements of hand-based input for HCI is glove-based sensing. This technology, however, has several drawbacks: it hinders the ease and naturalness with which the user can interact with the computer-controlled environment, and it requires long calibration and setup procedures. Computer vision has the potential to provide much more natural, non-contact solutions. As a result, there have been considerable research efforts to use the hand as an input device for HCI. A very challenging problem in this context, which is the focus of this review, is recovering the 3D pose of the hand and the fingers as glove-based devices do. This paper presents a brief literature review on full degree-of-freedom (DOF) hand motion estimation methods.
Conventional video cameras have limited fields of view that make them restrictive in a variety of vision applications. There are several ways to enhance the field of view of an imaging system. However, the entire imaging system must have a single effective viewpoint to enable the generation of pure perspective images from a sensed image. A new camera with a hemispherical field of view is presented. Two such cameras can be placed back-to-back, without violating the single viewpoint constraint, to arrive at a truly omnidirectional sensor. Results are presented on the software generation of pure perspective images from an omnidirectional image, given any user-selected viewing direction and magnification. The paper concludes with a discussion on the spatial resolution of the proposed camera.
We study the problem of pointwise motion tracking in echocardiographic images. We show that decorrelation between tissue motion and intensity variation is inevitable for certain kinds of tissue motion and decorrelation compensation is an ill-posed inverse problem if the decorrelation is beyond a certain correlation threshold. We compare the performance of different features using simulations and phantom examples. We find a threshold value of correlation coefficients below which the B-Mode signal works better than the radio frequency (RF) signal in the analysis of large deformation. We also demonstrate that the introduction of a quantitative reliability measure helps to improve the robustness of displacement estimation.
The virtual white cane is a range sensing device based on active triangulation that can measure distances at a rate of 15 measurements/second. A blind person can use this device for sensing the environment, pointing it as if it were a flashlight. Besides measuring distances, this device can detect surface discontinuities, such as the foot of a wall, a step, or a drop-off. This is achieved by analyzing the range data collected as the user swings the device around, tracking planar patches and finding discontinuities. In this paper, we briefly describe the range sensing device and present an online surface tracking algorithm based on a Jump-Markov model. We show experimental results proving the robustness of the tracking system in real-world conditions.
Although motion analysis has been extensively investigated in the literature and a wide variety of tracking algorithms have been proposed, the problem of tracking objects using the Dynamic Vision Sensor (DVS) requires a slightly different approach. Dynamic Vision Sensors are biologically inspired vision systems that asynchronously generate events upon relative light intensity changes. Unlike conventional vision systems, the output of such a sensor is not an image (frame) but an address-event stream. Therefore, most conventional tracking algorithms are not appropriate for DVS data processing. In this paper, we introduce an algorithm for spatiotemporal tracking that is suitable for the Dynamic Vision Sensor. In particular, we address the problem of tracking multiple persons under heavy occlusion. We investigate the possibility of applying Gaussian Mixture Models to detecting, describing, and tracking objects. Preliminary results show that our approach can successfully track people even when their trajectories intersect.
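As a rough illustration of how a Gaussian Mixture Model can describe objects in an event stream, the sketch below fits a two-component mixture to synthetic event addresses with a plain EM loop. This is not the authors' algorithm: the event coordinates are simulated, timestamps and the tracking step are omitted, and the function name `fit_gmm`, the initialization scheme, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def fit_gmm(points, k, iters=50):
    """Fit a k-component full-covariance GMM to 2-D points with plain EM."""
    n, d = points.shape
    # Deterministic init: pick k points evenly spaced along the sorted x-axis.
    order = np.argsort(points[:, 0])
    means = points[order[np.linspace(0, n - 1, k).astype(int)]].astype(float)
    covs = np.array([np.cov(points, rowvar=False) + 1e-3 * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: log responsibility of each component for each event.
        log_resp = np.empty((n, k))
        for j in range(k):
            diff = points - means[j]
            inv = np.linalg.inv(covs[j])
            maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
            _, logdet = np.linalg.slogdet(covs[j])
            log_resp[:, j] = np.log(weights[j]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ points) / nk[:, None]
        for j in range(k):
            diff = points - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + 1e-3 * np.eye(d)
    return weights, means, covs

# Synthetic event addresses (x, y) from two moving objects (assumed positions).
rng = np.random.default_rng(0)
events = np.vstack([
    rng.normal([10, 10], 1.5, size=(200, 2)),
    rng.normal([50, 40], 1.5, size=(200, 2)),
])
weights, means, covs = fit_gmm(events, k=2)
```

Each fitted component gives a position (mean) and extent (covariance) for one cluster of events. A tracker could re-fit or update these parameters on successive event batches, but that step is outside this sketch.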