ISBN: 9781424439928 (print)
The complexity of human detection increases significantly with a growing density of humans populating a scene. This paper presents a Bayesian detection framework using shape and motion cues to obtain a maximum a posteriori (MAP) solution for human configurations consisting of many, possibly occluded pedestrians viewed by a stationary camera. The paper contains two novel contributions for the human detection task: (1) computationally efficient detection based on shape templates using contour integration by means of integral images which are built by oriented string scans; (2) a non-parametric approach using an approximated version of the Shape Context descriptor which generates informative object parts and infers the presence of humans despite occlusions. The outputs of the two detectors are used to generate a spatial configuration of hypothesized human body locations. The configuration is iteratively optimized while taking into account the depth ordering and occlusion status of the hypotheses. The method achieves fast computation times even in complex scenarios with a high density of people. Its validity is demonstrated on a substantial amount of image data using the CAVIAR dataset and our own datasets. Evaluation results and a comparison with the state of the art are presented.
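As a rough illustration of the integral-image idea mentioned in the abstract above, the following sketch precomputes running sums of an edge map along horizontal scan lines, so that the accumulated edge response along any horizontal segment can be read off in constant time. It covers only the horizontal orientation and is not the paper's oriented string scan procedure, which the abstract does not describe. The function names `build_row_integral` and `segment_sum` and the toy edge map are hypothetical.

```python
def build_row_integral(edge_map):
    """Return per-row prefix sums, each with a leading zero entry."""
    integral = []
    for row in edge_map:
        running = [0.0]
        for value in row:
            running.append(running[-1] + value)
        integral.append(running)
    return integral


def segment_sum(integral, row, col_start, col_end):
    """Sum of edge_map[row][col_start:col_end] using two lookups."""
    return integral[row][col_end] - integral[row][col_start]


# Toy edge-strength map (assumed values, for illustration only).
edge_map = [
    [0.0, 1.0, 1.0, 0.0],
    [0.5, 0.0, 2.0, 1.0],
]
integral = build_row_integral(edge_map)
print(segment_sum(integral, 1, 1, 4))
```

The prefix sums cost one pass over the image, after which each template segment costs two lookups regardless of its length.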
ISBN: 9781424439928 (print)
We introduce a novel family of contextual measures of similarity between distributions: the similarity between two distributions q and p is measured in the context of a third distribution u. In our framework, any traditional measure of similarity or dissimilarity has its contextual counterpart. We show that for two important families of divergences (Bregman and Csiszar), the contextual similarity computation reduces to solving a convex optimization problem. We focus on the case of multinomials and explain how to compute the similarity in practice for several well-known measures. These contextual measures are then applied to the image retrieval problem. In this case, the context u is estimated from the neighbors of a query q. One of the main benefits of our approach is that using different contexts, and especially contexts at multiple scales (i.e. broad and narrow contexts), provides different views on the same problem. Combining the different views can improve retrieval accuracy. We show on two very different datasets (one of photographs, the other of document images) that the proposed measures have a relatively small positive impact on macro Average Precision (which measures purely ranking) and a large positive impact on micro Average Precision (which measures both ranking and consistency of the scores across multiple queries).
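The abstract above does not state how the contextual measure is defined. One possible reading, which is an assumption here and not confirmed by the text, is to model q as a mixture w·p + (1 − w)·u and take the weight w in [0, 1] that minimizes a divergence as the similarity of q to p in the context of u. The sketch below uses the Kullback-Leibler divergence on multinomials, for which the objective is convex in w, and solves the one-dimensional problem by ternary search. The names `kl_divergence` and `contextual_similarity` are hypothetical.

```python
import math


def kl_divergence(q, r, eps=1e-12):
    """KL(q || r) for two multinomials given as equal-length lists."""
    return sum(
        qi * math.log((qi + eps) / (ri + eps))
        for qi, ri in zip(q, r)
        if qi > 0
    )


def contextual_similarity(q, p, u, iterations=100):
    """Assumed definition: weight w minimizing KL(q || w*p + (1-w)*u)."""

    def objective(w):
        mix = [w * pi + (1.0 - w) * ui for pi, ui in zip(p, u)]
        return kl_divergence(q, mix)

    # Ternary search is valid because the objective is convex in w.
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


p = [0.7, 0.2, 0.1]
u = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
q = [0.5 * pi + 0.5 * ui for pi, ui in zip(p, u)]
print(contextual_similarity(q, p, u))
```

Under this reading, a query that coincides with p scores close to 1, and a query that coincides with the context u scores close to 0.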
This paper addresses the challenging issue of target tracking and appearance learning in Forward Looking Infrared (FLIR) sequences. Tracking and appearance learning are formulated as a joint state estimation problem w...
ISBN: 9781424439928 (print)
High angular resolution diffusion imaging has become an important magnetic resonance technique for in vivo imaging. Most current research in this field focuses on developing methods for computing the orientation distribution function (ODF), which is the probability distribution function of water molecule diffusion along any angle on the sphere. In this paper, we present a Riemannian framework to carry out computations on an ODF field. The proposed framework does not require that the ODFs be represented by any fixed parameterization, such as a mixture of von Mises-Fisher distributions or a spherical harmonic expansion. Instead, we use a non-parametric representation of the ODF, and exploit the fact that under the square-root re-parameterization, the space of ODFs forms a Riemannian manifold, namely the unit Hilbert sphere. Specifically, we use Riemannian operations to perform various geometric data processing algorithms, such as interpolation, convolution and linear and nonlinear filtering. We illustrate these concepts with numerical experiments on synthetic and real datasets.
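The square-root re-parameterization described above can be sketched for a discretized ODF. The sketch assumes the ODF is sampled as a histogram of non-negative values that sum to 1, so that its element-wise square root is a unit vector and the standard great-circle formulas on the unit sphere apply. The discretization and the function names (`sqrt_map`, `geodesic_distance`, `geodesic_interpolate`) are illustrative assumptions, not the authors' implementation.

```python
import math


def sqrt_map(odf):
    """Map a discrete ODF (sums to 1) to a point on the unit sphere."""
    return [math.sqrt(v) for v in odf]


def geodesic_distance(odf_a, odf_b):
    """Arc length between the square-root representations."""
    inner = sum(a * b for a, b in zip(sqrt_map(odf_a), sqrt_map(odf_b)))
    return math.acos(max(-1.0, min(1.0, inner)))


def geodesic_interpolate(odf_a, odf_b, t):
    """Point at fraction t along the great circle from odf_a to odf_b."""
    psi_a, psi_b = sqrt_map(odf_a), sqrt_map(odf_b)
    theta = geodesic_distance(odf_a, odf_b)
    if theta < 1e-12:
        return list(odf_a)
    weight_a = math.sin((1.0 - t) * theta) / math.sin(theta)
    weight_b = math.sin(t * theta) / math.sin(theta)
    psi_t = [weight_a * a + weight_b * b for a, b in zip(psi_a, psi_b)]
    # Squaring maps the sphere point back to a probability vector.
    return [v * v for v in psi_t]


odf_a = [0.7, 0.2, 0.1]
odf_b = [0.1, 0.3, 0.6]
midpoint = geodesic_interpolate(odf_a, odf_b, 0.5)
print(midpoint, sum(midpoint))
```

Because the interpolated point stays on the unit sphere, squaring it yields values that again sum to 1 without a separate renormalization step.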
ISBN: 9781424439928 (print)
We present a system for fast model-based segmentation and 3D pose estimation of specular objects using appearance-based specular features. We use observed (a) specular reflection and (b) specular flow as cues, which are matched against similar cues generated from a CAD model of the object in various poses. We avoid estimating 3D geometry or depths, which is difficult and unreliable for specular scenes. In the first method, the environment map of the scene is used to generate a database containing synthesized specular reflections of the object for densely sampled 3D poses. This database is compared with captured images of the scene at run time to locate the object and estimate its 3D pose. In the second method, specular flows are generated for dense 3D poses as illumination-invariant features and are matched to the specular flow of the scene. We incorporate several practical heuristics, such as the use of saturated/highlight pixels for fast matching and normal selection to minimize the effects of inter-reflections and cluttered backgrounds. Despite its simplicity, our approach is effective in scenes with multiple specular objects, partial occlusions, inter-reflections, cluttered backgrounds and changes in ambient illumination. Experimental results demonstrate the effectiveness of our method for various synthetic and real objects.
Dominance refers to the level of influence a person has in a conversation. Dominance is an important research area in social psychology, but the problem of its automatic estimation is a very recent topic in the c...
ISBN: 9781424439928 (print)
In the past few years, much work has been done on Simultaneous Localization and Mapping (SLAM). It is now possible to track the trajectory of a moving camera in real time in an unknown environment. However, current SLAM methods are still prone to drift errors, which prevent their use in large-scale applications. In this paper, we propose a solution to reduce those errors a posteriori. Our solution is based on a post-processing algorithm that exploits additional geometric constraints, relative to the environment, to correct both the reconstructed geometry and the camera trajectory. These geometric constraints are obtained from a coarse 3D model of the environment, similar to those provided by GIS databases. First, we propose an original articulated transformation model in order to roughly align the SLAM reconstruction with this 3D model through a non-rigid ICP step. Then, to refine the reconstruction, we introduce a new bundle adjustment cost function that includes, in a single term, the usual 3D point/2D observation consistency constraint as well as the geometric constraints provided by the 3D model. Results on large-scale synthetic and real sequences show that our method successfully improves SLAM reconstructions. In addition, experiments show that the resulting reconstruction is accurate enough to be directly used for global relocalization applications.
ISBN: 9781424439928 (print)
Activity analysis is a basic task in video surveillance and has become an active research area. However, due to the diversity of moving-object categories and their motion patterns, developing robust semantic scene models for activity analysis remains a challenging problem in traffic scenarios. This paper proposes a novel framework to learn semantic scene models. In this framework, the detected moving objects are first classified as pedestrians or vehicles via a co-trained classifier which takes advantage of the multiview information of objects. As a result, the framework can automatically learn motion patterns separately for pedestrians and vehicles. Then, a graph is proposed to learn and cluster the motion patterns. To this end, each trajectory is parameterized and the image is cut into multiple blocks which are taken as the nodes in the graph. Based on the parameters of the trajectories, the primary motion patterns in each node (block) are extracted via a Gaussian Mixture Model (GMM) and supplied to this graph. The graph-cut algorithm is finally employed to group the motion patterns together, and trajectories are clustered to learn semantic scene models. Experimental results and applications to real-world scenes show the validity of our proposed method.
Feature-based methods have recently gained popularity in computer vision and pattern recognition communities, in applications such as object recognition and image retrieval. In this paper, we explore analogous approac...
Extracting endocardium and epicardium from echocardiographic images is a challenging task because of large amounts of noise, signal drop-out, unrelated structures, and unseen wall parts. This paper introduces a new te...