A novel approach for estimating articulated body posture and motion from monocular video sequences is proposed. Human pose is defined as the instantaneous two-dimensional configuration (i.e., the projection onto the image plane) of a single articulated body in terms of the position of a predetermined set of joints. First, statistical segmentation of the human bodies from the background is performed and low-level visual features are found given the segmented body shape. The goal is to be able to map these generally low-level visual features to body configurations. The system estimates different mappings, each one associated with a specific cluster in the visual feature space. Given a set of body motion sequences for training, unsupervised clustering is obtained via the Expectation Maximization algorithm. For each of the clusters, a function is estimated to build the mapping from low-level features to 2D pose. Given new visual features, a mapping from each cluster is performed to yield a set of possible poses. From this set, the system selects the most likely pose given the learned probability distribution and the visual feature similarity between hypothesis and input. Performance of the proposed approach is characterized using real and artificially generated body postures, showing promising results.
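The final selection step described above can be illustrated with a minimal sketch. It assumes the per-cluster mappings are already available as plain functions and that a forward model can turn a pose hypothesis back into features; the names `select_pose`, `mappings`, `render` and `distance` are hypothetical, the toy forward model is invented for illustration, and the EM clustering and the learned probability distribution are left out.

```python
def select_pose(features, mappings, render, distance):
    """Generate one pose hypothesis per cluster mapping and keep the one
    whose re-rendered features are closest to the observed features.

    Assumption: only feature similarity is used here; the learned
    probability distribution mentioned in the abstract is omitted.
    """
    hypotheses = [mapping(features) for mapping in mappings]
    return min(hypotheses, key=lambda pose: distance(render(pose), features))


# Toy one-dimensional stand-ins (hypothetical, not from the paper).
def render(pose):
    # Piecewise-linear forward model from pose to feature.
    return 2.0 * pose if pose < 5.0 else pose + 5.0


mappings = [
    lambda x: x / 2.0,  # inverse of the first linear piece
    lambda x: x - 5.0,  # inverse of the second linear piece
]


def distance(a, b):
    return abs(a - b)


small = select_pose(4.0, mappings, render, distance)   # first mapping wins
large = select_pose(14.0, mappings, render, distance)  # second mapping wins
```

Each mapping is only reliable in its own region of feature space, so comparing the re-rendered features with the input is what resolves which hypothesis to trust.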
ISBN: 0769506623 (print)
This paper describes a new method for tracking a human body in 3D motion by using constraints imposed on the body by the scene. An image-based approach to tracking relies exclusively on a geometrical model of the human body. Since the model usually has a large number of degrees of freedom (DOF), the chance of being corrupted by noise increases during the tracking process, and tracking may become an ill-posed problem. To cope with this problem, we exploit the fact that a human body cannot move freely and is usually subject to constraints from the scene. The new method uses constraints imposed by the scene on the position, velocity and acceleration of a part of the body. These constraints can reduce the DOF of the model. This reduction guarantees that the tracking problem is well-posed and prevents tracking errors caused by noise. Experiments with real image sequences demonstrate precise tracking of the body.
The engineering of computer vision systems that meet application-specific computational and accuracy requirements is crucial to the deployment of real-life computer vision systems. This paper illustrates how past work on a systematic engineering methodology for vision systems performance characterization can be used to develop a real-time people detection and zooming system to meet given application requirements. We illustrate that by judiciously choosing the system modules and performing a careful analysis of the influence of various tuning parameters on the system it is possible to perform proper statistical inference, automatically set control parameters and quantify the limits of a dual-camera real-time video surveillance system. The goal of the system is to continuously provide a high-resolution zoomed-in image of a person's head at any location in the monitored area. Video from an omni-directional camera is processed to detect people and to precisely control a high-resolution foveal camera, which has pan, tilt and zoom capabilities. The pan and tilt parameters of the foveal camera and their uncertainties are shown to be functions of the underlying geometry, lighting conditions, background color/contrast, and the relative position of the person with respect to both cameras, as well as sensor noise and calibration errors. The uncertainty in the estimates is used to adaptively estimate the zoom parameter that guarantees, with a user-specified probability, alpha, that the detected person's face is contained and zoomed within the image.
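How uncertainty can drive the zoom setting is sketched below under strong simplifying assumptions that are not taken from the paper: the pan and tilt errors are modeled as independent zero-mean Gaussians with known standard deviations, the face is given as an angular half-width, and the field of view is square. The function name `required_half_fov` and all of its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def required_half_fov(sigma_pan, sigma_tilt, face_half_angle, alpha):
    """Smallest half field of view (same angular unit as the inputs) that
    contains the whole face with probability at least alpha.

    Assumptions (illustrative only): independent zero-mean Gaussian
    pointing errors in pan and tilt, and a square field of view.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Independence: each axis must succeed with probability sqrt(alpha).
    per_axis = math.sqrt(alpha)
    # Two-sided Gaussian quantile: P(|error| <= z * sigma) = per_axis.
    z = NormalDist().inv_cdf((1.0 + per_axis) / 2.0)
    half_pan = z * sigma_pan + face_half_angle
    half_tilt = z * sigma_tilt + face_half_angle
    return max(half_pan, half_tilt)


# A larger alpha or a larger uncertainty forces a wider view (less zoom).
loose = required_half_fov(0.5, 0.3, 1.0, 0.90)
strict = required_half_fov(0.5, 0.3, 1.0, 0.99)
```

The zoom is then whatever setting yields this field of view; with no pointing uncertainty the view can shrink to the face itself.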
ISBN: 0769506623 (print)
We use cluster analysis as a unifying principle for problems from low, middle and high level vision. The clustering problem is viewed as graph partitioning, where nodes represent data elements and the weights of the edges represent pairwise similarities. Our algorithm generates samples of cuts in this graph, by using David Karger's contraction algorithm, and computes an "average" cut which provides the basis for our solution to the clustering problem. The stochastic nature of our method makes it robust against noise, including accidental edges and small spurious clusters. The complexity of our algorithm is very low: O(N log^2 N) for N objects and a fixed accuracy level. Without additional computational cost, our algorithm provides a hierarchy of nested partitions. We demonstrate the superiority of our method for image segmentation on a few real color images. Our second application includes the concatenation of edges in a cluttered scene (perceptual grouping), where we show that the same clustering algorithm achieves as good a grouping as, if not better than, more specialized methods.
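The idea of sampling cuts by random contraction can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it repeatedly contracts a weighted graph down to k components, picking edges with probability proportional to their weight, and reports how often each pair of nodes ends up together. The function names and the toy graph are hypothetical, and the sketch makes no attempt to reach the complexity stated in the abstract.

```python
import random


def contract(n, edges, k, rng):
    """One run of Karger-style contraction: merge along randomly chosen
    edges (probability proportional to weight) until k components remain.
    Returns a component label for every node."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    while components > k:
        # Only edges joining two different components can be contracted.
        live = [(u, v, w) for u, v, w in edges if find(u) != find(v)]
        if not live:
            break  # disconnected graph: nothing left to contract
        u, v, _ = rng.choices(live, weights=[w for _, _, w in live])[0]
        parent[find(u)] = find(v)
        components -= 1
    return [find(i) for i in range(n)]


def co_membership(n, edges, k=2, samples=200, seed=0):
    """Fraction of sampled partitions in which nodes i and j share a component."""
    rng = random.Random(seed)
    counts = [[0] * n for _ in range(n)]
    for _ in range(samples):
        labels = contract(n, edges, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / samples for c in row] for row in counts]


# Toy graph: two tightly connected triangles joined by one weak edge.
toy_edges = [(0, 1, 10.0), (1, 2, 10.0), (0, 2, 10.0),
             (3, 4, 10.0), (4, 5, 10.0), (3, 5, 10.0),
             (2, 3, 0.1)]
p = co_membership(6, toy_edges)
```

Pairs inside the same triangle are almost always kept together, while pairs from different triangles rarely are; thresholding such pairwise frequencies is one simple way to turn many sampled cuts into a single partition.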
ISBN: 0769506623 (print)
Combining learning with vision techniques in interactive image retrieval has been an active research topic during the past few years. However, existing learning techniques either are based on heuristics or fail to analyze the working conditions. Furthermore, there is almost no in-depth study on how to effectively learn from the users when there are multiple visual features in the retrieval system. To address these limitations, in this paper we present a vigorous optimization formulation of the learning process and solve the problem in a principled way. By using Lagrange multipliers, we have derived explicit solutions, which are both optimal and fast to compute. Extensive comparisons against state-of-the-art techniques have been performed. Experiments were carried out on a large heterogeneous image collection consisting of 17,000 images. Retrieval performance was tested under a wide range of conditions. Various evaluation criteria, including precision-recall curve and rank measure, have demonstrated the effectiveness and robustness of the proposed technique.
In this paper we present a geometric theory for reconstruction of surface models from sparse 3D data captured from N camera views which are consistent with the data visibility. Sparse 3D measurements of real scenes are readily estimated from image sequences using structure-from-motion techniques. Currently there is no general method for reconstruction of 3D models of arbitrary scenes from sparse data. We introduce an algorithm for recursive integration of sparse 3D structure to obtain a consistent model. This algorithm is shown to converge to the real scene structure as the number of views increases and to have a computational cost which is linear in the number of views. Results are presented for real and synthetic image sequences which demonstrate correct reconstruction for scenes containing significant occlusions.
The measurement of object reflectance from color images is carried out. Illumination and geometrical invariant properties are derived from a physical reflectance model based on the Kubelka-Munk theory. Invariance, discriminative power and localization accuracy of the color invariants are extensively studied. Experiments show the different invariants to be highly discriminative while maintaining invariance properties.
ISBN: 0780363493 (print)
This paper reports a visual system that recognizes 3D shapes of high-speed moving objects. In this system, an active camera fixates a point on the moving object at the center of the image and tracks it. During this tracking, object points other than the fixation point move in the image field depending on their depths relative to the fixated point. Based on this fact, we are able to reconstruct the 3D object shape from the tracking images. We developed a simple algorithm to realize this reconstruction for real-time processing. We utilize the relation between the spatial and temporal changes of the image with respect to the object depths. For our purpose, it is important to obtain the object shape quickly, even if it is only a rough one. We constructed a system that does not need complex and difficult image processing.
In this paper, a novel scheme for extracting multiple illuminant directions from an image of a Lambertian sphere of known size is proposed. The illuminant direction detection process is based on the concept of critical points introduced in the paper. We show that the illuminant directions have a close relationship to those critical points and that, by identifying as many of those critical points as possible, illuminant directions may be recovered if certain conditions are satisfied. Our preliminary experimental results show that illumination information can be obtained accurately.
We will demonstrate our CVPR2000 paper 'Measurement of Color Invariants' for the cases of image retrieval based on query by example and for color image segmentation. Both are of importance in content-based access of image and video data. We demonstrate the usefulness of the proposed color invariants in query-by-example image retrieval systems. We show that an image retrieval query should include the type of invariance expected in the result. We demonstrate such queries by using the 'imageSurf' retrieval system. Segmentation of images based on the proposed color invariants is demonstrated by the 'PicToVision' system. The system provides image processing functionality through the world wide web, and is publicly accessible at ***/research/isis/***.