We present a novel approach to automatically annotating broadcast video. To manage the enormous variety of objects, events and scenes in video problem domains such as news video, we couple generic image analysis with a semantic database, WordNet, containing huge amounts of real-world information. Object and event recognition are performed by searching WordNet for concepts jointly supported by image evidence and topic context derived from the video transcript. No object-specific or event-specific training is required, and only a few object models and detection algorithms are needed to label much of the significant content of news video. The hierarchical structure of WordNet yields hierarchical recognition, dynamically tailored to the level of supporting image evidence. The potential of the approach is demonstrated by analyzing a wide variety of scenes in news video.
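The idea of recognition "tailored to the level of supporting image evidence" can be illustrated with a small sketch. It is not the authors' method: the `HYPERNYM` table is a hypothetical stand-in for WordNet, and the pooling rule and `threshold` parameter are assumptions made only for illustration. Topic context from the transcript is left out.

```python
# Toy hypernym links (child -> parent); a hypothetical stand-in for WordNet.
HYPERNYM = {
    "sedan": "car",
    "truck": "motor_vehicle",
    "car": "motor_vehicle",
    "motor_vehicle": "vehicle",
    "vehicle": "entity",
}


def ancestors(concept):
    """Return the chain from concept up to the root, most specific first."""
    chain = [concept]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def label(leaf_scores, threshold):
    """Pick the most specific concept whose pooled evidence reaches threshold.

    leaf_scores maps leaf concepts to image-evidence scores (assumed inputs).
    A concept's pooled score is the sum of the scores of the leaves below it.
    Returns None when even the root lacks enough support.
    """
    pooled = {}
    for leaf, score in leaf_scores.items():
        for node in ancestors(leaf):
            pooled[node] = pooled.get(node, 0.0) + score
    best_leaf = max(leaf_scores, key=leaf_scores.get)
    # Walk upward from the best leaf until the evidence is sufficient.
    for node in ancestors(best_leaf):
        if pooled[node] >= threshold:
            return node
    return None
```

With ambiguous evidence split between "sedan" and "truck", the sketch backs off to their shared ancestor "motor_vehicle"; with strong evidence for "sedan" alone it keeps the specific label.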
We present a Bayesian approach to image-based visual hull reconstruction. The 3D (three-dimensional) shape of an object of a known class is represented by sets of silhouette views simultaneously observed from multiple cameras. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. In our representation, 3D information is implicit in the joint observations of multiple contours from known viewpoints. We model the prior density using a probabilistic principal components analysis-based technique and estimate a maximum a posteriori reconstruction of multi-view contours. The proposed method is applied to a dataset of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown.
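As a rough illustration of how such a prior can suppress segmentation noise, consider a simplified linear-Gaussian model. The symbols below are assumptions introduced here, not notation from the paper: $x$ is the stacked multi-view contour vector, $y$ its noisy observation, $W$ and $\sigma^2$ the probabilistic PCA parameters, and $\sigma_n^2$ an assumed isotropic observation noise variance.

```latex
% Assumed model: PPCA prior on contours, additive Gaussian observation noise.
x \sim \mathcal{N}(\mu,\, C), \qquad C = W W^{\top} + \sigma^{2} I, \qquad
y = x + n, \quad n \sim \mathcal{N}(0,\, \sigma_n^{2} I)

% The MAP estimate then shrinks the observation toward the class mean:
\hat{x}_{\mathrm{MAP}} = \mu + C \left( C + \sigma_n^{2} I \right)^{-1} (y - \mu)
```

Under these assumptions, deviations from the mean along directions that the class prior considers unlikely are attenuated most strongly, which is one way to read the reduced effect of silhouette segmentation errors.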
This paper addresses the problem of computing visual hulls from image contours. We propose a new hybrid approach, which overcomes the precision-complexity trade-off inherent to voxel-based approaches by taking advantage of surface-based approaches. To this end, we introduce a space discretization that does not rely on a regular grid, where most cells are ineffective, but rather on an irregular grid whose sample points lie on the surface of the visual hull. Such a grid is composed of tetrahedral cells obtained by applying a Delaunay triangulation to the sample points. These cells are then carved according to image silhouette information. The proposed approach keeps the robustness of volumetric approaches while drastically improving their precision and reducing their time and space complexities. It thus allows modeling of objects with complex geometry, and it also makes real-time computation feasible for precise models. Preliminary results with synthetic and real data are presented.
In this paper we compare the performance of interest point descriptors. Many different descriptors have been proposed in the literature. However, it is unclear which descriptors are more appropriate and how their performance depends on the interest point detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the point detector. Our evaluation uses detection rate with respect to false positive rate as its criterion and is carried out for different image transformations. We compare SIFT descriptors (Lowe, 1999), steerable filters (Freeman and Adelson, 1991), differential invariants (Koenderink and van Doorn, 1987), complex filters (Schaffalitzky and Zisserman, 2002), moment invariants (Van Gool et al., 1996) and cross-correlation for different types of interest points. In this evaluation, we observe that the ranking of the descriptors does not depend on the point detector and that SIFT descriptors perform best. Steerable filters come second; they can be considered a good choice given their low dimensionality.
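The evaluation criterion can be sketched as follows. This is a minimal illustration, assuming that a candidate match is accepted when its descriptor distance falls below a threshold and that ground-truth correctness is known for each candidate; the function name `roc_points` and its inputs are hypothetical, and the paper's exact definitions of the two rates may differ.

```python
def roc_points(distances, is_correct, thresholds):
    """Return (false_positive_rate, detection_rate) pairs, one per threshold.

    distances[i] is the descriptor distance of candidate match i;
    is_correct[i] says whether that candidate is a true correspondence.
    A candidate is accepted when its distance is <= the threshold.
    Assumes at least one correct and one incorrect candidate.
    """
    n_pos = sum(is_correct)
    n_neg = len(is_correct) - n_pos
    points = []
    for t in thresholds:
        # Accepted correct matches (detections) and accepted wrong matches.
        tp = sum(1 for d, ok in zip(distances, is_correct) if ok and d <= t)
        fp = sum(1 for d, ok in zip(distances, is_correct) if not ok and d <= t)
        points.append((fp / n_neg, tp / n_pos))
    return points
```

Sweeping the threshold traces one curve per descriptor, and the curves can then be compared at equal false positive rates.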
We present a framework for motion segmentation that combines the concepts of layer-based methods and feature-based motion estimation. We estimate the initial correspondences by comparing vectors of filter outputs at interest points, from which we compute candidate scene relations via random sampling of minimal subsets of correspondences. We achieve a dense, piecewise smooth assignment of pixels to motion layers using a fast approximate graph-cut algorithm based on a Markov random field formulation. We demonstrate our approach on image pairs containing large inter-frame motion and partial occlusion. The approach is efficient and it successfully segments scenes with inter-frame disparities previously beyond the scope of layer-based motion segmentation methods.
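The step of generating candidate motions by random sampling of minimal subsets can be sketched in a heavily simplified form. The example below assumes a pure 2D translation model, for which a minimal subset is a single correspondence; the function name, the tolerance `tol`, and the trial count are hypothetical choices for illustration. It covers only hypothesis generation and scoring, not the graph-cut assignment of pixels to layers.

```python
import random


def ransac_translation(matches, n_trials=50, tol=1.0, seed=0):
    """Estimate a dominant 2D translation from noisy point correspondences.

    matches is a list of ((x1, y1), (x2, y2)) pairs. A pure translation needs
    only one correspondence, so each minimal subset has size 1.
    Returns (translation, inlier_indices) for the best-supported hypothesis.
    """
    rng = random.Random(seed)
    best_t, best_inliers = None, []
    for _ in range(n_trials):
        # Draw a minimal subset and fit the model to it.
        (x1, y1), (x2, y2) = rng.choice(matches)
        t = (x2 - x1, y2 - y1)
        # Count correspondences consistent with this hypothesis.
        inliers = [
            i for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if abs(bx - ax - t[0]) <= tol and abs(by - ay - t[1]) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_t, best_inliers = t, inliers
    return best_t, best_inliers
```

For several motion layers, one would keep multiple well-supported hypotheses instead of only the best one; richer models such as affine maps or homographies need larger minimal subsets.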
We present a novel algorithm for optimally segmenting dynamic scenes containing multiple rigidly moving objects. We cast the motion segmentation problem as a constrained nonlinear least squares problem, which minimizes the reprojection error subject to all multibody epipolar constraints. By converting this constrained problem into an unconstrained one, we obtain an objective function that depends only on the motion parameters (fundamental matrices) and is independent of the segmentation of the image features. Therefore, our algorithm does not iterate between feature segmentation and single-body motion estimation. Instead, it uses standard nonlinear optimization techniques to simultaneously recover all the fundamental matrices, without prior segmentation. We test our approach on a real sequence.
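One way to write down the constrained problem described above is sketched here. The notation is assumed for illustration and does not come from the abstract: $x_1^j, x_2^j$ are the measured homogeneous image points of feature $j$ in the two views, $\tilde{x}_1^j, \tilde{x}_2^j$ their corrected positions, and $F_1, \dots, F_n$ the fundamental matrices of the $n$ motions.

```latex
\min_{\{\tilde{x}_1^j,\, \tilde{x}_2^j\},\, \{F_i\}} \;
\sum_{j=1}^{N} \left( \lVert \tilde{x}_1^j - x_1^j \rVert^2
                    + \lVert \tilde{x}_2^j - x_2^j \rVert^2 \right)
\quad \text{subject to} \quad
\prod_{i=1}^{n} \left( \tilde{x}_2^{j\top} F_i \, \tilde{x}_1^j \right) = 0,
\qquad j = 1, \dots, N
```

The product vanishes whenever a point pair satisfies the epipolar constraint of at least one motion, so the constraint can be stated without knowing which motion each feature belongs to.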
An efficient algorithmic solution to the classical five-point relative pose problem is presented. The problem is to find the possible solutions for relative camera motion between two calibrated views given five corresponding points. The algorithm consists of computing the coefficients of a tenth degree polynomial and subsequently finding its roots. It is the first algorithm well suited for numerical implementation that also corresponds to the inherent complexity of the problem. The algorithm is used in a robust hypothesis-and-test framework to estimate structure and motion in real-time.
We propose a novel and efficient approach for active unsupervised texture segmentation. First, we show how we can extract a small set of good features for texture segmentation based on the structure tensor and nonlinear diffusion. Then, we propose a variational framework that incorporates these features in a level set based unsupervised segmentation process that adaptively takes into account their estimated statistical information inside and outside the region to segment. The approach has been tested on various textured images, and its performance is favorably compared to recent studies.
This paper presents a novel background subtraction method for detecting foreground objects in dynamic scenes involving swaying trees and fluttering flags. Most methods proposed so far adjust the permissible range of the background image variations according to the training samples of background images. Thus, the detection sensitivity decreases at those pixels having wide permissible ranges. If we can narrow the ranges by analyzing input images, the detection sensitivity can be improved. For this narrowing, we employ the property that image variations at neighboring image blocks have strong correlation, also known as "cooccurrence". This approach is essentially different from chronological background image updating or morphological postprocessing. Experimental results for real images demonstrate the effectiveness of our method.
Automatic classification of an image as a photograph of a real scene or as a painting is potentially useful for image retrieval and Web site filtering applications. The main contribution of the paper is a set of features, derived from the color, edge, and gray-scale-texture information of the image, that effectively discriminate paintings from photographs. For example, we found that paintings contain significantly more pure-color edges, and that certain gray-scale-texture measurements (mean and variance of Gabor filters) are larger for photographs. On a large set of images (12000) collected from different Web sites, the proposed features exhibit very promising classification performance (over 90%). A comparative analysis of the automatic classification results and psychophysical data is reported, suggesting that the proposed automatic classifier estimates the perceptual photorealism of a given picture.