We present an interactive workspace featuring vision-based gesture recognition that allows multiple users to collaborate in the creation of a concept map. The workspace integrates web-based and face-to-face scenarios in knowledge-building activities such as brainstorming or problem-solving sessions. A wiki serves as the repository for knowledge elements, which are presented to co-located users in the form of a concept map visualized on a table. The computer-vision module tracks multiple hands and fingers on the table surface using a fingertip detection and tracking algorithm. The concept map and the wiki are synchronized in real time, providing notifications to both co-located and distributed users and supporting a shared community awareness that enriches the knowledge-building experience.
This paper proposes a system that handles the illumination problem of face recognition systems by using the "Retinex and color constancy" algorithm. The Retinex and color constancy approach has been combined with Elastic Bunch Graph Matching (EBGM). The proposed system has been tested on the IITK database, which contains more than 1000 face images. The experimental results demonstrate that the performance of the proposed system is superior to that of the known systems. Overall accuracy increases by 3.14% compared with the known EBGM-based recognition system that does not use the Retinex and color constancy method.
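The abstract does not say which Retinex variant is used, so the following is only a minimal sketch of the classic single-scale Retinex, assuming a grayscale image stored as a list of rows. The function names, the `+ 1.0` offset that avoids `log(0)`, and the default `sigma` and `radius` are illustrative assumptions, not the authors' implementation.

```python
import math

def gaussian_kernel(sigma, radius):
    # Normalized 1D Gaussian weights for offsets -radius..radius.
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(values, kernel):
    # Convolve one row or column, clamping indices at the borders.
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out

def gaussian_blur(image, sigma, radius):
    # Separable blur: rows first, then columns.
    kernel = gaussian_kernel(sigma, radius)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def single_scale_retinex(image, sigma=2.0, radius=4):
    # Reflectance estimate: log(image) minus log(smoothed surround).
    surround = gaussian_blur(image, sigma, radius)
    return [[math.log(p + 1.0) - math.log(s + 1.0)
             for p, s in zip(prow, srow)]
            for prow, srow in zip(image, surround)]
```

On a uniformly lit patch the output is close to zero everywhere, while a pixel brighter than its surroundings gets a positive response. Output of this kind would be computed before features are handed to a matcher such as EBGM.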
We introduce the term cosegmentation, which denotes the task of simultaneously segmenting the common parts of an image pair. A generative model for cosegmentation is presented. Inference in the model leads to minimizing an energy with an MRF term encoding spatial coherency and a global constraint that attempts to match the appearance histograms of the common parts. This energy has not been proposed previously, and its optimization is challenging and NP-hard. For this problem, a novel optimization scheme, which we call trust region graph cuts, is presented. We demonstrate that this framework has the potential to improve a wide range of research: object-driven image retrieval, video tracking and segmentation, and interactive image editing. The power of the framework lies in its generality: the common part can be a rigid or non-rigid object (or scene) observed from different viewpoints, or even similar objects of the same class.
This paper is about mapping images to continuous output spaces using powerful Bayesian learning techniques. A sparse, semi-supervised Gaussian process regression model (S3GP) is introduced which learns a mapping using only partially labelled training data. We show that sparsity bestows efficiency on the S3GP which requires minimal CPU utilization for real-time operation; the predictions of uncertainty made by the S3GP are more accurate than those of other models leading to considerable performance improvements when combined with a probabilistic filter; and the ability to learn from semi-supervised data simplifies the process of collecting training data. The S3GP uses a mixture of different image features: this is also shown to improve the accuracy and consistency of the mapping. A major application of this work is its use as a gaze tracking system in which images of a human eye are mapped to screen coordinates: in this capacity our approach is efficient, accurate and versatile.
Authors: G. Peyré (CMAP, UMR CNRS 7641, École Polytechnique, Palaiseau, France); L.D. Cohen (CEREMADE, UMR CNRS 7534, Université Paris Dauphine, Paris, France)
This paper presents a new method to quickly extract geodesic paths on images and 3D meshes. We use a heuristic to drive the front propagation procedure of the classical Fast Marching algorithm. This results in a modification of Fast Marching that is similar to the A* algorithm used in artificial intelligence. To find geodesic paths very quickly between any given pair of points, we advocate the initial computation of distance maps to a set of landmark points and make use of these distance maps through a relevant heuristic. We show that our method brings a large speed-up for large-scale applications that require the extraction of geodesics on images and 3D meshes. We introduce two distortion metrics in order to find an optimal seeding of landmark points for the targeted applications. We also propose a compression scheme to reduce the memory requirement without impacting the quality of the extracted paths.
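The idea of steering a search with precomputed landmark distance maps can be illustrated on a discrete grid. The sketch below is not the authors' method: it swaps Fast Marching for plain Dijkstra on a 4-connected grid and uses the triangle-inequality bound |d(l, x) - d(l, goal)| as the heuristic. All function names and the edge-weight convention (mean of the two cell costs) are assumptions made for illustration.

```python
import heapq

def neighbors(grid, node):
    # 4-connected neighbors with a symmetric edge weight
    # (mean of the two cell costs), so distances are symmetric.
    r, c = node
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]):
            yield (nr, nc), 0.5 * (grid[r][c] + grid[nr][nc])

def distance_map(grid, source):
    # Dijkstra from one landmark: shortest distance to every cell.
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, w in neighbors(grid, node):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

def landmark_heuristic(maps, node, goal):
    # Lower bound on d(node, goal) from the triangle inequality.
    return max((abs(m[node] - m[goal]) for m in maps), default=0.0)

def astar(grid, start, goal, maps):
    # Returns (path cost, number of expanded cells).
    best = {start: 0.0}
    heap = [(landmark_heuristic(maps, start, goal), 0.0, start)]
    closed = set()
    while heap:
        _, g, node = heapq.heappop(heap)
        if node in closed:
            continue
        closed.add(node)
        if node == goal:
            return g, len(closed)
        for nxt, w in neighbors(grid, node):
            ng = g + w
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                h = landmark_heuristic(maps, nxt, goal)
                heapq.heappush(heap, (ng + h, ng, nxt))
    return float("inf"), len(closed)
```

A possible usage: build `maps = [distance_map(grid, l) for l in landmarks]` once, then call `astar(grid, start, goal, maps)` for each query. Passing an empty `maps` list reduces the search to Dijkstra, which gives a baseline for comparing how many cells each variant expands.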
Existing methods for video completion typically rely on periodic color transitions, layer extraction, or temporally local motion. However, periodicity may be imperceptible or absent, layer extraction is difficult, and temporally local motion cannot handle large holes. This paper presents a new approach for video completion using motion field transfer to avoid such problems. Unlike prior methods, we fill in missing video parts by sampling spatio-temporal patches of local motion instead of directly sampling color. Once the local motion field has been computed within the missing parts of the video, color can then be propagated to produce a seamless hole-free video. We have validated our method on many videos spanning a variety of scenes. We can also use the same approach to perform frame interpolation using motion fields from different videos.
Image registration for X-ray dual-energy imaging is challenging because of the overlaid transparent layers (i.e., the bone and soft tissue) and the different appearances of the dual images acquired with X-rays at different energy spectra. Moreover, subpixel accuracy is necessary for good reconstruction of the bone and soft-tissue layers. This paper addresses these problems with a novel coupled Bayesian framework, in which registration and reconstruction can effectively reinforce each other. With the reconstruction results, we can design accurate matching criteria for aligning the dual images, instead of treating the task as multi-modality registration. Furthermore, prior knowledge of the bone and soft tissue can be exploited to detect poor reconstruction due to inaccurate registration, and hence to correct registration errors in the coupled framework. A multiscale free-form registration algorithm is implemented to achieve subpixel registration accuracy. Promising results are obtained in the experiments.
Background subtraction is a widely used paradigm to detect moving objects in video taken from a static camera and is used for various important applications such as video surveillance, human motion analysis, etc. Various statistical approaches have been proposed for modeling a given scene background. However, there is no theoretical framework for choosing which features to use to model different regions of the scene background. In this paper we introduce a novel framework for feature selection for background modeling and subtraction. A boosting algorithm, namely RealBoost, is used to choose the best combination of features at each pixel. Given the probability estimates from a pool of features calculated by Kernel Density Estimate (KDE) over a certain time period, the algorithm selects the most useful ones to discriminate foreground objects from the scene background. The results show that the proposed framework successfully selects appropriate features for different parts of the image.
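The per-pixel probability estimate mentioned above can be sketched for a single feature. This is a minimal illustration assuming one scalar feature per pixel (for example, intensity) and a Gaussian kernel. The function names, bandwidth, and threshold are hypothetical, and the RealBoost feature-selection stage is not implemented here.

```python
import math

def kde_probability(history, value, bandwidth=5.0):
    # Gaussian kernel density estimate of `value` given the
    # recent background samples observed at one pixel.
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((value - h) / bandwidth) ** 2)
                for h in history)
    return norm * total / len(history)

def is_foreground(history, value, threshold=1e-3, bandwidth=5.0):
    # A pixel is labeled foreground when its new value is
    # unlikely under the background density.
    return kde_probability(history, value, bandwidth) < threshold
```

For a pixel whose recent values were `[100, 102, 98, 101, 99]`, a new value of 100 is likely under the background model, whereas a value of 200 falls far in the tail and is flagged as foreground. In the framework described in the abstract, estimates like this from several features would be the inputs that the boosting stage combines.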
Recent advances in single-view reconstruction (SVR) have been in modelling power (curved 2.5D surfaces) and automation (automatic photo pop-up). We extend SVR along both of these directions. We increase modelling power in several ways: (i) We represent general 3D surfaces, rather than 2.5D Monge patches; (ii) We describe a closed-form method to reconstruct a smooth surface from its image apparent contour, including multilocal singularities ("kidney-bean" self-occlusions); (iii) We show how to incorporate user-specified data such as surface normals, interpolation and approximation constraints; (iv) We show how this algorithm can be adapted to deal with surfaces of arbitrary genus. We also show how the modelling process can be automated for simple object shapes and views, using a-priori object class information. We demonstrate these advances on natural images drawn from a number of object classes.
Multi-camera tracking systems often must maintain consistent identity labels of the targets across views to recover 3D trajectories and take full advantage of the additional information available from the multiple sensors. Previous approaches to the "correspondence across views" problem include matching features, using camera calibration information, and computing homographies between views under the assumption that the world is planar. However, it can be difficult to match features across significantly different views. Furthermore, calibration information is not always available, and the planar-world hypothesis can be too restrictive. In this paper, a new approach is presented for matching correspondences based on nonlinear manifold learning and system dynamics identification. The proposed approach requires neither similar views, calibration, nor geometric assumptions about the 3D environment, and it is robust to noise and occlusion. Experimental results demonstrate the use of this approach to generate and predict views in cases where identity labels become ambiguous.