ISBN: 0780342364 (print)
We present a new method for synthesizing novel views of a 3D scene from a few model images in full correspondence. The core of this work is the derivation of a tensorial operator that describes the transformation from a given tensor of three views to a novel tensor of a new configuration of three views. By repeated application of the operator on a seed tensor with a sequence of desired virtual camera positions, we obtain a chain of warping functions (tensors) from the set of model images to create the desired virtual views.
ISBN: 0780342364 (print)
We develop a simple and very fast method for object tracking based exclusively on color information in digitized video images. Running on a Silicon Graphics R4600 Indy system with an IndyCam, our algorithm is capable of simultaneously tracking objects at full frame size (640 x 480 pixels) and video frame rate (30 fps). Robustness with respect to occlusion is achieved via an explicit hypothesis-tree model of the occlusion process. We demonstrate the efficacy of our technique in the challenging task of tracking people, especially tracking human heads and hands.
ISBN: 9781424469840 (print)
We consider the problem of learning to map between two vector spaces given pairs of matching vectors, one from each space. This problem naturally arises in numerous vision problems, for example, when mapping between the images of two cameras, or when the annotations of each image are multidimensional. We focus on the common asymmetric case, where one vector space X is more informative than the other, Y, and find a transformation from Y to X. We present a new optimization problem that aims to replicate in the transformed Y the margins that dominate the structure of X. This optimization problem is convex, and efficient algorithms are presented. Links to various existing methods such as CCA and SVM are drawn, and the effectiveness of the method is demonstrated in several visual domains.
ISBN: 0780342364 (print)
We derive a sensitivity analysis for moment invariants of multidimensional distributions. These invariants have many uses in computational systems and have recently been used for illumination-invariant recognition in color images. In this context, the sensitivity analysis predicts the response of moment invariants to partial occlusion. Using the results of the sensitivity analysis, we develop a novel surface representation called the invariant profile, which captures color distribution and spatial information while remaining invariant to the spectral content of the scene illumination. Unlike previous representations, the recognition of invariant profiles does not require illumination correction. We demonstrate the sensitivity analysis and the use of invariant profiles for recognition with a set of experiments on color images.
ISBN: 9781424469840 (print)
Self-similarity is an attractive image property which has recently found its way into object recognition in the form of local self-similarity descriptors [5, 6, 14, 18, 23, 27]. In this paper we explore global self-similarity (GSS) and its advantages over local self-similarity (LSS). We make three contributions: (a) we propose computationally efficient algorithms to extract GSS descriptors for classification, which capture the spatial arrangements of self-similarities within the entire image; (b) we show how to use these descriptors efficiently for detection in a sliding-window framework and in a branch-and-bound framework; (c) we experimentally demonstrate on Pascal VOC 2007 and on ETHZ Shape Classes that GSS outperforms LSS for both classification and detection, and that GSS descriptors are complementary to conventional descriptors such as gradients or color.
ISBN: 9781424469840 (print)
We are interested in identifying the material category, e.g. glass, metal, fabric, plastic or wood, from a single image of a surface. Unlike other visual recognition tasks in computer vision, it is difficult to find good, reliable features that can tell material categories apart. Our strategy is to use a rich set of low- and mid-level features that capture various aspects of material appearance. We propose an augmented Latent Dirichlet Allocation (aLDA) model to combine these features under a Bayesian generative framework and learn an optimal combination of features. Experimental results show that our system performs material recognition reasonably well on a challenging material database, outperforming state-of-the-art material/texture recognition systems.
ISBN: 0818672587 (print)
We propose a new method for view synthesis from real images using stereo vision. The method does not explicitly model scene geometry, and enables fast and exact generation of synthetic views. We also reevaluate the requirements on stereo algorithms for the application of view synthesis and discuss ways of dealing with partially occluded regions of unknown depth and with completely occluded regions of unknown texture. Our experiments demonstrate that it is possible to efficiently synthesize realistic new views even from inaccurate and incomplete depth information.
ISBN: 9781424469840 (print)
We present a method that unifies tracking and video content recognition with applications to Mobile Augmented Reality (MAR). We introduce the Radial Gradient Transform (RGT) and an approximate RGT, yielding the Rotation-Invariant, Fast Feature (RIFF) descriptor. We demonstrate that RIFF is fast enough for real-time tracking, while robust enough for large-scale retrieval tasks. At 26x the speed, our tracking scheme obtains a more accurate global affine motion model than the Kanade-Lucas-Tomasi (KLT) tracker. The same descriptors can achieve 94% retrieval accuracy from a database of 10^4 images.
ISBN: 9781424469840 (print)
Food recognition is difficult because food items are deformable objects that exhibit significant variations in appearance. We believe the key to recognizing food is to exploit the spatial relationships between different ingredients (such as meat and bread in a sandwich). We propose a new representation for food items that calculates pairwise statistics between local features computed over a soft pixel-level segmentation of the image into eight ingredient types. We accumulate these statistics in a multi-dimensional histogram, which is then used as a feature vector for a discriminative classifier. Our experiments show that the proposed representation is significantly more accurate at identifying food than existing methods.
ISBN: 0818672587 (print)
The Perseus system is a purposive visual architecture that has been used to recognize the pointing gesture. Recognition of this gesture is an important part of natural human-machine interfaces. Perseus is modularized into six types of components: feature maps, object representations, markers, visual routines, a segmentation map, and a long-term visual memory. This structure not only allows Perseus to use knowledge about the task and environment at every stage of processing to solve the pointing task more efficiently and accurately, but also allows it to be extended to tasks other than recognizing pointing.