ISBN: (print) 0769521584
A new and fast way to find local image correspondences for wide baseline image matching is described. The targeted application is visual navigation, e.g. of a semi-automatic wheelchair. Such applications pose some additional requirements, like the need to work with natural landmarks rather than artificial markers, and the need to recognize locations fast. The restricted motion of the camera can be exploited to simplify the feature extraction. These features should support their identification from different, but nevertheless restricted viewing directions, and under variable illumination conditions. The paper proposes a specialization of so-called affine invariant regions for these particular conditions, which in this case simplifies to column segments. Their applicability is wider than robot navigation, and includes localization for wearable computing and scene recognition for automatic movie indexing.
Image composition (or mosaicing) has attracted growing attention in recent years as one of the main elements in video analysis and representation. In this paper we deal with the problem of global alignment and super-resolution. We also propose to evaluate the quality of the resulting mosaic by measuring the amount of blurring. Global registration is achieved by combining a graph-based technique, which exploits the topological structure of the sequence induced by the spatial overlap, with a bundle adjustment that uses only the homographies computed in the previous steps. Experimental comparison with other techniques shows the effectiveness of our approach.
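As a rough illustration of the graph-based part of global registration, the sketch below chains pairwise homographies along the shortest path (in hops) of an overlap graph to map every frame into a reference frame. The abstract does not specify the graph algorithm, so the breadth-first search, the function names, and the toy translations are all assumptions; the bundle adjustment step is not shown.

```python
from collections import deque


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    """Homography that is a pure translation (hypothetical toy data)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def chain_to_reference(pairwise, reference):
    """Map each reachable frame into the reference frame.

    pairwise[(i, j)] is the homography taking frame j coordinates to frame i.
    Breadth-first search visits frames in order of hop distance, so each frame
    is reached through a shortest chain of overlapping frames.
    """
    to_ref = {reference: translation(0.0, 0.0)}  # identity for the reference
    queue = deque([reference])
    while queue:
        i = queue.popleft()
        for (a, b), h in pairwise.items():
            if a == i and b not in to_ref:
                to_ref[b] = matmul3(to_ref[i], h)
                queue.append(b)
    return to_ref


# Hypothetical overlap graph 0 - 1 - 2, both directions stored explicitly.
pairwise = {
    (0, 1): translation(10.0, 0.0),
    (1, 0): translation(-10.0, 0.0),
    (1, 2): translation(10.0, 2.0),
    (2, 1): translation(-10.0, -2.0),
}
to_ref = chain_to_reference(pairwise, reference=0)
```

Here frame 2 has no direct overlap with frame 0, so its mapping is the product of the two pairwise homographies along the path 0, 1, 2.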
In this paper, we describe an approach to recognizing location from mobile devices using image-based web search. We demonstrate the usefulness of common image search metrics applied to images captured with a camera-equipped mobile device to find matching images on the World Wide Web or other general-purpose databases. Searching the entire web can be computationally overwhelming, so we devise a hybrid image-and-keyword searching technique. First, image search is performed over images and links to their source web pages in a database that indexes only a small fraction of the web. Then, relevant keywords on these web pages are automatically identified and submitted to an existing text-based search engine (e.g. Google) that indexes a much larger portion of the web. Finally, the resulting image set is filtered to retain images close to the original query. It is thus possible to efficiently search hundreds of millions of images that are not only textually related but also visually relevant. We demonstrate our approach on an application allowing users to browse web pages matching the image of a nearby location.
We present a simple and universal camera calibration method. Instead of extensive setups, we exploit the accurate angular positions of fixed stars. High precision is achieved by compensating the interfering error sources. Our approach uses a star catalog and requires only a single input image. No additional user input information such as focal length, exposure date or position is required. Fully automatic processing and fast convergence are achieved by performing three consecutive steps. First, a star segmentation and centroid finding algorithm extracts the sub-pixel positions of the luminaries. Second, an initial solution for the most essential parameters is determined by combinatorial analysis. Finally, the Levenberg-Marquardt algorithm is applied to solve the resulting non-linear system. Experimental results with several digital consumer cameras demonstrate high robustness and accuracy. The introduced method is advisable for applications where large calibration targets are required.
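The first step, sub-pixel centroid finding, can be illustrated with an intensity-weighted centroid over a small patch around a segmented star. The abstract does not say which centroid estimator is used, so this is only one common choice; the function name, the `background` parameter, and the patch values are hypothetical.

```python
def weighted_centroid(patch, background=0.0):
    """Return the (row, col) intensity-weighted centroid of a 2-D patch."""
    total = 0.0
    row_sum = 0.0
    col_sum = 0.0
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            # Pixels at or below the background level contribute nothing.
            w = max(value - background, 0.0)
            total += w
            row_sum += w * r
            col_sum += w * c
    if total == 0.0:
        raise ValueError("patch contains no signal above background")
    return row_sum / total, col_sum / total


# Hypothetical 3x3 patch around one star, with no signal in the left column.
patch = [
    [0.0, 1.0, 1.0],
    [0.0, 2.0, 2.0],
    [0.0, 1.0, 1.0],
]
star_row, star_col = weighted_centroid(patch)
```

For this patch the centroid falls at row 1.0 and column 1.5, halfway between two pixel centers, which is the kind of sub-pixel position a later non-linear fit could take as input.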
Relevance feedback (RF) is an important tool to improve the performance of content-based image retrieval systems. Support vector machine (SVM) based RF is popular because it can generalize better than most other classifiers. However, directly using SVM in RF may not be appropriate, since SVM treats positive and negative feedback equally. Given the different properties of positive samples and negative samples in RF, they should be treated differently. Considering this, we propose an orthogonal complement components analysis (OCCA) combined with SVM in this paper. We then generalize the OCCA to Hilbert space and define the kernel empirical OCCA (KEOCCA). Through experiments on a Corel Photo database with 17,800 images, we demonstrate that the proposed method can significantly improve the performance of conventional SVM-based RF.
Photometric methods in computer vision require calibration of the camera's radiometric response, and previous works have addressed this problem using multiple registered images captured under different camera exposure settings. In many instances, such an image set is not available, so we propose a method that performs radiometric calibration from only a single image, based on measured RGB distributions at color edges. This technique automatically selects appropriate edge information for processing and employs a Bayesian approach to compute the calibration. Extensive experimentation has shown that accurate calibration results can be obtained using only a single input image.
Numerical methods associated with graph-theoretic image processing algorithms often reduce to the solution of a large linear system. We show here that choosing a topology that yields a small graph diameter can greatly speed up the numerical solution. As a proof of concept, we examine two image graphs that preserve local connectivity of the nodes (pixels) while drastically reducing the graph diameter. The first is based on a "small-world" modification of a standard 4-connected lattice. The second is based on a quadtree graph. Using a recently described graph-theoretic image processing algorithm, we show that large speedup is achieved with a minimal perturbation of the solution when these graph topologies are utilized. We suggest that a variety of similar algorithms may also benefit from this approach.
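The effect of long-range edges on graph diameter can be checked on a toy lattice. The sketch below builds a 5 x 5 4-connected lattice, measures its diameter by breadth-first search, and then adds two shortcut edges between opposite corners. The abstract does not describe how its small-world edges are chosen, so the shortcut placement and all names here are illustrative assumptions.

```python
from collections import deque


def lattice_edges(n):
    """Edges of an n x n 4-connected lattice; node (r, c) has index r * n + c."""
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                edges.append((r * n + c, r * n + c + 1))
            if r + 1 < n:
                edges.append((r * n + c, (r + 1) * n + c))
    return edges


def diameter(num_nodes, edges):
    """Longest shortest-path length (in hops) over all node pairs, via BFS."""
    adj = [[] for _ in range(num_nodes)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    best = 0
    for start in range(num_nodes):
        dist = [-1] * num_nodes
        dist[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist))
    return best


n = 5
edges = lattice_edges(n)
lattice_d = diameter(n * n, edges)

# Hypothetical shortcuts joining opposite corners; local edges are all kept.
shortcuts = [(0, n * n - 1), (n - 1, n * (n - 1))]
small_world_d = diameter(n * n, edges + shortcuts)
```

The plain lattice has diameter 2(n - 1) = 8, attained only between opposite corners, so the two shortcuts strictly reduce it while every pixel keeps its four local neighbors.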
We propose a geometric approach to 3-D motion segmentation from point correspondences in three perspective views. We demonstrate that after applying a polynomial embedding to the correspondences they become related by the so-called multibody trilinear constraint and its associated multibody trifocal tensor. We show how to linearly estimate the multibody trifocal tensor from point-point-point correspondences. We then show that one can estimate the epipolar lines associated with each image point from the common root of a set of univariate polynomials and the epipoles by solving a plane clustering problem in R^3 using GPCA. The individual trifocal tensors are then obtained from the second order derivatives of the multibody trilinear constraint. Given epipolar lines and epipoles, or trifocal tensors, we obtain an initial clustering of the correspondences, which we use to initialize an iterative algorithm that finds an optimal estimate for the trifocal tensors and the clustering of the correspondences using Expectation Maximization. We test our algorithm on real and synthetic dynamic scenes.
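A polynomial embedding of this kind is often realized as a Veronese map, which lifts a point to the vector of all monomials of a fixed degree in its coordinates. The abstract does not name the embedding, so treating it as a Veronese map, and tying the degree to the number of motions, is an assumption; the function name is hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod


def veronese(x, degree):
    """All monomials of the given degree in the entries of x, in a fixed order."""
    return [
        prod(x[i] for i in idx)
        for idx in combinations_with_replacement(range(len(x)), degree)
    ]


# A homogeneous image point (x, y, 1) lifted with degree 2:
# monomial order is x*x, x*y, x*1, y*y, y*1, 1*1.
lifted = veronese((2.0, 3.0, 1.0), 2)
```

For a 3-vector the lifted vector has (degree + 1)(degree + 2) / 2 entries, so degree 2 gives 6 monomials and degree 3 gives 10. A constraint that is polynomial in the original coordinates becomes linear in the lifted ones, which is what allows a linear estimate of the multibody tensor.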
In this paper we describe a method for skeletonization of gray-scale images without segmentation. Our method is based on anisotropic vector diffusion. The skeleton strength map, calculated from the diffused vector field, gives a measure of how likely each pixel is to lie on a skeleton. The final skeletons are traced from the skeleton strength map, which mimics the way edges are detected from the edge strength map of the original image. Several real and synthesized images are shown to demonstrate the performance of our algorithm.
Image retrieval critically relies on the distance function used to compare a query image to images in the database. We suggest learning such distance functions by training binary classifiers with margins, where the classifiers are defined over the product space of pairs of images. The classifiers are trained to distinguish between pairs in which the images are from the same class and pairs which contain images from different classes. The signed margin is used as a distance function. We explore several variants of this idea, based on using SVM and Boosting algorithms as product space classifiers. Our main contribution is a distance learning method which combines boosting hypotheses over the product space with a weak learner based on partitioning the original feature space. The weak learner used is a Gaussian mixture model computed using a constrained EM algorithm, where the constraints are equivalence constraints on pairs of data points. This approach allows us to incorporate unlabeled data into the training process. Using some benchmark databases from the UCI repository, we show that our margin based methods significantly outperform existing metric learning methods, which are based on learning a Mahalanobis distance. We then show comparative results of image retrieval in a distributed learning paradigm, using two databases: a large database of facial images (YaleB), and a database of natural images taken from a commercial CD. In both cases our GMM based boosting method outperforms all other methods, and its generalization to unseen classes is superior.