ISBN: 9781728171685 (print)
We present Deep Global Registration, a differentiable framework for pairwise registration of real-world 3D scans. Deep Global Registration is based on three modules: a 6-dimensional convolutional network for correspondence confidence prediction, a differentiable Weighted Procrustes algorithm for closed-form pose estimation, and a robust gradient-based SE(3) optimizer for pose refinement. Experiments demonstrate that our approach outperforms state-of-the-art methods, both learning-based and classical, on real-world data.
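To illustrate what a closed-form weighted Procrustes step can look like, here is a minimal sketch in a simplified planar (2D) setting. It is not the authors' implementation: the paper works in 3D, where the rotation is typically recovered with an SVD, and the function name `weighted_procrustes_2d` and the example data are assumptions made for illustration. The weights play the role of per-correspondence confidences, so a correspondence with weight 0 has no influence on the pose.

```python
import math


def weighted_procrustes_2d(src, dst, weights):
    """Closed-form planar rigid alignment (hypothetical helper).

    Returns (theta, (tx, ty)) minimising
        sum_i w_i * || R(theta) * src_i + t - dst_i ||^2.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")

    # Weighted centroids of both point sets.
    sx = sum(w * p[0] for w, p in zip(weights, src)) / total
    sy = sum(w * p[1] for w, p in zip(weights, src)) / total
    dx = sum(w * q[0] for w, q in zip(weights, dst)) / total
    dy = sum(w * q[1] for w, q in zip(weights, dst)) / total

    # Weighted dot and cross terms of the centred correspondences.
    dot = 0.0
    cross = 0.0
    for w, (px, py), (qx, qy) in zip(weights, src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        dot += w * (ax * bx + ay * by)
        cross += w * (ax * by - ay * bx)

    # Optimal rotation angle, then the translation that aligns the centroids.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)


# Three inliers related by a 90-degree rotation plus translation (1, 2),
# and one gross outlier that receives zero confidence.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
dst = [(1.0, 2.0), (1.0, 3.0), (0.0, 2.0), (100.0, 100.0)]
theta, (tx, ty) = weighted_procrustes_2d(src, dst, [1.0, 1.0, 1.0, 0.0])
```

Because every operation is a sum, product, or `atan2`, the estimate is differentiable with respect to the weights almost everywhere, which is the property that lets a confidence-predicting network be trained through such a step.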
ISBN: 9781424469840 (print)
We present a novel image operator that seeks to find the value of stroke width for each image pixel, and demonstrate its use on the task of text detection in natural images. The suggested operator is local and data dependent, which makes it fast and robust enough to eliminate the need for multi-scale computation or scanning windows. Extensive testing shows that the suggested scheme outperforms the latest published algorithms. Its simplicity allows the algorithm to detect texts in many fonts and languages.
ISBN: 0818672587 (print)
One of the central problems in stereo matching (and other image registration tasks) is the selection of optimal window sizes for comparing image regions. This paper addresses this problem with some novel algorithms based on iteratively diffusing support at different disparity hypotheses, and locally controlling the amount of diffusion based on the current quality of the disparity estimate. It also develops a novel Bayesian estimation technique which significantly outperforms techniques based on area-based matching (SSD) and regular diffusion. We provide experimental results on both synthetic and real stereo image pairs.
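For context, the area-based SSD matching that the abstract uses as a baseline can be sketched on a single scanline as follows. This is only the fixed-window baseline, not the diffusion-based or Bayesian methods the paper proposes; the function names, the window size, and the toy scanlines are assumptions for illustration.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_disparity(left, right, x, half_window, max_disparity):
    """Return (disparity, cost) minimising SSD for pixel x of the left scanline.

    A left pixel at column x is compared with the right pixel at column x - d.
    """
    lo, hi = x - half_window, x + half_window + 1
    if lo < 0 or hi > len(left):
        raise ValueError("window does not fit inside the scanline")
    reference = left[lo:hi]

    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # shifted window would leave the right scanline
        cost = ssd(reference, right[lo - d:hi - d])
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


# Toy scanlines: the right image is the left image shifted by 2 pixels.
left = [0, 0, 0, 0, 5, 9, 5, 0, 0, 0]
right = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
disparity, cost = best_disparity(left, right, x=5, half_window=1, max_disparity=3)
```

The `half_window` parameter is exactly the quantity whose choice the abstract identifies as the central problem: a small window is noisy, while a large one blurs across depth discontinuities.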
ISBN: 9781467369640 (print)
Given a stereo pair it is possible to recover a depth map and use that depth to render a synthetically defocused image. Though stereo algorithms are well-studied, rarely are those algorithms considered solely in the context of producing these defocused renderings. In this paper we present a technique for efficiently producing disparity maps using a novel optimization framework in which inference is performed in "bilateral-space". Our approach produces higher-quality "defocus" results than other stereo algorithms while also being 10-100x faster than comparable techniques.
ISBN: 0769506623 (print)
We present several methods for the estimation of relative pose between planes and cameras, based on projections of sets of coplanar features in images. While such methods exist for simple cases, especially one plane seen in one or several views, the aim of this paper is to propose solutions for multi-plane multi-view situations, possibly with little overlap. We propose a factorization-based method for the general case of n planes seen in m views. A mechanism for computing missing data, i.e. when one or several of the planes are not visible in one or several of the images, is described. Experimental results for real images are shown.
ISBN: 0780342364 (print)
Correlation-based real-time stereo systems have proven effective in applications such as robot navigation and elevation map building. This paper provides an in-depth analysis of the major error sources for such a real-time stereo system in the context of cross-country navigation of an autonomous vehicle. Three major types of error are identified: foreshortening error, misalignment error, and systematic error. The combined disparity errors can easily exceed three-tenths of a pixel, which translates to significant range errors. Having characterized these error sources, we demonstrate different approaches to either correct them or model their magnitudes without excessive additional computation. By correcting those errors, we show that the precision of the stereo algorithm can be improved by 50%.
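To make the link between disparity error and range error concrete, the sketch below uses the standard rectified-stereo relation Z = f·B/d and its first-order error propagation, δZ ≈ Z²/(f·B)·δd. Only the 0.3-pixel disparity error comes from the abstract; the focal length, baseline, and disparities are assumed values chosen for illustration, and the function names are hypothetical.

```python
def range_from_disparity(focal_px, baseline_m, disparity_px):
    """Range Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def range_error(focal_px, baseline_m, disparity_px, disparity_error_px):
    """First-order range error: dZ ~= Z^2 / (f * B) * dd."""
    z = range_from_disparity(focal_px, baseline_m, disparity_px)
    return z * z / (focal_px * baseline_m) * disparity_error_px


# Assumed camera: 800 px focal length, 0.5 m baseline.
FOCAL_PX = 800.0
BASELINE_M = 0.5
DISPARITY_ERROR_PX = 0.3  # the error magnitude quoted in the abstract

near = range_error(FOCAL_PX, BASELINE_M, 20.0, DISPARITY_ERROR_PX)  # target at 20 m
far = range_error(FOCAL_PX, BASELINE_M, 10.0, DISPARITY_ERROR_PX)   # target at 40 m
```

Under these assumed parameters, doubling the range quadruples the range error for the same disparity error, which is why sub-pixel disparity errors matter for distant terrain.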
ISBN: 0818672587 (print)
The problem of non-parametric probability density function (PDF) estimation using Radial Basis Function (RBF) Neural Networks is addressed here. We investigate two criteria, based on a modified Kullback-Leibler distance, that lead to an appropriate choice of the network architecture complexity. In the first criterion, the modification consists of adding a term that penalizes complex architectures (MPL criterion). The second strategy involves the regularization of the network through the imposition of lower bounds on the standard deviation derived from conditions of existence of rejection tests (LBSD criterion). Experimental results indicate that the MPL criterion outperforms the LBSD method.
ISBN: 0780342364 (print)
Scene classification is a major open challenge in machine vision. Most solutions proposed so far, such as those based on color histograms and local texture statistics, cannot capture a scene's global configuration, which is critical in perceptual judgments of scene similarity. We present a novel approach, "configural recognition", for encoding scene class structure. The approach's main feature is its use of qualitative spatial and photometric relationships within and across regions in low-resolution images. The emphasis on qualitative measures leads to enhanced generalization abilities, and the use of low-resolution images renders the scheme computationally efficient. We present results on a large database of natural scenes. We also describe how qualitative scene concepts may be learned from examples.
ISBN: 9781665445092 (print)
CoMoGAN is a continuous GAN relying on the unsupervised reorganization of the target data on a functional manifold. To that end, we introduce a new Functional Instance Normalization layer and residual mechanism, which together disentangle image content from position on the target manifold. We rely on naive physics-inspired models to guide the training while allowing private model/translation features. CoMoGAN can be used with any GAN backbone and allows new types of image translation, such as cyclic image translation like timelapse generation, or detached linear translation. On all datasets, it outperforms the literature.
Author: Caglioti, V
Politecn Milan, Dipartimento Elettron & Informazione, AI & Robot Project, I-20133 Milan, Italy
ISBN: 0769506623 (print)
The space requirements for indexing under perspective projections are addressed. It is known that the surface representing the set of possible images of a model point set within the index space must be three-dimensional [1]. Under affine projections, the representing surface can be factorized as the Cartesian product of lower-dimensional surfaces: these are obtained by projecting the representing surface onto orthogonal subspaces of the index space [2] [5]. This paper shows that, under perspective, such a factorization does not exist, yielding a negative answer to a question left open in [1]. However, it is shown that there exist subspaces of the index space onto which the projection of the representing surface is two-dimensional.