We address the problem of finding point correspondences in images by way of an approach to template matching that is robust under affine distortions. This is achieved by applying "geometric blur" to both the template and the image, resulting in a fall-off in similarity that is close to linear in the norm of the distortion between the template and the image. Results in wide baseline stereo correspondence, face detection, and feature correspondence are included.
We aim for content-based image retrieval of textured objects in natural scenes under varying illumination and viewing conditions. To achieve this, image retrieval is based on matching feature distributions derived from color invariant gradients. To cope with object clutter, region-based texture segmentation is applied to the target images prior to the actual image retrieval process. The retrieval scheme is empirically verified on color images of textured objects taken under different lighting conditions.
ISBN: 0769512720 (print)
Tracking objects involves the modeling of non-linear, non-Gaussian systems. On the one hand, variants of the Kalman filter are limited by their Gaussian assumptions. On the other hand, conventional particle filters, e.g., CONDENSATION, use the transition prior as the proposal distribution. The transition prior does not take the current observation into account, so many particles can be wasted in low-likelihood regions. To overcome these difficulties, the unscented particle filter (UPF) has recently been proposed in the field of filtering theory. In this paper, we introduce the UPF framework into audio and visual tracking. The UPF uses the unscented Kalman filter to generate sophisticated proposal distributions that seamlessly integrate the current observation, thus greatly improving tracking performance. To evaluate the efficacy of the UPF framework, we apply it in two real-world tracking applications: audio-based speaker localization and vision-based human tracking. The experimental results are compared against those of the widely used CONDENSATION approach and demonstrate superior tracking performance.
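To make the baseline in this abstract concrete, here is a minimal sketch of a bootstrap particle filter, i.e. one that proposes from the transition prior as the abstract says conventional filters do. It is not the UPF and not the authors' code. The 1-D random-walk state model, the Gaussian observation model, the function name `bootstrap_particle_filter`, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=1.0, obs_std=1.0, seed=0):
    """Track a scalar state; return the posterior-mean estimate per step.

    Assumed (hypothetical) model: x_t = x_{t-1} + N(0, process_std^2),
    z_t = x_t + N(0, obs_std^2).
    """
    rng = random.Random(seed)
    # Broad initial belief about the state.
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Proposal = transition prior: the observation z is NOT used here,
        # which is the weakness the abstract points out.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Weight each particle by the observation likelihood.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All particles landed in negligible-likelihood regions.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


estimates = bootstrap_particle_filter([3.0] * 10)
print(round(estimates[-1], 2))
```

Because the proposal step ignores `z`, particles are only corrected after the fact through their weights. A UPF-style filter would instead shift each particle's proposal toward the current observation before weighting.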
ISBN: 0780367251 (print)
Affine transformations are widely used for modelling relations between pairs of images. This paper presents a new frequency-domain technique for estimating transformations of this kind. It consists of two main steps: 1) the affine matrix is first estimated by solving, with a coarse-to-fine strategy, a suitable nonlinear minimization problem formulated upon the radial projections of the image energies; 2) after compensating for the contribution of the affine matrix, the translation vector is then recovered by means of standard phase correlation. Experimental evidence of the effectiveness of this technique is reported and discussed.
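The second step relies on standard phase correlation. Below is a minimal sketch of that primitive alone, assuming a pure cyclic translation between two equally sized grayscale images. It uses a naive DFT so that it needs no external library. The names `dft2` and `phase_correlation` are illustrative and do not come from the paper, and the affine-matrix estimation of step 1 is not shown.

```python
import cmath
import random


def dft2(img, inverse=False):
    """Naive 2-D DFT of a list-of-lists image (O(N^4); tiny inputs only)."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = sign * 2 * cmath.pi * (u * y / h + v * x / w)
                    acc += img[y][x] * cmath.exp(1j * angle)
            out[u][v] = acc / (h * w) if inverse else acc
    return out


def phase_correlation(a, b):
    """Return (dy, dx) such that b equals a cyclically shifted by (dy, dx)."""
    h, w = len(a), len(a[0])
    fa, fb = dft2(a), dft2(b)
    cross = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            c = fb[u][v] * fa[u][v].conjugate()
            mag = abs(c)
            # Keep only the phase; drop frequencies with no energy.
            cross[u][v] = c / mag if mag > 1e-12 else 0j
    corr = dft2(cross, inverse=True)
    # The inverse transform of a pure phase ramp peaks at the shift.
    _, dy, dx = max((corr[y][x].real, y, x)
                    for y in range(h) for x in range(w))
    return dy, dx


rng = random.Random(1)
a = [[rng.random() for _ in range(8)] for _ in range(8)]
b = [[a[(y - 2) % 8][(x - 3) % 8] for x in range(8)] for y in range(8)]
shift = phase_correlation(a, b)
print(shift)
```

Normalizing the cross-power spectrum to unit magnitude is what makes the correlation peak sharp. A practical implementation would use an FFT and handle non-cyclic borders, neither of which this sketch attempts.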
ISBN: 0769512720 (print)
This paper presents a new method for constructing models from a set of positive and negative sample images; the method requires no manual extraction of significant objects or features. Our model representation is based on two layers. The first consists of "generic" descriptors which represent sets of similar rotationally invariant feature vectors. Rotation invariance allows similar but rotated patterns to be grouped and makes the method robust to model deformations. The second layer is the joint probability on the frequencies of the "generic" descriptors over neighborhoods. This probability is multi-modal and is represented by a set of "spatial-frequency" clusters. It adds a statistical spatial constraint which is rotationally invariant. Our two-layer representation is novel; it efficiently captures "texture-like" visual structure. The selection of distinctive structure determines characteristic model features (common in the positive and rare in the negative examples) and increases the performance of the model. Models are retrieved and localized using a probabilistic score. Experimental results for "textured" animals and faces show very good performance for retrieval as well as localization.
ISBN: 0769512720 (print)
This paper presents a complete algorithm for building geocoded terrestrial mosaics from aerial video accompanied by GPS/INS readings, without relying on ground survey points or reference imagery to provide geographic control. The 2D mosaic-to-video frame mappings are jointly estimated by bundle adjustment of constraints from pose sensor data and interframe registrations. Multiple-swath video collections are handled by automatically registering spatially adjacent frames across swaths. The proposed approach optimally combines the pose and interframe constraints for geocoding, unlike existing 2D mosaic techniques, while avoiding the complexity of 3D reconstruction. The method was validated on two highly dissimilar operating scenarios, with quantitative evaluation against ground truth supplied by known geocoded reference imagery. One test over hilly terrain found median mosaic continuity and geocoding errors of 1.6 m and 3.1 m, respectively, which is acceptable for many tasks. Characterization of such errors is essential for acceptance of this video mosaic process in critical geospatial applications.
Image preprocessing is an important step in the area of image processing and pattern recognition. In this paper we present a CNN image preprocessing algorithm which is especially suited for fingerprint recognition. It...
ISBN: 0769512720 (print)
Pointing at planar surfaces such as TV and computer monitors or projection screens can be a useful mode of interaction between humans and machines. To a large extent, what seems to hinder the use of vision in such practical applications is the difficulty of the computational task, which is typically defined as 3-D reconstruction from uncalibrated 2-D images of a non-static scene. We describe below two designs where, using one or two cameras, the target of pointing on a flat monitor or screen is identified without 3-D inference, using only image morphing and line intersection. This is accomplished by registering the images with the target plane. When the method is used to identify a pointing target on a surface hidden from the camera (e.g., a computer monitor which supports the camera itself, as in most PC configurations), we add aperture(s) coplanar with the target surface in front of the camera(s). We describe experimental results showing a fully automated procedure for pointing target detection with high accuracy. The simplicity of our method and its robustness, as well as the relative accuracy of our results, can make pointing a practical means of human-machine interaction.
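The abstract names line intersection as one of its two ingredients. The following is a minimal sketch of that generic primitive only, written in homogeneous coordinates. It assumes each line is given by two 2-D points already expressed in the coordinates of the target plane. How the paper obtains those lines is not stated here and is not reproduced, and the function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line through 2-D points p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersect(line1, line2):
    """Intersection point of two homogeneous lines, or None if parallel."""
    x, y, w = cross(line1, line2)
    if abs(w) < 1e-12:
        return None  # parallel or identical lines: no unique intersection
    return (x / w, y / w)


# Two hypothetical pointing lines registered to the same target plane.
l1 = line_through((0, 0), (2, 2))
l2 = line_through((0, 2), (2, 0))
target = intersect(l1, l2)
print(target)
```

The homogeneous form avoids special cases for vertical lines, and parallel lines show up simply as a zero third coordinate.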
Dynamic textures are sequences of images that exhibit some form of temporal stationarity, such as waves, steam, and foliage. We pose the problem of recognizing and classifying dynamic textures in the space of dynamical systems where each dynamic texture is uniquely represented. Since the space is non-linear, a distance between models must be defined. We examine three different distances in the space of autoregressive models and assess their power.
ISBN: 0769512720 (print)
In this paper we propose a new 3D kernel for the recovery of 3D-orientation signatures. The kernel is a Gaussian function defined in local spherical coordinates, and its Cartesian support has the shape of a truncated cone with its axis in the radial direction and very small angular support. A set of such kernels is obtained by uniformly sampling the 2D space of polar and azimuth angles. The projection of a local neighborhood onto such a kernel set produces a local 3D-orientation signature. In the case of spatiotemporal analysis, such a kernel set can be applied either to the derivative space of a local neighborhood or to the local Fourier transform. The well-known planes arising from one or multiple motions produce maxima in the orientation signature. Due to the kernel's local support, spatiotemporal signatures possess higher orientation resolution than 3D steerable filters, and motion maxima can be detected and localized more accurately. We describe and show in experiments the superiority of the proposed kernels over the Hough transform and EM-based multiple motion detection.