In this contribution we introduce a new model-free method for object tracking. The tracking is posed as a segmentation problem which we solve using the watershed algorithm. A framework is defined to compute the required topographic surface from distances to the predicted contour, intensity edges and motion edges. This multifeature tracking approach yields accurate results in the presence of object corners, image clutter, and camera motion. Results on real sequences confirm the stability and robustness of the method. Objects are tracked over long sequences and in the presence of fast object motion.
We define a process called congealing in which elements of a dataset (images) are brought into correspondence with each other jointly, producing a data-defined model. It is based upon minimizing the summed component-wise (pixel-wise) entropies over a continuous set of transforms on the data. One of the by-products of this minimization is a set of transforms, one associated with each original training sample. We then demonstrate a procedure for effectively bringing test data into correspondence with the data-defined model produced in the congealing process. Subsequently, we develop a probability density over the set of transforms that arose from the congealing process. We suggest that this density over transforms may be shared by many classes, and demonstrate how this density can serve as 'prior knowledge' to develop a classifier based on only a single training example for each class.
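The congealing objective can be illustrated with a small sketch. It assumes binary, flattened 1-D "images" and replaces the continuous transform set of the abstract with discrete cyclic shifts; the function names (`pixelwise_entropy_sum`, `shift`, `congeal`) and the greedy coordinate-descent loop are illustrative choices, not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def pixelwise_entropy_sum(images: Sequence[Sequence[int]]) -> float:
    """Sum, over pixel positions, of the empirical binary entropy (bits) across the stack."""
    n = len(images)
    total = 0.0
    for column in zip(*images):  # one column = the same pixel in every image
        p = sum(column) / n
        for q in (p, 1.0 - p):
            if q > 0.0:
                total -= q * math.log2(q)
    return total


def shift(image: Sequence[int], k: int) -> List[int]:
    """Cyclically shift a flattened image by k positions (positive k shifts right)."""
    k %= len(image)
    if k == 0:
        return list(image)
    return list(image[-k:]) + list(image[:-k])


def congeal(images: Sequence[Sequence[int]], max_shift: int = 2,
            sweeps: int = 5) -> Tuple[List[List[int]], List[int]]:
    """Greedy coordinate descent: move one image at a time to lower the summed entropy.

    Assumption: the transform set is the discrete shifts -max_shift..max_shift.
    Returns the aligned stack and the accumulated shift applied to each image.
    """
    stack = [list(im) for im in images]
    shifts = [0] * len(stack)
    for _ in range(sweeps):
        changed = False
        for i, original in enumerate(list(stack)):
            best_k, best_e = 0, pixelwise_entropy_sum(stack)
            for k in range(-max_shift, max_shift + 1):
                candidate = stack[:i] + [shift(original, k)] + stack[i + 1:]
                e = pixelwise_entropy_sum(candidate)
                if e < best_e - 1e-12:
                    best_k, best_e = k, e
            if best_k != 0:
                stack[i] = shift(original, best_k)
                shifts[i] += best_k
                changed = True
        if not changed:
            break
    return stack, shifts
```

The returned per-image shifts play the role of the transforms that the abstract describes as a by-product of the minimization.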
We propose a template-matching method in which a parametric template space is constructed from a given set of template images and then quickly matched to a reference image. In this method, geometrical changes in an image object due to translation, rotation or scaling, and non-geometrical changes such as illumination variations or individual variations between objects, are represented by template images in the parametric template space. The method also provides a subpixel matching scheme that yields a high-precision estimate of the object position. Experiments using real images have confirmed the effectiveness of the method.
Multi-resolution techniques have been used in a wide range of vision applications. Unfortunately, the costly operation of building a proper pyramid strongly reduces its value as a tool for reducing computational cost. A new approach, the physical panoramic pyramid, is introduced in this paper. The physical panoramic pyramid measures multiple resolutions simultaneously, resulting in multi-resolution panoramic images. No computation is needed to construct these image pyramids. We also analyze general noise sensitivity in image pyramids, including the interaction of the loss of resolution, random background noise and aliasing noise. The paper also discusses indexing between neighboring layers, viewpoint variation, and applications of the physical panoramic pyramid.
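For contrast, the conventional computed pyramid whose cost the abstract refers to can be sketched as follows. This is a simplified illustration that assumes a 2x2 box average as the low-pass step (a proper pyramid would typically use a better filter); `downsample` and `build_pyramid` are hypothetical names.

```python
from typing import List

Image = List[List[float]]


def downsample(image: Image) -> Image:
    """Halve the resolution by averaging each non-overlapping 2x2 block."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image: Image) -> List[Image]:
    """Return [full resolution, half, quarter, ...] until a side drops below 2 pixels."""
    levels = [image]
    while len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(downsample(levels[-1]))
    return levels
```

Every level here is computed from the one below it, which is the per-frame work that a pyramid measured directly by the sensor would avoid.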
In this paper, we describe a statistical method for 3D object detection. We represent the statistics of both object appearance and 'non-object' appearance using a product of histograms. Each histogram represents the joint statistics of a subset of wavelet coefficients and their position on the object. Our approach is to use many such histograms representing a wide variety of visual attributes. Using this method, we have developed the first algorithm that can reliably detect human faces with out-of-plane rotation and the first algorithm that can reliably detect passenger cars over a wide range of viewpoints.
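One way to picture a product-of-histograms model is the sketch below. It assumes each sample is a tuple of already-quantized attribute values, that the object and non-object models are compared with a log-likelihood ratio, and that counts are Laplace-smoothed; all three are assumptions made for illustration, and `HistogramModel` is a hypothetical name rather than the authors' code.

```python
import math
from collections import Counter
from typing import Iterable, List, Sequence


class HistogramModel:
    """Product of per-attribute histograms over quantized attribute values."""

    def __init__(self, n_attributes: int, n_bins: int) -> None:
        self.n_bins = n_bins
        self.counts: List[Counter] = [Counter() for _ in range(n_attributes)]
        self.totals = [0] * n_attributes

    def fit(self, samples: Iterable[Sequence[int]]) -> None:
        """Accumulate one histogram per attribute index."""
        for sample in samples:
            for a, value in enumerate(sample):
                self.counts[a][value] += 1
                self.totals[a] += 1

    def log_prob(self, sample: Sequence[int]) -> float:
        """Log of the product of smoothed histogram probabilities."""
        return sum(
            math.log((self.counts[a][value] + 1) / (self.totals[a] + self.n_bins))
            for a, value in enumerate(sample)
        )


def log_likelihood_ratio(sample: Sequence[int], obj: HistogramModel,
                         non_obj: HistogramModel) -> float:
    """Positive values favour the object model, negative the non-object model."""
    return obj.log_prob(sample) - non_obj.log_prob(sample)
```

A detector built this way would threshold the ratio at each candidate window; the threshold value is left open here.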
Humans have an innate ability to perceive symmetry, but it is not obvious how to automate this powerful insight. In this paper the mathematical theory of Frieze and wallpaper groups is used to extract visually meaningful building blocks (motifs) from a repeated pattern. A novel peak detection algorithm based on 'regions of dominance' is used to automatically detect the underlying translational lattice of a repeated pattern. Following automatic classification of the pattern's symmetry group, knowledge of the interplay between rotation, reflection, glide-reflection and translation in that group leads to a small set of candidate motifs that exhibit local symmetry consistent with the global symmetry of the entire pattern. Although other work has addressed detection of the translational lattice of a repeated pattern, ours is the first to seek a principled method for determining a representative motif. Experiments show that the resulting pattern motifs conform well with human perception.
We propose an on-line handwriting recognition approach that integrates local bottom-up constructs with a global top-down measure into a modular recognition engine. The bottom-up process uses local point features for hypothesizing character segmentations and the top-down part performs shape matching for evaluating the segmentations. The shape comparison, called Fisher segmental matching, is based on Fisher's linear discriminant analysis. Along with an efficient ligature modeling, the segmentations and their matching scores are integrated into a recognition engine termed Hypotheses Propagation Network, which runs a variant of the topological sort algorithm for graph search. The result is a system that is more shape-oriented, less dependent on local and temporal features, modular in construction, and open to a rich range of further extensions. Our system currently achieves a 95% recognition rate on cursive scripts with a 460-word dictionary.
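The idea of propagating scored segmentation hypotheses in topological order can be sketched generically. The code below is not the Hypotheses Propagation Network itself: it assumes the hypotheses form a DAG whose nodes are segmentation points and whose edges carry a character label and a matching score, and it finds the highest-scoring path with Kahn's topological sort. `best_path` and the edge layout are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Edge = Tuple[Hashable, Hashable, str, float]  # (from_node, to_node, label, score)


def best_path(edges: Sequence[Edge], source: Hashable,
              sink: Hashable) -> Optional[Tuple[float, List[str]]]:
    """Return (total score, labels) of the best source-to-sink path, or None."""
    out = defaultdict(list)
    indegree: Dict[Hashable, int] = defaultdict(int)
    nodes = {source, sink}
    for u, v, label, score in edges:
        out[u].append((v, label, score))
        indegree[v] += 1
        nodes.update((u, v))

    best: Dict[Hashable, Tuple[float, List[str]]] = {source: (0.0, [])}
    queue = deque(n for n in nodes if indegree[n] == 0)
    while queue:
        u = queue.popleft()  # all predecessors of u are already finalized
        for v, label, score in out[u]:
            if u in best:  # only propagate from nodes reachable from the source
                candidate = best[u][0] + score
                if v not in best or candidate > best[v][0]:
                    best[v] = (candidate, best[u][1] + [label])
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return best.get(sink)
```

Because each node is processed only after all of its predecessors, every hypothesis is extended exactly once, which is what makes a topological ordering attractive for this kind of search.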
A learning account for the problem of object recognition is developed within the PAC (Probably Approximately Correct) model of learnability. The proposed approach makes no assumptions about the distribution of the observed objects, but quantifies success relative to its past experience. Most importantly, the success of learning an object representation is naturally tied to the ability to represent it as a function of some intermediate representations extracted from the image. We evaluate this approach in a large-scale experimental study in which the SNoW learning architecture is used to learn representations for the 100 objects in the Columbia Object Image Database (COIL-100). The SNoW-based method is shown to outperform other methods in terms of recognition rates; its performance degrades gracefully when the training data contains fewer views and in the presence of occlusion noise.
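SNoW (Sparse Network of Winnows) is built around Winnow-style linear units with multiplicative, mistake-driven updates. The sketch below shows a single such unit under simple assumptions (sparse binary features given as index sets, threshold equal to the number of features, promotion factor 2); it is not the SNoW architecture or its actual parameter settings, and the `Winnow` class is a hypothetical illustration.

```python
from typing import Iterable, List, Set, Tuple


class Winnow:
    """A single Winnow unit over sparse binary features."""

    def __init__(self, n_features: int, alpha: float = 2.0) -> None:
        self.weights: List[float] = [1.0] * n_features
        self.threshold = float(n_features)  # assumed setting, not from the source
        self.alpha = alpha

    def predict(self, active: Set[int]) -> int:
        """Return 1 if the summed weight of the active features reaches the threshold."""
        return int(sum(self.weights[i] for i in active) >= self.threshold)

    def update(self, active: Set[int], label: int) -> None:
        """Mistake-driven update: promote on a missed positive, demote on a false positive."""
        if self.predict(active) == label:
            return
        factor = self.alpha if label == 1 else 1.0 / self.alpha
        for i in active:
            self.weights[i] *= factor

    def fit(self, data: Iterable[Tuple[Set[int], int]], epochs: int = 5) -> None:
        examples = list(data)
        for _ in range(epochs):
            for active, label in examples:
                self.update(active, label)
```

Only the weights of active features change on a mistake, which is why this style of learner suits sparse, high-dimensional feature spaces.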
This article presents a mathematical paradigm called Data Driven Markov Chain Monte Carlo (DDMCMC) for object recognition. The objectives of this paradigm are two-fold. Firstly, it realizes traditional 'hypothesis-and-test' methods through well-balanced Markov chain Monte Carlo (MCMC) dynamics, and thus achieves robust and globally optimal solutions. Secondly, it utilizes data-driven (bottom-up) methods in computer vision, such as Hough transform and data clustering, to design effective transition probabilities for Markov chain dynamics. This drastically improves the effectiveness of traditional MCMC algorithms in terms of two standard metrics: 'burn-in' period and 'mixing' rate. The article proceeds in three steps. Firstly, we analyze the structures of the solution space Ω for object recognition. Ω is decomposed into a large number of subspaces of varying dimensions in a hierarchy. Secondly, we use data-driven techniques to compute importance proposal probabilities in these spaces, each expressed in a non-parametric form using weighted samples or particles. Thirdly, Markov chains are designed to travel in such heterogeneous structured solution space, with both jump and diffusion dynamics. We use possibly the simplest objects, the 'Ψ-world', as an example to illustrate the concepts, and we briefly present results on an application of traffic sign detection.
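The role of a data-driven proposal can be illustrated with a toy Metropolis-Hastings independence sampler on a small discrete space. This sketch assumes the proposal distribution stands in for the bottom-up importance proposals mentioned above and ignores the jump and diffusion dynamics across subspaces of varying dimension; `acceptance` and `run_chain` are hypothetical names.

```python
import random
from typing import Dict, Hashable, List


def acceptance(target: Dict[Hashable, float], proposal: Dict[Hashable, float],
               current: Hashable, candidate: Hashable) -> float:
    """Metropolis-Hastings acceptance for an independence proposal q:
    min(1, p(candidate) * q(current) / (p(current) * q(candidate)))."""
    ratio = (target[candidate] * proposal[current]) / (
        target[current] * proposal[candidate])
    return min(1.0, ratio)


def run_chain(target: Dict[Hashable, float], proposal: Dict[Hashable, float],
              start: Hashable, steps: int, rng: random.Random) -> List[Hashable]:
    """Draw candidates from the proposal and accept or reject each one."""
    states = list(proposal)
    weights = [proposal[s] for s in states]
    current = start
    trace = []
    for _ in range(steps):
        candidate = rng.choices(states, weights=weights)[0]
        if rng.random() < acceptance(target, proposal, current, candidate):
            current = candidate
        trace.append(current)
    return trace
```

When the proposal matches the target exactly, every candidate is accepted and the chain mixes immediately; the closer a bottom-up proposal comes to the posterior, the fewer rejections the chain incurs, which is the intuition behind the shorter burn-in claimed in the abstract.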
The Core Experiment CE-Shape-1 for shape descriptors performed for the MPEG-7 standard gave a unique opportunity to compare various shape descriptors for non-rigid shapes with a single closed contour. There are two main differences with respect to other comparison results reported in the literature: (1) For each shape descriptor, the experiments were carried out by an institute that is in favor of this descriptor. This implies that the parameters for each system were optimally determined and the implementations were thoroughly tested. (2) It was possible to compare the performance of shape descriptors based on totally different mathematical approaches. A more theoretical comparison of these descriptors seems to be extremely hard. In this paper we report on the MPEG-7 Core Experiment CE-Shape-1.