ISBN (print): 0818684976
In this paper, we describe an algorithm for object recognition that explicitly models and estimates the posterior probability function, P(object | image). We have chosen a functional form of the posterior probability function that captures the joint statistics of local appearance and position on the object as well as the statistics of local appearance in the visual world at large. We use a discrete representation of local appearance consisting of approximately 10^6 patterns. We compute an estimate of P(object | image) in closed form by counting the frequency of occurrence of these patterns over various sets of training images. We have used this method for detecting human faces from frontal and profile views. The algorithm for frontal views has shown a detection rate of 93.0% with 88 false alarms on a set of 125 images containing 483 faces, combining the MIT test set of Sung and Poggio with the CMU test sets of Rowley, Baluja, and Kanade. The algorithm for detection of profile views has also demonstrated promising results.
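The counting idea in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the add-alpha smoothing, the assumption that position is uniformly distributed in the background, and the use of a summed log-likelihood ratio as a stand-in for the posterior are all illustrative assumptions.

```python
import math
from collections import Counter

def fit_counts(samples):
    """Count occurrences of each hashable sample (hypothetical helper)."""
    return Counter(samples)

def log_likelihood_ratio(observed, obj_counts, bg_counts,
                         n_patterns, n_positions, alpha=1.0):
    """Sum of log [P(pattern, pos | object) / P(pattern, pos | background)].

    Assumptions (not from the abstract): add-alpha smoothing, and a
    background model in which position is uniform and independent of
    the pattern, so P(pattern, pos | background) = P(pattern) / n_positions.
    """
    obj_total = sum(obj_counts.values())
    bg_total = sum(bg_counts.values())
    score = 0.0
    for pattern, pos in observed:
        p_obj = (obj_counts[(pattern, pos)] + alpha) / (
            obj_total + alpha * n_patterns * n_positions)
        p_bg = (bg_counts[pattern] + alpha) / (
            bg_total + alpha * n_patterns) / n_positions
        score += math.log(p_obj / p_bg)
    return score

# Toy training data: (pattern_id, position) pairs seen on the object,
# and pattern ids seen in generic images.
obj_counts = fit_counts([(1, 0), (1, 0), (2, 1)])
bg_counts = fit_counts([3, 3, 2])

object_like = log_likelihood_ratio([(1, 0)], obj_counts, bg_counts, 4, 2)
background_like = log_likelihood_ratio([(3, 1)], obj_counts, bg_counts, 4, 2)
```

A positive score favors the object hypothesis and a negative score favors the background; a detector built this way would compare the score against a threshold chosen on validation data.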
ISBN (digital): 9781728193601
ISBN (print): 9781728193601
Automatic video production of sports aims at producing an aesthetic broadcast of sporting events. We present a new video system able to automatically produce a smooth and pleasant broadcast of basketball games using a single fixed 4K camera. The system automatically detects and localizes players, the ball, and referees to recognize the main action coordinates and game states, yielding a professional, cameraman-like production of the basketball event. We also release a fully annotated dataset consisting of single 4K camera and twelve-camera videos of basketball games.
ISBN (print): 0769523722
We present a class of statistical models for part-based object recognition that are explicitly parameterized according to the degree of spatial structure they can represent. These models provide a way of relating different spatial priors that have been used for recognizing generic classes of objects, including joint Gaussian models and tree-structured models. By providing explicit control over the degree of spatial structure, our models make it possible to study the extent to which additional spatial constraints among parts are actually helpful in detection and localization, and to consider the tradeoff in representational power and computational cost. We consider these questions for object classes that have substantial geometric structure, such as airplanes, faces and motorbikes, using datasets employed by other researchers to facilitate evaluation. We find that for these classes of objects, a relatively small amount of spatial structure in the model can provide statistically indistinguishable recognition performance from more powerful models, and at a substantially lower computational cost.
ISBN (print): 9781665448994
We study event-based sensors in the context of spacecraft guidance and control during a descent on Moon-like terrains. For this purpose, we develop a simulator reproducing the event-based camera outputs when exposed to synthetic images of a space environment. We find that it is possible to reconstruct, in this context, the divergence of optical flow vectors (and therefore the time to contact) and use it in a simple control feedback scheme during simulated descents. The results obtained are very encouraging, albeit insufficient to meet the stringent safety constraints and modelling accuracy imposed upon space missions. We thus conclude by discussing future work aimed at addressing these limitations.
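The link between flow divergence and time to contact mentioned in this abstract can be made concrete with a minimal sketch. It assumes a camera approaching a fronto-parallel surface along its optical axis, in which case the flow is (u, v) = (x/τ, y/τ) and the divergence ∂u/∂x + ∂v/∂y equals 2/τ. Some works define divergence as the average of the two derivatives, which changes the factor to 1/τ. The function names below are hypothetical and this is not the simulator or controller from the paper.

```python
def mean_divergence(u, v, spacing=1.0):
    """Average du/dx + dv/dy over interior pixels (central differences).

    u[r][c] and v[r][c] are the horizontal and vertical flow components;
    the grids must be at least 3x3 so that interior pixels exist.
    """
    rows, cols = len(u), len(u[0])
    total, count = 0.0, 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            du_dx = (u[r][c + 1] - u[r][c - 1]) / (2.0 * spacing)
            dv_dy = (v[r + 1][c] - v[r - 1][c]) / (2.0 * spacing)
            total += du_dx + dv_dy
            count += 1
    return total / count

def time_to_contact(u, v, spacing=1.0):
    """Time to contact as 2 / divergence (fronto-parallel assumption).

    Raises ZeroDivisionError if the flow field has zero mean divergence.
    """
    return 2.0 / mean_divergence(u, v, spacing)

# Synthetic expanding flow field with a known time to contact.
tau_true = 5.0
coords = range(-3, 4)
u = [[x / tau_true for x in coords] for _ in coords]
v = [[y / tau_true for _ in coords] for y in coords]
tau_est = time_to_contact(u, v)
```

With flow expressed in pixels per second, the estimate is in seconds. A simple feedback scheme could, for example, regulate thrust to hold the divergence near a set point, though the abstract does not specify the control law used.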
We propose a model-based tracking method, called appearance-guided particle filtering (AGPF), which integrates both sequential motion transition information and appearance information. A probability propagation model ...
ISBN (print): 9798350302493
Typical diffusion models are trained to accept a particular form of conditioning, most commonly text, and cannot be conditioned on other modalities without retraining. In this work, we propose a universal guidance algorithm that enables diffusion models to be controlled by arbitrary guidance modalities without the need to retrain any use-specific components. We show that our algorithm successfully generates quality images with guidance functions including segmentation, face recognition, object detection, and classifier signals. Code is available at ***/arpitbansal297/UniversalGuided-Diffusion.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
We propose SCVRL, a novel contrastive-based framework for self-supervised learning for videos. Unlike previous contrastive learning-based methods that mostly focus on learning visual semantics (e.g., CVRL), SCVRL is capable of learning both semantic and motion patterns. To this end, we reformulate the popular shuffling pretext task within a modern contrastive learning paradigm. We show that our transformer-based network has a natural capacity to learn motion in self-supervised settings and achieves strong performance, outperforming CVRL on four benchmarks.
ISBN (digital): 9781728193601
ISBN (print): 9781728193601
Recognition of Handwritten Mathematical Expressions (HMEs) is a challenging problem because of the complicated structure and uncommon math symbols contained in HMEs. Moreover, the lack of training data is a serious issue, especially for deep learning-based systems. In this paper, we propose a dual loss attention model that utilizes the existing LaTeX corpus to improve accuracy. The proposed dual loss attention model has two losses, a decoder loss and a context matching loss, to learn semantic invariant features for the encoder and LaTeX grammar for the decoder from handwritten and printed MEs. The results of experiments on the CROHME 2014 and 2016 databases demonstrate the superiority and effectiveness of our proposed model. These results are competitive with others reported in recent literature.
ISBN (print): 9781665448994
In this paper we present an extensive evaluation of instance segmentation in the context of images containing clothes. We propose a multi-level evaluation that complements the classical overlap criterion given by IoU. In particular, we quantify both the contour and color content accuracy of the predicted segmentation masks. We demonstrate that the proposed evaluation framework is relevant for obtaining meaningful insights into model performance through experiments conducted on five state-of-the-art instance segmentation methods.
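For reference, the classical IoU overlap criterion that this abstract builds on can be sketched for binary masks as follows. The function name and the convention that two empty masks score 1.0 are assumptions made for illustration; the contour and color measures proposed in the paper are not reproduced here.

```python
def mask_iou(pred, target):
    """Intersection over union of two same-sized binary masks.

    Masks are nested lists (rows of 0/1 or booleans).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += bool(p) and bool(t)
            union += bool(p) or bool(t)
    if union == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return intersection / union

pred_mask = [[1, 1],
             [0, 0]]
true_mask = [[1, 0],
             [0, 0]]
iou = mask_iou(pred_mask, true_mask)  # one shared pixel out of two covered
```

Because IoU only counts overlapping pixels, two predictions with the same score can differ markedly in boundary quality, which is the kind of gap a contour-level measure is meant to expose.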
In this paper we describe a practical approach to processor selection for embedded computer vision applications. This approach is based on expected production volumes and other requirements. We then present several po...