ISBN: 0769523722 (print)
We describe a method for training object detectors using a generalization of the cascade architecture, which results in a detection rate and speed comparable to that of the best published detectors while allowing for easier training and a detector with fewer features. In addition, the method allows for quickly calibrating the detector for a target detection rate, false positive rate or speed. One important advantage of our method is that it enables systematic exploration of the ROC Surface, which characterizes the trade-off between accuracy and speed for a given classifier.
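As a rough illustration of the trade-off described above, the following sketch evaluates a plain early-rejection cascade and reports one operating point (detection rate, false positive rate, mean number of stages evaluated as a proxy for speed). It is not the generalized architecture of the paper, whose details the abstract does not give; the names `cascade_detect` and `operating_point`, the stage representation, and the toy windows are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage representation: (score function, acceptance threshold).
Stage = Tuple[Callable[[Sequence[float]], float], float]


def cascade_detect(window: Sequence[float], stages: List[Stage]) -> Tuple[bool, int]:
    """Run a window through the stages; stop at the first stage that rejects it.

    Returns (accepted, number_of_stages_evaluated).
    """
    for index, (score, threshold) in enumerate(stages, start=1):
        if score(window) < threshold:
            return False, index  # early rejection saves the remaining stages
    return True, len(stages)


def operating_point(positives, negatives, stages):
    """Measure one point of the accuracy/speed trade-off for fixed thresholds."""
    pos = [cascade_detect(w, stages) for w in positives]
    neg = [cascade_detect(w, stages) for w in negatives]
    detection_rate = sum(ok for ok, _ in pos) / len(pos)
    false_positive_rate = sum(ok for ok, _ in neg) / len(neg)
    mean_cost = sum(n for _, n in pos + neg) / len(pos + neg)
    return detection_rate, false_positive_rate, mean_cost


if __name__ == "__main__":
    # Toy two-stage cascade: each stage thresholds one coordinate of the window.
    stages = [(lambda w: w[0], 0.5), (lambda w: w[1], 0.5)]
    positives = [(0.9, 0.8), (0.7, 0.4)]
    negatives = [(0.1, 0.9), (0.6, 0.7)]
    print(operating_point(positives, negatives, stages))
```

Sweeping the per-stage thresholds and recording the three values for each setting traces out a surface of operating points, which is one way to picture calibration toward a target detection rate, false positive rate or speed.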
ISBN: 9781424439942 (print)
In this paper, we focus on face recognition over image sets, where each set is represented by a linear subspace. Linear Discriminant Analysis (LDA) is adopted for discriminative learning. After investigating the relation between regularization on the Fisher Criterion and the Maximum Margin Criterion, we present a unified framework for regularized LDA. With the framework, the ratio-form maximization of regularized Fisher LDA can be reduced to a difference-form optimization with an additional constraint. By incorporating the empirical loss as the regularization term, we introduce a generalized Square Loss based Regularized LDA (SLR-LDA), with suggestions on parameter setting. Our approach outperforms state-of-the-art methods on face recognition. Its effectiveness is also verified in general object and object category recognition experiments.
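For orientation, the two criteria named above are usually written as follows in their standard, unregularized textbook forms. The symbols $W$ (projection matrix), $S_b$ (between-class scatter) and $S_w$ (within-class scatter) are conventional notation assumed here, not taken from the paper; the regularized variants and SLR-LDA itself are not specified in the abstract and are not reproduced.

```latex
% Fisher criterion (ratio form)
J_{F}(W) = \operatorname{tr}\!\left( \left( W^{\top} S_w W \right)^{-1} W^{\top} S_b W \right)

% Maximum Margin Criterion (difference form)
J_{\mathrm{MMC}}(W) = \operatorname{tr}\!\left( W^{\top} \left( S_b - S_w \right) W \right)
```

The first is a ratio of between-class to within-class scatter, the second a difference of the two, which is the ratio-form versus difference-form distinction the abstract refers to.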
ISBN: 9781424469840 (print)
Many computer vision and pattern recognition problems may be posed by defining a way of measuring dissimilarities between patterns. For many types of data, these dissimilarities are not Euclidean, and may not be metric. In this paper, we provide a means of embedding such data. We aim to embed the data on a hypersphere whose radius of curvature is determined by the dissimilarity data. The hypersphere can be either of positive curvature (elliptic) or of negative curvature (hyperbolic). We give an efficient method for solving the elliptic and hyperbolic embedding problems on symmetric dissimilarity data. This method gives the radius of curvature and a method for approximating the objects as points on a hyperspherical manifold. We apply our method to a variety of data including shape-similarity, graph-similarity and gesture-similarity data. In each case the embedding maintains the local structure of the data while placing the points in a metric space.
ISBN: 9781479943098 (print)
Recent interest in developing online computer vision algorithms is spurred in part by a growth of applications capable of generating large volumes of images and videos. These applications are rich sources of images and video streams. Online vision algorithms for managing, processing and analyzing these streams need to rely upon streaming concepts, such as pipelines, to ensure timely and incremental processing of data. This paper is a first attempt at defining a formal stream algebra that provides a mathematical description of vision pipelines and describes the distributed manipulation of image and video streams. We also show how our algebra can effectively describe the vision pipelines of two state-of-the-art techniques.
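To make the idea of a pipeline built from composable stream operators concrete, here is a minimal sketch using Python generators. It does not reproduce the paper's algebra, whose operators the abstract does not list; `smap`, `sfilter` and `pipeline` are hypothetical names chosen for this example, and integers stand in for frames.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

# A stream operator consumes one stream and lazily produces another.
Operator = Callable[[Iterable], Iterator]


def smap(fn: Callable) -> Operator:
    """Lift a per-item function to an operator over streams."""
    def op(stream: Iterable) -> Iterator:
        for item in stream:
            yield fn(item)
    return op


def sfilter(pred: Callable) -> Operator:
    """Keep only the items for which the predicate holds."""
    def op(stream: Iterable) -> Iterator:
        for item in stream:
            if pred(item):
                yield item
    return op


def pipeline(*ops: Operator) -> Callable[[Iterable], Iterable]:
    """Compose operators left to right into a single stream transformer."""
    def run(stream: Iterable) -> Iterable:
        return reduce(lambda s, op: op(s), ops, stream)
    return run


if __name__ == "__main__":
    # Integers stand in for frames; doubling stands in for a per-frame transform.
    process = pipeline(smap(lambda f: f * 2), sfilter(lambda f: f > 4))
    print(list(process([1, 2, 3, 4])))
```

Because each operator is a generator, items flow through one at a time, which matches the incremental processing that streaming pipelines are meant to ensure.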
ISBN: 9781538661000 (digital)
ISBN: 9781538661000 (print)
In this paper we address the problem of unconstrained Word Spotting in scene images. We train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images.
Author: Caglioti, V. (Politecn Milan, Dipartimento Elettron & Informazione, AI & Robot Project, I-20133 Milan, Italy)
ISBN: 0769506623 (print)
The space requirements for indexing under perspective projections are addressed. It is known that the surface representing the set of possible images of a model point set within the index space must be three-dimensional [1]. Under affine projections, the representing surface can be factorized as the Cartesian product of lower-dimensional surfaces: these are obtained by projecting the representing surface onto orthogonal subspaces of the index space [2] [5]. This paper shows that, under perspective, such a factorization does not exist, yielding a negative answer to a question left open in [1]. However, it is shown that there exist subspaces of the index space onto which the projection of the representing surface is two-dimensional.
ISBN: 0769506623 (print)
Towards the goal of realizing a generic automatic human activity recognition system, a new formalism is proposed. Activities are described by a chained hierarchical representation using three types of entities: image features, mobile object properties and scenarios. Taking image features of tracked moving regions from an image sequence as input, mobile object properties are first computed by specific methods while noise is suppressed by statistical methods. Scenarios are recognized from mobile object properties based on Bayesian analysis. A sequential occurrence of several scenarios is recognized by an algorithm using a probabilistic finite-state automaton (a variant of structured HMM). The demonstration of the optimality of these recognition methods is discussed. Finally, the validity and the effectiveness of our approach are demonstrated on both real-world and perturbed data.
ISBN: 9781728193601 (digital)
ISBN: 9781728193601 (print)
Automatic video production of sports aims at producing an aesthetic broadcast of sporting events. We present a new video system able to automatically produce a smooth and pleasant broadcast of basketball games using a single fixed 4K camera. The system automatically detects and localizes players, ball and referees to recognize the main action coordinates and game states, yielding a professional cameraman-like production of the basketball event. We also release a fully annotated dataset consisting of single 4K camera and twelve-camera videos of basketball games.
ISBN: 0818608625 (print)
The proceedings contain 140 papers. The following topics are dealt with: calibration; image analysis; stereo; edge and feature extraction; representation; motion; pattern recognition; image analysis; image processing and applications; geometry; motion; morphology; navigation; matching recognition; and parallel processing.
ISBN: 9780769549903 (print)
We have been researching three-dimensional (3D) ground-truth systems for performance evaluation of vision and perception systems in the fields of smart manufacturing and robot safety. In this paper we first present an overview of different systems that have been used to provide ground-truth (GT) measurements, and then we discuss the advantages of physically-sensed ground-truth systems for our applications. We then discuss in detail the three ground-truth systems that we have used in our experiments: ultra wide-band, indoor GPS, and a camera-based motion capture system. Finally, we discuss three different perception-evaluation experiments where we have used these GT systems.