ISBN: 0769523722 (print)
We describe a method for training object detectors using a generalization of the cascade architecture, which results in a detection rate and speed comparable to that of the best published detectors while allowing for easier training and a detector with fewer features. In addition, the method allows for quickly calibrating the detector for a target detection rate, false positive rate or speed. One important advantage of our method is that it enables systematic exploration of the ROC Surface, which characterizes the trade-off between accuracy and speed for a given classifier.
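The abstract does not spell out its generalization, so the following is only a minimal sketch of a plain (non-generalized) rejection cascade, not the authors' method. It illustrates why per-stage thresholds control the trade-off between accuracy and speed. The names `cascade_classify`, `stages`, and the scoring functions are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is (scoring function, rejection threshold). Both are illustrative
# assumptions; the abstract does not specify the stage classifiers.
Stage = Tuple[Callable[[Sequence[float]], float], float]


def cascade_classify(window: Sequence[float], stages: List[Stage]) -> Tuple[bool, int]:
    """Run a window through the stages in order.

    Returns (accepted, stages_evaluated). A window is rejected as soon as
    one stage scores it below that stage's threshold, so most negatives
    cost only a few stage evaluations.
    """
    for index, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, index  # early rejection: later stages are skipped
    return True, len(stages)


# Hypothetical two-stage cascade: a cheap mean test, then a stricter max test.
stages = [
    (lambda w: sum(w) / len(w), 0.5),
    (lambda w: max(w), 0.9),
]
```

Raising a stage's threshold rejects more windows early, which lowers the false positive rate and the average cost per window but also lowers the detection rate. Sweeping the thresholds is one simple way to trace out operating points of the kind the abstract describes.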
ISBN: 0769523722 (print)
The choice of a color space is of great importance for many computer vision algorithms (e.g. edge detection and object recognition). It induces the equivalence classes for the actual algorithms. However, the problem is how to automatically select the color space that produces the best result for a particular task. The subsequent difficulty is how to obtain a proper weighting scheme for the algorithms so that the results are combined in an optimal setting. To achieve proper color space selection and fusion of feature detectors, in this paper we propose a method that exploits non-perfect correlation between the color models, derived from the principles of diversification. As a consequence, the weighting scheme yields maximal color discrimination. The method is verified experimentally for two different feature detectors. The experimental results show that the model provides feature detection results with a discriminative power 30 percent higher than that of the standard weighting scheme.
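The abstract does not give the weighting formula. As a rough illustration only, the sketch below assumes that "diversification" refers to the classical minimum-variance weighting from portfolio theory, where weights are proportional to the inverse covariance matrix applied to a vector of ones. The function name `min_variance_weights` and the example covariance are hypothetical.

```python
import numpy as np


def min_variance_weights(cov):
    """Weights that minimize the variance of a weighted sum, subject to the
    weights summing to 1: w = C^{-1} 1 / (1^T C^{-1} 1).

    `cov` is the covariance matrix of the detector responses, one row and
    column per color model (an assumed setup, not taken from the paper).
    """
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # C^{-1} 1 without forming the inverse
    return raw / raw.sum()


# Two uncorrelated color models, the second four times noisier than the first.
weights = min_variance_weights([[1.0, 0.0], [0.0, 4.0]])
```

In this example the noisier model receives the smaller weight. When models are imperfectly correlated, the off-diagonal covariance terms shift the weights further, which is the effect the abstract attributes to diversification.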
ISBN: 0769523722 (print)
Dimensionality reduction via feature projection has been widely used in pattern recognition and machine learning. It is often beneficial to derive the projections based not only on the inputs but also on the target values in the training data set. This is of particular importance in predicting multivariate or structured outputs, which is an area of growing interest. In this paper we introduce a novel projection framework which is sensitive to both input features and outputs. Based on the derived features, prediction accuracy can be greatly improved. We validate our approach in two applications. The first is to model users' preferences on a set of paintings. The second application is concerned with image categorization where each image may belong to multiple categories. The proposed algorithm produces very encouraging results in both settings.
One of the classic problems in low level vision is image restoration. An important contribution toward this effort has been the development of shock filters by Osher and Rudin [15]. It performs image de-blurring using...
ISBN: 0780342364 (print)
Three different statistical models of colour data for use in segmentation or tracking algorithms are proposed. Results of a performance comparison of a tracking algorithm, applied to two separate applications, using each of the three different types of underlying model of the data are presented. From these a comparison of the performance of the statistical colour models themselves is obtained.
ISBN: 0769523722 (print)
We present an automotive-grade, real-time, vision-based Driver State Monitor. Upon detecting and tracking the driver's facial features, the system analyzes eye-closures and head pose to infer his/her fatigue or distraction. This information is used to warn the driver and to modulate the actions of other safety systems. The purpose of this monitor is to increase road safety by preventing drivers from falling asleep or from being overly distracted, and to improve the effectiveness of other safety systems.
ISBN: 0769523722 (print)
Land-use classification is an important problem in the remote sensing field, with a wide range of applications. In this paper we propose a hybrid method that fuses edge and region information for the land-use classification of multispectral images. It consists mainly of three steps: image pre-processing, initial segmentation, and region merging. In particular, a novel spatial mean shift procedure is proposed so that some information can be extracted and used in the subsequent steps. For multispectral image processing, we also design a band weighting strategy that adaptively gives a proper weight to each band according to the region being processed. Experimental results on Landsat TM and ETM+ images validate the performance of the proposed method.
ISBN: 0769523722 (print)
In this paper we present a method for computing the localization of a mobile robot with reference to a learning video sequence. The robot is first guided on a path by a human, while the camera records a monocular learning sequence. Then a 3D reconstruction of the path and the environment is computed off line from the learning sequence. The 3D reconstruction is then used for computing the pose of the robot in real time (30 Hz) in autonomous navigation. Results from our localization method are compared to the ground truth measured with a differential GPS.
ISBN: 0780342364 (print)
We present a new, efficient stereo algorithm addressing robust disparity estimation in the presence of occlusions. The algorithm is an adaptive, multi-window scheme using left-right consistency to compute disparity and its associated uncertainty. We demonstrate and discuss performance with both synthetic and real stereo pairs, and show how our results improve on those of closely related techniques for both robustness and efficiency.
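The left-right consistency test mentioned above can be sketched as follows. This is a generic version of the check, not the adaptive multi-window algorithm itself. It assumes rectified images, non-negative disparities, and the convention that a left pixel at column x with disparity d matches the right pixel at column x - d. The function name and the `tol` parameter are hypothetical.

```python
import numpy as np


def left_right_consistency(disp_left, disp_right, tol=1.0):
    """Return a boolean mask of left-image pixels whose disparity agrees
    with the disparity stored at the matching right-image pixel.

    Pixels whose match falls outside the image, or whose two disparities
    differ by more than `tol`, are marked False (likely occluded or wrong).
    """
    disp_left = np.asarray(disp_left, dtype=float)
    disp_right = np.asarray(disp_right, dtype=float)
    height, width = disp_left.shape
    rows, cols = np.indices((height, width))

    # Column of the matching pixel in the right image, rounded to an integer.
    target = np.rint(cols - disp_left).astype(int)
    in_bounds = (target >= 0) & (target < width)

    # Clip only so the lookup is valid; out-of-bounds pixels are masked below.
    matched = disp_right[rows, np.clip(target, 0, width - 1)]
    return in_bounds & (np.abs(disp_left - matched) <= tol)


# One-row example: the first pixel maps outside the image, the last disagrees.
mask = left_right_consistency([[1, 1, 1, 1]], [[1, 1, 5, 1]], tol=0.5)
```

Pixels that fail the check are typical occlusion candidates, since a point visible in only one image has no consistent match in the other.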