The three articles in this special section are selected papers from the IEEE CS Conference on Computer Vision and Pattern Recognition, which was held in Anchorage, AK, in June 2008.
ISBN: 9781424469840 (print)
We present a novel image operator that seeks to find the stroke width at each image pixel, and demonstrate its use on the task of text detection in natural images. The suggested operator is local and data-dependent, which makes it fast and robust enough to eliminate the need for multi-scale computation or scanning windows. Extensive testing shows that the suggested scheme outperforms the latest published algorithms. Its simplicity allows the algorithm to detect text in many fonts and languages.
ISBN: 0780342364 (print)
Three different statistical models of colour data are proposed for use in segmentation or tracking algorithms. A tracking algorithm is applied to two separate applications using each of the three underlying models of the data, and the results of the performance comparison are presented. From these results, a comparison of the performance of the statistical colour models themselves is obtained.
ISBN: 0780342364 (print)
We present a new, efficient stereo algorithm addressing robust disparity estimation in the presence of occlusions. The algorithm is an adaptive, multi-window scheme using left-right consistency to compute disparity and its associated uncertainty. We demonstrate and discuss its performance with both synthetic and real stereo pairs, and show how our results improve on those of closely related techniques in both robustness and efficiency.
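The left-right consistency idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's adaptive multi-window scheme and it does not estimate uncertainty. It only shows the basic check on a single scanline, assuming integer disparities and the convention that a left-image pixel at column `x` with disparity `d` matches the right-image pixel at column `x - d`. The function name and the `tolerance` parameter are hypothetical.

```python
def left_right_consistency(disp_left, disp_right, tolerance=1):
    """Flag pixels of one scanline whose left and right disparities agree.

    Assumed convention: left pixel x with disparity d matches right pixel
    x - d. A pixel passes only if the right image's disparity at that
    matched position is within `tolerance` of d. Pixels whose match falls
    outside the image, or whose disparities disagree (typical of
    occlusions), are flagged False.
    """
    valid = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matched column in the right image
        in_bounds = 0 <= xr < len(disp_right)
        valid.append(in_bounds and abs(disp_right[xr] - d) <= tolerance)
    return valid


# Toy scanline: pixel 4 carries an inconsistent disparity of 0.
disp_left = [2, 2, 2, 2, 0, 2]
disp_right = [2, 2, 2, 2, 2, 2]
mask = left_right_consistency(disp_left, disp_right)
```

In this toy scanline the first two pixels fail because their matches fall outside the right image, and pixel 4 fails because the two disparity maps disagree by more than the tolerance.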
ISBN: 9781424469840 (print)
We propose a face recognition approach based on hashing. The approach yields recognition rates comparable to those of the random ℓ1 approach [18], which is considered the state of the art, but it is much faster: up to 150 times faster than [18] on the YaleB dataset. We show that with hashing, the sparse representation can be recovered with high probability because hashing preserves the restricted isometry property. Moreover, we present a theoretical analysis of the recognition rate of the proposed hashing approach. Experiments show a very competitive recognition rate and a significant speedup compared with the state of the art.
ISBN: 9781424469840 (print)
Winder et al. [15, 14] have recently shown the superiority of the DAISY descriptor [12] over other widely used descriptors such as SIFT [8] and SURF [1]. Motivated by those results, we present a novel algorithm that extracts viewpoint- and illumination-invariant keypoints and describes them with a particular implementation of a DAISY-like layout. We demonstrate how to efficiently compute the scale-space and re-use this information for the descriptor. Comparison with similar approaches such as SIFT and SURF shows higher precision vs. recall performance for the proposed method. Moreover, we reduce the computational cost by factors of 6 and 3, respectively. We also demonstrate the use of the proposed method in computer vision applications.
ISBN: 9781424469840 (print)
State-of-the-art motion estimation algorithms suffer from three major problems: poorly textured regions, occlusions, and small-scale image structures. Based on the Gestalt principles of grouping, we propose to incorporate a low-level image segmentation process in order to tackle these problems. Our new motion estimation algorithm is based on non-local total variation regularization, which allows us to integrate the low-level image segmentation process in a unified variational framework. Numerical results on the Middlebury optical flow benchmark data set demonstrate that we can cope with the aforementioned problems.
ISBN: 9781424469840 (print)
The design of robust classifiers, which can contend with the noisy and outlier-ridden datasets typical of computer vision, is studied. It is argued that such robustness requires loss functions that penalize both large positive and large negative margins. The probability elicitation view of classifier design is adopted, and a set of necessary conditions for the design of such losses is identified. These conditions are used to derive a novel robust Bayes-consistent loss, denoted Tangent loss, and an associated boosting algorithm, denoted TangentBoost. Experiments with data from the computer vision problems of scene classification, object tracking, and multiple instance learning show that TangentBoost consistently outperforms previous boosting algorithms.
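The claim that a robust loss should penalize both large positive and large negative margins can be made concrete with a small numerical sketch. The abstract does not state the formula, so the form used here, φ(v) = (2·arctan(v) − 1)², is an assumption taken from how the Tangent loss is commonly cited. The exponential loss is included only as a familiar point of contrast, and the function names are hypothetical.

```python
import math


def tangent_loss(margin):
    """Assumed form of the Tangent loss: (2 * arctan(v) - 1) ** 2.

    The formula is not given in the abstract; it is an assumption here.
    """
    return (2.0 * math.atan(margin) - 1.0) ** 2


def exponential_loss(margin):
    """Exponential loss exp(-v), shown only for contrast."""
    return math.exp(-margin)


# Under the assumed form, the loss is zero at v = tan(1/2), grows again
# for larger positive margins, and stays bounded for very negative margins.
for v in (-100.0, -1.0, 0.0, math.tan(0.5), 1.0, 10.0):
    print(f"v={v:8.3f}  tangent={tangent_loss(v):8.4f}")
```

With this assumed form, an outlier with a very negative margin contributes at most (π + 1)² to the total loss, whereas its exponential loss grows without bound. That boundedness is the robustness property the abstract argues for.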
ISBN: 9781424469840 (print)
We present a method to classify and localize human actions in video using a Hough transform voting framework. Random trees are trained to learn a mapping between densely sampled feature patches and their corresponding votes in a spatio-temporal-action Hough space. The leaves of the trees form a discriminative multi-class codebook that shares features between the action classes and votes for action centers in a probabilistic manner. Using low-level features such as gradients and optical flow, we demonstrate that Hough voting can achieve state-of-the-art performance on several datasets covering a wide range of action-recognition scenarios.
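The voting step described in this abstract can be sketched in a few lines. This is an illustration only: the trained random trees are replaced by a hand-written `codebook` dictionary, each patch is assumed to arrive already assigned to a leaf, and all names (`hough_vote`, `best_hypothesis`, the action labels) are hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def hough_vote(patches, codebook):
    """Accumulate weighted votes for (action, centre) hypotheses.

    patches:  list of (leaf_id, (x, y, t)) pairs; leaf_id stands in for
              the leaf a patch would reach in a trained random tree.
    codebook: dict mapping leaf_id to a list of
              (action, (dx, dy, dt), weight) votes stored at that leaf.
    """
    accumulator = defaultdict(float)
    for leaf_id, (x, y, t) in patches:
        for action, (dx, dy, dt), weight in codebook.get(leaf_id, []):
            # Each vote points from the patch to a candidate action centre.
            accumulator[(action, (x + dx, y + dy, t + dt))] += weight
    return accumulator


def best_hypothesis(accumulator):
    """Return the ((action, centre), score) pair with the highest score."""
    return max(accumulator.items(), key=lambda item: item[1])


# Toy codebook: leaf 0 is shared between two action classes.
codebook = {
    0: [("wave", (1, 0, 0), 0.8), ("run", (0, 1, 0), 0.2)],
    1: [("wave", (-1, 0, 0), 0.6)],
}
patches = [(0, (4, 5, 2)), (1, (6, 5, 2)), (0, (0, 0, 0))]
hypothesis, score = best_hypothesis(hough_vote(patches, codebook))
```

Two patches on opposite sides of the same centre reinforce the hypothesis `("wave", (5, 5, 2))`, which is how sparse local evidence adds up to a localized detection in a voting framework.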