Owing to its invariance to monotonic grayscale transformations and its simple computation, the Local Binary Pattern (LBP) has been broadly used as a feature extractor in face recognition tasks in recent years [3]. In previous work, peopl...
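The invariance property mentioned above can be checked with a small sketch. It assumes the basic 3x3, 8-neighbor LBP operator with a `>=` threshold against the center pixel; the function names, the neighbor ordering, and the toy image are illustrative assumptions and are not taken from the cited work.

```python
# Minimal sketch of the basic 3x3 LBP operator (assumed variant, pure Python).
# Neighbors are visited clockwise starting at the top-left pixel; this bit
# ordering is an arbitrary choice made for the illustration.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the pixel at (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        # A bit is set when the neighbor is at least as bright as the center.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_image(img):
    """Compute LBP codes for all interior pixels of a 2D list of gray values."""
    rows, cols = len(img), len(img[0])
    return [[lbp_code(img, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]


def apply_pointwise(img, func):
    """Apply a gray-level mapping to every pixel."""
    return [[func(v) for v in row] for row in img]
```

Because the code depends only on the outcome of comparisons between each neighbor and the center, any strictly increasing gray-level mapping (for example `v -> 2 * v + 5`) leaves every LBP code unchanged.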
The objective of this work is to automatically generate a large number of images for a specified object class. A multimodal approach employing text, metadata, and visual features is used to gather many high-quality images from the Web. Candidate images are obtained by a text-based Web search querying on the object identifier (e.g., the word penguin). The Web pages and the images they contain are downloaded. The task is then to remove irrelevant images and rerank the remainder. First, the images are reranked based on the text surrounding the image and metadata features. A number of methods are compared for this reranking. Second, the top-ranked images are used as (noisy) training data and an SVM visual classifier is learned to improve the ranking further. We investigate the sensitivity of the cross-validation procedure to this noisy training data. The principal novelty of the overall method is in combining text/metadata and visual features in order to achieve a completely automatic ranking of the images. Examples are given for a selection of animals, vehicles, and other classes, totaling 18 classes. The results are assessed by precision/recall curves on ground-truth annotated data and by comparison to previous approaches, including those of Berg and Forsyth [5] and Fergus et al. [12].
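The precision/recall assessment of a ranked image list can be sketched as follows. This is a minimal illustration, assuming binary ground-truth labels (1 for an in-class image, 0 otherwise) listed in ranked order; the function name and the example ranking are hypothetical and do not reproduce the paper's data or evaluation code.

```python
# Minimal sketch: precision and recall at every cut-off of a ranked list.
# The input is assumed to hold ground-truth labels in ranked order,
# best-ranked image first.
def precision_recall_points(ranked_labels):
    """Return a list of (precision, recall) pairs, one per rank cut-off k."""
    total_positives = sum(ranked_labels)
    if total_positives == 0:
        raise ValueError("the ranking contains no positive examples")
    points = []
    hits = 0
    for k, label in enumerate(ranked_labels, start=1):
        hits += label
        precision = hits / k              # fraction of the top-k that is relevant
        recall = hits / total_positives   # fraction of all relevant images found
        points.append((precision, recall))
    return points
```

A better reranking moves in-class images toward the front of the list, which raises precision at low recall; comparing such curves before and after a reranking stage is one way to read the kind of evaluation described above.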
In this paper, a novel methodology for the representation and similarity measurement of sequential data is presented. First, a linear segmentation algorithm based on feature points is proposed. Then, two similarity ...
ISBN: 9789898425386 (print)
In this paper we present our approach to the selection of regions of interest in colonoscopy videos, which consists of three stages: Region Segmentation, Region Description and Region Classification, focusing on the Region Segmentation stage. As part of our segmentation scheme, we introduce a region merging algorithm that takes into account our model of polyp appearance. As the results show, the output of this stage reduces the number of final regions and indicates the degree of information of these regions. Our approach appears to outperform state-of-the-art methods. Our results can be used to identify polyp-containing regions in the later stages.
Accurate detection of moving objects provides a fundamental capability that drives numerous high-level computer vision applications. In this paper, a novel algorithm is proposed to detect objects in widely varying ther...
ISBN: 9783642212567; 9783642212574 (print)
In this paper we use the erosion and dilation operators for characterizing 3D polygonal objects. The goal is to perform a similarity search in a set of distinct objects. The method applies successive dilations and erosions of the meshes in order to compute the difference volume as a function of the size of the structuring element. Because of appropriate pre-processing, the resulting function is invariant to translation, rotation and mesh resolution. On a set of 32 complex objects with different mesh resolutions, the method achieved an average ranking rate of 1.47, with 23 objects ranked first and 6 objects ranked second.
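One possible reading of the difference-volume function is sketched below. It is only an illustration under strong assumptions: the object is represented as a set of integer voxel coordinates rather than a polygonal mesh, the structuring element is a cube of half-width `radius`, and the "difference volume" is taken as the voxel count of the dilation minus that of the erosion. None of these choices, nor the function names, come from the paper.

```python
from itertools import product


def cube_offsets(radius):
    """All integer offsets inside a cube structuring element of half-width radius."""
    rng = range(-radius, radius + 1)
    return list(product(rng, rng, rng))


def dilate(voxels, radius):
    """Dilation: union of the structuring element placed at every object voxel."""
    offsets = cube_offsets(radius)
    return {(x + dx, y + dy, z + dz)
            for (x, y, z) in voxels
            for (dx, dy, dz) in offsets}


def erode(voxels, radius):
    """Erosion: keep voxels whose whole cubic neighborhood lies inside the object."""
    offsets = cube_offsets(radius)
    return {(x, y, z) for (x, y, z) in voxels
            if all((x + dx, y + dy, z + dz) in voxels
                   for (dx, dy, dz) in offsets)}


def difference_volume_signature(voxels, max_radius):
    """Difference volume (dilated minus eroded voxel count) for radius 1..max_radius."""
    return [len(dilate(voxels, r)) - len(erode(voxels, r))
            for r in range(1, max_radius + 1)]
```

In this voxel setting the signature is unchanged by integer translations of the object, since both operators commute with translation. Rotation and resolution invariance, which the abstract attributes to pre-processing, are not addressed by the sketch.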
The quality of a mosaic depends on the projective alignment of the images involved. After point-correspondences between the images have been established, bundle adjustment finds an alignment considered optimal under c...
ISBN: 9780889868656 (print)
Lane detection is an important application of driver assistance. In this paper, a new technique for detecting lane markers that is able to cope with many complex conditions is presented. These conditions include dynamic illumination, scattered shadows, and the presence of neighboring vehicles. The input image is first pre-processed with a perspective removal transformation followed by a color space conversion. Then, the core elements of the proposed technique, consisting of template matching, lane region merging, elliptical projections, and parametric tracking, are explained. A formal error metric used in performance evaluation is also introduced. Finally, quantitative analyses show that the developed system performs well in real-world driving conditions with variations in illumination, traffic, and road surface quality.
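The perspective removal step can be illustrated with a planar homography applied to image points. This is a sketch under assumptions: the abstract does not say how the transformation is obtained, so the 3x3 matrix `H` is treated as given (in practice it would typically come from camera calibration), and the function name is hypothetical.

```python
# Minimal sketch: map an image point through a 3x3 homography H, given as a
# list of three rows. H is an assumed input; the paper's actual perspective
# removal parameters are not stated in the abstract.
def apply_homography(H, x, y):
    """Return the transformed point (x', y') for image point (x, y)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        # The point maps to infinity (it lies on the horizon line of the mapping).
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (xh / w, yh / w)
```

Applying such a mapping to every pixel yields a top-down view in which lane markers that converge in the camera image become roughly parallel, which is what makes steps such as template matching simpler afterwards.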
In this paper, we propose an automatic approach to simultaneously name faces and discover scenes in TV shows. We follow the multi-modal idea of utilizing script to assist video content understanding, but without using...
The codebook based (bag-of-words) model is a widely applied model for image classification. We analyze recent coding strategies in this model, and find that saliency is the fundamental characteristic of coding. The sa...