ISBN: (print) 9781424439942
We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task since the BSL alphabet involves both hands occluding each other and contains signs which are ambiguous from the observer's viewpoint. The main contributions of our work include: (i) recognition based on hand shape alone, not requiring motion cues; (ii) robust visual features for hand shape recognition; (iii) scalability to large lexicon recognition with no re-training. We report results on a dataset of 1,000 low-quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%.
ISBN: (print) 9781424439942
Laughter detection is an important area of interest in the Affective Computing and Human-Computer Interaction fields. In this paper we propose a multi-modal methodology, based on the fusion of audio and visual cues, to deal with the laughter recognition problem in face-to-face conversations. The audio features are extracted from the spectrogram, and the video features are obtained by estimating the degree of mouth movement and using a smile and laughter classifier. Finally, the multi-modal cues are included in a sequential classifier. Results over videos from the public discussion blog of the New York Times show that both types of features perform better when considered together by the classifier. Moreover, the sequential methodology is shown to significantly outperform the results obtained by an AdaBoost classifier.
ISBN: (print) 9781424439942
Contextual models play a very important role in the task of object recognition. Over the years, two kinds of contextual models have emerged: models with contextual inference based on the statistical summary of the scene (we will refer to these as Scene Based Context models, or SBC), and models representing the context in terms of relationships among objects in the image (Object Based Context, or OBC). In designing object recognition systems, it is necessary to understand the theoretical and practical properties of such approaches. This work provides an analysis of these models and evaluates two of their representatives using the LabelMe dataset. We demonstrate a considerable margin of improvement using the OBC style approach.
ISBN: (print) 9781424439942
Variations in pose, expression, illumination, aging and disguise are considered major challenges in face recognition, and several techniques have been proposed to address them. Plastic surgery, on the other hand, is considered an arduous research issue; however, it has not yet been studied either theoretically or experimentally. This paper focuses on analyzing the effect of plastic surgery on face recognition algorithms. The preliminary study provides an experimental and analytical comparison of face recognition algorithms on a plastic surgery database of 506 individuals. The experimental results indicate that existing face recognition algorithms perform poorly when matching pre- and post-surgery face images. The results also suggest that future face recognition systems must be able to address this issue, and hence that more research is needed in this area.
ISBN: (print) 9781424439942
This paper addresses large-displacement-diffeomorphic mapping registration from an optimal control perspective. This viewpoint leads to two complementary formulations. One approach requires the explicit computation of coordinate maps, whereas the other is formulated strictly in the image domain (thus making it also applicable to manifolds which require multiple coordinate charts). We discuss their intrinsic relation as well as the advantages and disadvantages of the two approaches. Further, we propose a novel formulation for unbiased image registration, which naturally extends to the case of time series of images. We discuss numerical implementation details and carefully evaluate the properties of the alternative algorithms.
The four papers in this special section are extended versions of award-winning papers from the 2007 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2007).
ISBN: (print) 9781424439928
This paper presents a unified framework for object detection, segmentation, and classification using regions. Region features are appealing in this context because: (1) they encode shape and scale information of objects naturally; (2) they are only mildly affected by background clutter. Regions have not been popular as features due to their sensitivity to segmentation errors. In this paper, we start by producing a robust bag of overlaid regions for each image using Arbelaez et al., CVPR 2009. Each region is represented by a rich set of image cues (shape, color and texture). We then learn region weights using a max-margin framework. In detection and segmentation, we apply a generalized Hough voting scheme to generate hypotheses of object locations, scales and support, followed by a verification classifier and a constrained segmenter on each hypothesis. The proposed approach significantly outperforms the state of the art on the ETHZ shape database (87.1% average detection rate compared to Ferrari et al.'s 67.2%), and achieves competitive performance on the Caltech 101 database.
ISBN: (print) 9781424439942
Detecting suspicious events from video surveillance cameras has recently become an important task. Many trajectory-based descriptors have been developed, for example to detect people running or moving in the opposite direction. However, these trajectory-based descriptors do not work well in crowded environments such as airports and rail stations, because they assume perfect motion/object segmentation. In this paper, we present an event detection method using a dynamic texture descriptor. The dynamic texture descriptor is an extension of local binary patterns. The image sequences are divided into regions. A flow is formed based on the similarity of the dynamic texture descriptors on the regions. We used a real dataset for experiments. The results are promising.
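For readers unfamiliar with local binary patterns, the following is a minimal sketch of the basic 8-neighbour spatial operator only. It is not the dynamic texture descriptor of the paper, which extends the idea to image sequences; the function names `lbp_codes` and `lbp_histogram`, the neighbour ordering, and the `>=` comparison are illustrative assumptions.

```python
def lbp_codes(image):
    """Return the 8-neighbour LBP code of every interior pixel.

    `image` is a 2D list of grey levels. Border pixels are skipped
    because they lack a full 3x3 neighbourhood.
    """
    # Neighbour offsets (row, col), clockwise from the top-left pixel.
    # The ordering is an arbitrary but fixed convention (assumption).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright
                # as the centre pixel.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(image):
    """Histogram of LBP codes (256 bins), a common region descriptor."""
    hist = [0] * 256
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
    return hist
```

Histograms computed per region in this way can be compared between regions with any histogram distance; how the paper measures similarity is not stated in the abstract.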
ISBN: (print) 9781424439942
In the field of biometrics, evaluation of the quality of biometric samples has a number of important applications. The main applications include (1) rejecting poor-quality images during acquisition, (2) serving as an enhancement metric, and (3) acting as a weighting factor in fusion schemes. Since a biometric-based recognition system relies on measures of performance such as matching scores and recognition probability of error, it is intuitive that the metrics evaluating biometric sample quality have to be linked to the recognition performance of the system. The goal of this work is to design a method for evaluating and ranking various quality metrics applied to biometric images or signals based on their ability to predict the recognition performance of a biometric recognition system. The proposed method involves: (1) a preprocessing algorithm operating on pairs of quality scores and generating relative scores, (2) an adaptive multivariate mapping relating quality scores and measures of recognition performance, and (3) a ranking algorithm that selects the best combinations of quality measures. The performance of the method is demonstrated on face and iris biometric data.
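The idea of ranking quality metrics by how well they predict recognition performance can be illustrated with a much simpler stand-in than the paper's method. The sketch below ranks each metric by its Spearman rank correlation with a per-sample match score. This swaps in plain rank correlation for the adaptive multivariate mapping described above, and all names (`rank_quality_metrics`, `spearman`, the metric names in the usage note) are hypothetical.

```python
def average_ranks(values):
    """1-based ranks of `values`, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the group of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def rank_quality_metrics(quality_scores, match_scores):
    """Order metric names from most to least predictive of match score.

    `quality_scores` maps a metric name to one score per sample;
    `match_scores` holds the recognition score of the same samples.
    """
    strength = {name: spearman(vals, match_scores)
                for name, vals in quality_scores.items()}
    return sorted(strength, key=lambda name: strength[name], reverse=True)
```

For example, `rank_quality_metrics({"sharpness": [0.1, 0.4, 0.5, 0.9], "contrast": [0.9, 0.2, 0.8, 0.1]}, [0.2, 0.5, 0.6, 0.95])` places the hypothetical `sharpness` metric first, because its ordering of the samples matches the ordering of the match scores.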
ISBN: (print) 9781424439942
We describe a framework for face recognition at a distance based on sparse-stereo reconstruction. We develop a 3D acquisition system that consists of two CCD stereo cameras mounted on pan-tilt units with adjustable baseline. We first detect the facial region and extract its landmark points, which are used to initialize an AAM mesh fitting algorithm. The fitted mesh vertices provide point correspondences between the left and right images of a stereo pair; stereo-based reconstruction is then used to infer the 3D information of the mesh vertices. We perform experiments on the use of different features extracted from these vertices for face recognition. The cumulative rank curves (CMC) generated using the proposed framework confirm the feasibility of the proposed work for long-distance recognition of human faces with respect to the state of the art.
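The step from matched mesh vertices to 3D coordinates can be sketched for the simplest case. The example below assumes an idealized rectified stereo pair with identical pinhole cameras, where depth follows from disparity as Z = f·B/d. The abstract does not state the camera model or calibration used, so the function names and parameters (`focal_px`, `baseline_m`, principal point `cx`, `cy`) are assumptions for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, x_left, x_right):
    """Depth in metres of a point seen at column x_left / x_right.

    Assumes rectified images, so a point appears on the same row in both
    views and shifts only horizontally by the disparity.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front "
                         "of both cameras")
    return focal_px * baseline_m / disparity


def triangulate(focal_px, baseline_m, cx, cy, left_pt, right_pt):
    """Return (X, Y, Z) in the left camera frame for one correspondence.

    `left_pt` and `right_pt` are (column, row) pixel coordinates of the
    same mesh vertex in the left and right images.
    """
    (x_left, y_left), (x_right, _) = left_pt, right_pt
    z = depth_from_disparity(focal_px, baseline_m, x_left, x_right)
    # Back-project the left-image pixel through the pinhole model.
    x = (x_left - cx) * z / focal_px
    y = (y_left - cy) * z / focal_px
    return (x, y, z)
```

With a focal length of 1000 pixels and a 0.5 m baseline, a vertex with a 10-pixel disparity lies 50 m away, which illustrates why a wider baseline helps at long range: for a fixed depth, disparity grows in proportion to the baseline.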