ISBN: 9781509014378 (print)
We present a polarimetric thermal face database, the first of its kind, for face recognition research. This database was acquired using a polarimetric longwave infrared imager, specifically a division-of-time spinning achromatic retarder system. A corresponding set of visible spectrum imagery was also collected, to facilitate cross-spectrum (also referred to as heterogeneous) face recognition research. The database consists of imagery acquired at three distances under two experimental conditions: neutral/baseline condition, and expressions condition. Annotations (spatial coordinates of key fiducial points) are provided for all images. Cross-spectrum face recognition performance on the database is benchmarked using three techniques: partial least squares, deep perceptual mapping, and coupled neural networks.
Video surveillance systems generated about 65% of the Universe Big Data in 2015. The development of systems for intelligent analysis of such a large amount of data is among the most investigated topics in academia and the commercial world. Recent results in knowledge management and computational intelligence demonstrate the effectiveness of semantic technologies in several fields, such as image and text analysis, handwriting recognition, and speech recognition. This paper presents a solution that, starting from the output of a people-tracking algorithm, recognizes both simple events (a person falling to the ground) and complex ones (person aggression). The proposed solution uses semantic web technologies to automatically annotate the output produced by the tracking algorithm; a set of rules for reasoning on the annotated data is also proposed. These rules make it possible to define complex analytics functions, demonstrating the effectiveness of hybrid approaches for event recognition.
Motivated by the success of CNNs in object recognition on images, researchers are striving to develop CNN equivalents for learning video features. However, learning video features globally has proven to be quite a challenge due to the difficulty of getting enough labels, processing large-scale video data, and representing motion information. Therefore, we propose to leverage effective techniques from both data-driven and data-independent approaches to improve action recognition systems. Our contribution is three-fold. First, we explicitly show that local handcrafted features and CNNs share the same convolution-pooling network structure. Second, we propose to use independent subspace analysis (ISA) to learn descriptors for state-of-the-art handcrafted features. Third, we enhance ISA with two new improvements, which make our learned descriptors significantly outperform the handcrafted ones. Experimental results on standard action recognition benchmarks show competitive performance.
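As a rough illustration of the filter-then-pool structure that ISA shares with convolution-pooling networks, the sketch below computes a standard ISA response: linear filter outputs are squared, summed within each subspace, and square-rooted. The function name `isa_response`, the hand-picked filters, and the fixed subspace size are assumptions for illustration; in the paper the filters are learned, and its two ISA improvements are not reproduced here.

```python
import math


def isa_response(x, filters, subspace_size):
    """Standard ISA forward pass on one input vector (illustrative only).

    x             -- input vector (e.g. a flattened local patch)
    filters       -- list of filter vectors (hand-picked here, learned in practice)
    subspace_size -- number of consecutive filters pooled into one output
    """
    # First layer: linear filter responses (the "convolution" step on one patch).
    linear = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in filters]
    # Second layer: square, sum within each subspace, square root (the "pooling" step).
    return [
        math.sqrt(sum(r * r for r in linear[k:k + subspace_size]))
        for k in range(0, len(linear), subspace_size)
    ]


# Two identity filters pooled into a single subspace give the vector norm.
pooled = isa_response([3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]], subspace_size=2)
```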
We propose a novel approach to template-based face recognition. Our dual goal is to both increase recognition accuracy and reduce the computational and storage costs of template matching. To do this, we leverage an approach which has proven effective in many other domains but, to our knowledge, has never been fully explored for face images: average pooling of face photos. We show how (and why!) the space of a template's images can be partitioned and then pooled based on image quality and head pose, and the effect this has on accuracy and template size. We perform extensive tests on the IJB-A and Janus CS2 template-based face identification and verification benchmarks. These show that not only does our approach outperform the published state of the art despite requiring far fewer cross-template comparisons, but also, surprisingly, that image pooling performs on par with deep feature pooling.
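The partition-then-pool idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `pool_template`, the use of yaw angle alone as the partitioning key, and the 30-degree bin width are all hypothetical, and the image-quality partitioning mentioned in the abstract is omitted. Each vector may stand for either a flattened aligned image or a deep feature.

```python
from collections import defaultdict


def pool_template(vectors, yaw_degrees, bin_width=30.0):
    """Average the vectors of one template within each head-pose (yaw) bin.

    vectors     -- list of equal-length lists of floats, one per face image
    yaw_degrees -- estimated yaw angle for each image (assumed to be given)
    bin_width   -- width of each pose bin in degrees (hypothetical choice)
    Returns a dict mapping bin index -> averaged vector.
    """
    bins = defaultdict(list)
    for vec, yaw in zip(vectors, yaw_degrees):
        bins[int(yaw // bin_width)].append(vec)

    pooled = {}
    for key, members in bins.items():
        dim = len(members[0])
        # Element-wise mean over all vectors that fell into this pose bin.
        pooled[key] = [sum(v[i] for v in members) / len(members) for i in range(dim)]
    return pooled


# Three images collapse to two pooled entries: one near-frontal, one profile-ish.
template = pool_template([[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]], [5.0, 10.0, 45.0])
```

Matching two templates then needs one comparison per pair of pooled entries rather than one per pair of images, which is where the reduction in cross-template comparisons comes from.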
This study focuses on the problem of extracting consistent and accurate face bounding box annotations from crowdsourced workers. Aiming to provide benchmark datasets for facial recognition training and testing, we create a 'gold standard' set against which consolidated face bounding box annotations can be evaluated. An evaluation methodology based on scores for several features of bounding box annotations is presented and is shown to predict consolidation performance using information gathered from crowdsourced annotations. Based on this foundation, we present "Grouper," a method leveraging density-based clustering to consolidate annotations by crowd workers. We demonstrate that the proposed consolidation scheme, which should be extensible to any number of region annotation consolidations, improves upon metadata released with the IARPA Janus Benchmark-A. Finally, we compare FR performance using the originally provided IJB-A annotations and Grouper and determine that similarity to the gold standard as measured by our evaluation metric does predict recognition performance.
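To make the idea of density-based consolidation concrete, here is a small sketch that groups crowd-drawn boxes with a DBSCAN-style procedure, using intersection-over-union (IoU) as the neighborhood criterion, and reports the coordinate-wise median of each group. It is not the Grouper method itself: the IoU threshold, the `min_samples` value, the median consolidation, and the function names are all assumptions chosen for illustration.

```python
from statistics import median


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def consolidate(boxes, min_iou=0.5, min_samples=2):
    """Cluster boxes DBSCAN-style and return one median box per cluster.

    A box is a core point if at least min_samples boxes (itself included)
    overlap it with IoU >= min_iou. Boxes not reachable from a core point
    are treated as noise and dropped.
    """
    n = len(boxes)
    neighbors = [
        [j for j in range(n) if iou(boxes[i], boxes[j]) >= min_iou] for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    label = [None] * n
    clusters = []
    for i in range(n):
        if label[i] is not None or not is_core[i]:
            continue
        label[i] = len(clusters)
        members, stack = [], [i]
        while stack:
            p = stack.pop()
            members.append(p)
            if not is_core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if label[q] is None:
                    label[q] = label[i]
                    stack.append(q)
        clusters.append(members)
    # Consolidate each cluster into the coordinate-wise median box.
    return [
        tuple(median(boxes[m][k] for m in members) for k in range(4))
        for members in clusters
    ]


# Three workers roughly agree on one face; a fourth box elsewhere is an outlier.
annotations = [(10, 10, 50, 50), (12, 11, 52, 49), (11, 9, 49, 51), (200, 200, 240, 240)]
consolidated = consolidate(annotations)
```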
Automatic facial expression recognition (FER) plays an important role in many fields. However, most existing FER techniques target constrained conditions, which differ from how emotions are actually expressed. Acted databases, built to simulate spontaneous expressions, usually contain few samples, which limits facial expression classification performance. In this paper, a novel database of natural facial expressions is constructed from social images, and a deep model is then trained on this naturalistic dataset. A number of socially labeled images are obtained from image search engines using specific keywords. Junk image cleansing algorithms are then used to remove mislabeled images. Based on the collected images, deep convolutional neural networks are trained to recognize these spontaneous expressions. Experiments show the advantages of the constructed dataset and the deep approach.
This paper proposes the use of multiple low-cost visual sensors to obtain a surround view of the ego-vehicle for semantic understanding. A multi-perspective view will assist the analysis of naturalistic driving studies (NDS) by automating the reduction of observed sequences into events. A user-centric vision-based framework is presented using a vehicle detector and tracker in each separate perspective. Multi-perspective trajectories are estimated and analyzed to extract 14 different events, including potentially dangerous behaviors such as overtakes and cut-ins. The system is tested on ten sequences of real-world data collected on U.S. highways. The results show the potential use of multiple low-cost visual sensors for semantic understanding around the ego-vehicle.
Face alignment can fail in real-world conditions, negatively impacting the performance of automatic facial expression recognition (FER) systems. In this study, we assume a realistic situation including non-alignable faces due to failures in facial landmark detection. Our proposed approach fuses information about non-aligned and aligned facial states, in order to boost FER accuracy and efficiency. Six experimental scenarios using discriminative deep convolutional neural networks (DCNs) are compared, and causes for performance differences are identified. To handle non-alignable faces better, we further introduce DCNs that learn a mapping from non-aligned facial states to aligned ones, alignment-mapping networks (AMNs). We show that AMNs represent geometric transformations of face alignment, providing features beneficial for FER. Our automatic system based on ensembles of the discriminative DCNs and the AMNs achieves impressive results on a challenging database for FER in the wild.
Automatic facial expression recognition (FER) is an important component of affect-aware technologies. Because of the lack of labeled spontaneous data, the majority of existing automated FER systems were trained on posed facial expressions; however, real-world applications deal with (subtle) spontaneous facial expressions. This paper introduces an extension of DISFA, a previously released and well-accepted face dataset. Extended DISFA (DISFA+) has the following features: 1) it contains a large set of posed and spontaneous facial expression data for the same group of individuals, 2) it provides manually labeled frame-based annotations of the 5-level intensity of twelve FACS facial actions, and 3) it provides metadata (i.e. facial landmark points in addition to the self-report of each individual regarding every posed facial expression). This paper introduces and employs DISFA+ to analyze and compare temporal patterns and dynamic characteristics of posed and spontaneous facial expressions.
A family of real-time face representations is obtained via a Convolutional Network with Hashing Forest (CNHF). We learn the CNN, then transform the CNN to the multiple convolution architecture, and finally learn the output hashing transform via a new Boosted Hashing Forest (BHF) technique. BHF generalizes the Boosted SSC approach to hashing learning with joint optimization of face verification and identification. CNHF is trained on the CASIA-WebFace dataset and evaluated on the LFW dataset. We code the output of a single CNN with 97% on LFW. For Hamming embedding we get a CBHF 200-bit (25-byte) code with 96.3% and a 2000-bit code with 98.14% on LFW. CNHF with 2000×7-bit hashing trees achieves 93% rank-1 on LFW, compared with 89.9% rank-1 for the basic CNN. CNHF generates templates at 40+ fps on a Core i7 CPU and 120+ fps on a GeForce GTX 650 GPU.
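For readers unfamiliar with Hamming embeddings, the sketch below shows why a 200-bit code occupies 25 bytes and how two such binary templates could be compared. It illustrates generic Hamming-distance matching only; how CNHF and BHF actually produce the bits is not shown, and the function name and example codes are hypothetical.

```python
CODE_BITS = 200
CODE_BYTES = CODE_BITS // 8  # 200 bits packed 8 per byte gives 25 bytes


def hamming_distance(code_a: bytes, code_b: bytes) -> int:
    """Count the bit positions in which two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    # XOR leaves a 1 wherever the codes disagree; count those 1s byte by byte.
    return sum(bin(x ^ y).count("1") for x, y in zip(code_a, code_b))


# Two made-up 25-byte templates that differ in 2 bits of every byte.
template_a = bytes([0b1010] * CODE_BYTES)
template_b = bytes([0b0110] * CODE_BYTES)
distance = hamming_distance(template_a, template_b)
```

A smaller distance means more similar templates; a verification system would compare it against a threshold chosen on held-out data.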