ISBN: 9781467388504 (print)
The proceedings contain 640 papers. The topics discussed include: generation and comprehension of unambiguous object description; image question answering using convolutional neural network with dynamic parameter prediction; neural module networks; learning to assign orientations to feature points; affinity CNN: learning pixel-centric pairwise relations for figure/ground embedding; occlusion boundary detection via deep exploration of context; exploit bounding box annotations for multi-label object recognition; MCMC shape sampling for image segmentation with nonparametric shape priors; learning action maps of large environments via first-person vision; sample and filter: nonparametric scene parsing via efficient filtering; training region-based object detectors with online hard example mining; learning with side information through modality hallucination; HyperNet: towards accurate region proposal generation and joint object detection; macroscopic interferometry: rethinking depth estimation with frequency-domain time-of-flight; ASP vision: optically computing the first layer of convolutional neural networks using angle sensitive pixels; hierarchical recurrent neural encoder for video representation with application to captioning; and from keyframes to key objects: video summarization by representative object proposal selection.
ISBN: 9781467388504 (print)
The proceedings contain 192 papers. The topics discussed include: feature vector compression based on least error quantization; a comprehensive analysis of deep learning based representation for face recognition; two-stream CNNs for gesture-based verification and identification: learning user style; CALIPER: continuous authentication layered with integrated PKI encoding recognition; latent fingerprint image segmentation using fractal dimension features and weighted extreme learning machine ensemble; a comparison of human and automated face verification accuracy on unconstrained image sets; offline signature verification based on bag-of-VisualWords model using KAZE features and weighting schemes; implementation of fixed-length template protection based on homomorphic encryption with application to signature biometrics; seeing the forest from the trees: a holistic approach to near-infrared heterogeneous face recognition; a novel visualization tool for evaluating the accuracy of 3D sensing and reconstruction algorithms for automatic dormant pruning applications; a pointing gesture based egocentric interaction system: dataset, approach and application; multimodal multi-stream deep learning for egocentric activity recognition; sparse kernel machines for discontinuous registration and nonstationary regularization; and accurate small deformation exponential approximant to integrate large velocity fields: application to image registration.
ISBN: 9781509014378 (print)
The task of heterogeneous face recognition consists in matching face images that were sensed in different modalities, such as sketches to photographs, thermal images to photographs, or near-infrared images to photographs. In this preliminary work we introduce a novel and generic approach based on Inter-session Variability Modelling to handle this task. The experimental evaluation conducted with two different image modalities showed average rank-1 identification rates of 96.93% and 72.39% for the CUHK-CUFS (sketches) and CASIA NIR-VIS 2.0 (near-infrared) databases, respectively. This work is fully reproducible, and all the source code for this approach is publicly available.
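For readers unfamiliar with the metric, the rank-1 identification rate quoted above can be sketched as follows. This is only an illustration of the general metric, not the authors' evaluation code; the function name `rank1_rate` and the distance-matrix layout are assumptions made for the example.

```python
def rank1_rate(distances, probe_ids, gallery_ids):
    """Fraction of probes whose closest gallery entry has the same identity.

    distances[i][j] is the (hypothetical) distance between probe i and
    gallery entry j; smaller means more similar.
    """
    hits = 0
    for row, probe_id in zip(distances, probe_ids):
        # Index of the gallery entry with the smallest distance to this probe.
        nearest = min(range(len(row)), key=row.__getitem__)
        if gallery_ids[nearest] == probe_id:
            hits += 1
    return hits / len(probe_ids)
```

In a heterogeneous setting the probes would come from one modality (for example sketches) and the gallery from another (photographs); the metric itself does not change.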
ISBN: 9781509014378 (print)
Face recognition (FR) is the most preferred mode for biometric-based surveillance among all biometric traits, because it detects subjects passively. FR in surveillance scenarios does not give satisfactory performance, because the probes suffer from low contrast, noise, and poor illumination compared to the training samples. Even deep learning, a state-of-the-art technology, fails to perform well in these scenarios. We propose a novel soft-margin based learning method for multiple feature-kernel combinations, followed by feature transformation using Domain Adaptation, which outperforms many recent state-of-the-art techniques when tested on three real-world surveillance face datasets.
ISBN: 9781509014378 (print)
A tattoo is a soft biometric that indicates discriminative characteristics of a person, such as beliefs and personality. Automatic detection and recognition of tattoo images is a difficult problem. We present deep convolutional neural network-based methods for automatic matching of tattoo images based on the AlexNet and Siamese networks. Furthermore, we show that using a triplet loss function, rather than a simple contrastive loss function, can significantly improve the performance of a tattoo matching system. Extensive experiments on the recently introduced Tatt-C dataset show that our method is able to capture the meaningful structure of tattoos and performs significantly better than many competitive tattoo recognition algorithms.
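The difference between the two loss functions mentioned above can be sketched on plain embedding vectors. This is a minimal illustration of the commonly used textbook forms, not the paper's training code; the margin value, the use of unsquared Euclidean distance in the triplet loss, and all function names are assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, same, margin=1.0):
    """Pairwise loss: pull matching pairs together, push others past the margin."""
    d = euclidean(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Relative loss: the positive must be closer to the anchor than the
    negative by at least the margin (some formulations square the distances)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

The contrastive loss constrains absolute distances for each pair, whereas the triplet loss only constrains the ordering of distances relative to an anchor, which is one common explanation for why it can work better in matching tasks.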
ISBN: 9781509014378 (print)
Perceiving distance from two camera images, a task called stereo vision, is fundamental for many applications in robotics or automation. However, algorithms that compute this information at high accuracy have a high computational complexity. One such algorithm, Semi-Global Matching (SGM), performs well in many stereo vision benchmarks while maintaining a manageable computational complexity. Nevertheless, CPU and GPU implementations of this algorithm often fail to achieve real-time processing of camera images, especially in power-constrained embedded environments. This work presents a novel architecture to calculate disparities through SGM. The proposed architecture is highly scalable and applicable to low-power embedded as well as high-performance multi-camera high-resolution applications.
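To give a sense of the computation such an architecture has to accelerate, the following sketch shows the standard SGM cost aggregation along a single path. It illustrates the well-known recurrence only and says nothing about the hardware design proposed in the paper; the penalty values `p1` and `p2` and the function name are assumptions.

```python
def aggregate_path(cost, p1=1.0, p2=4.0):
    """Aggregate matching costs along one path (e.g. left to right on a scanline).

    cost[x][d] is the matching cost of pixel x at disparity d.
    p1 penalises a disparity change of one, p2 penalises larger jumps.
    """
    n_disp = len(cost[0])
    agg = [list(cost[0])]  # the first pixel on the path has no predecessor
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            best = prev[d]                          # same disparity
            if d > 0:
                best = min(best, prev[d - 1] + p1)  # disparity changes by -1
            if d + 1 < n_disp:
                best = min(best, prev[d + 1] + p1)  # disparity changes by +1
            best = min(best, prev_min + p2)         # any larger jump
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(cost[x][d] + best - prev_min)
        agg.append(row)
    return agg
```

A full SGM implementation would run this recurrence along several path directions, sum the results per pixel and disparity, and select the disparity with the lowest summed cost. The inner loop over disparities has no dependencies between iterations, which is what makes the algorithm attractive for parallel hardware.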
ISBN: 9781509014378 (print)
We propose a two-level system for apparent age estimation from facial images. Our system first classifies samples into overlapping age groups. Within each group, the apparent age is estimated with local regressors, whose outputs are then fused for the final estimate. We use a deformable parts model based face detector, and features from a pre-trained deep convolutional network. Kernel extreme learning machines are used for classification. We evaluate our system on the ChaLearn Looking at People 2016 - Apparent Age Estimation challenge dataset, and report a normal score of 0.3740 on the sequestered test set.
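The reported score can be read with the help of a short sketch, assuming that "normal score" refers to the challenge's error measure, which fits a normal distribution to the human age votes for each image. The abstract does not spell out the formula, so the form below and the function names are assumptions for illustration.

```python
import math

def apparent_age_error(predicted, mean_vote, std_vote):
    """Per-image error in [0, 1): 0 when the prediction equals the mean of the
    human votes, approaching 1 as it moves many standard deviations away."""
    return 1.0 - math.exp(-((predicted - mean_vote) ** 2) / (2.0 * std_vote ** 2))

def mean_apparent_age_error(predictions, mean_votes, std_votes):
    """Average of the per-image errors over a test set (lower is better)."""
    errors = [
        apparent_age_error(p, m, s)
        for p, m, s in zip(predictions, mean_votes, std_votes)
    ]
    return sum(errors) / len(errors)
```

Under this reading, images on which the human annotators disagree strongly (large standard deviation) tolerate larger prediction errors than images with a tight consensus.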
ISBN: 9781509014378 (print)
In this paper, we present a distributed embedded vision system that enables surround scene analysis and vehicle threat estimation. The proposed system analyzes the surroundings of the ego-vehicle using four cameras, each connected to a separate embedded processor. Each processor runs a set of optimized vision-based techniques to detect surrounding vehicles, so that the entire system operates at real-time speeds. This setup has been demonstrated on multiple vehicle testbeds with high levels of robustness under real-world driving conditions and is scalable to additional cameras. Finally, we present a detailed evaluation which shows over 95% accuracy and operation at nearly 15 frames per second.
ISBN: 9781509014378 (print)
Improved dense trajectory features have been successfully used in video-based action recognition problems, but their application to face processing is more challenging. In this paper, we propose a novel system that deals with the problem of emotion recognition in real-world videos, using improved dense trajectory, LGBP-TOP, and geometric features. In the proposed system, we detect the face and facial landmarks from each frame of a video using a combination of two recent approaches, and register faces by means of Procrustes analysis. The improved dense trajectory and geometric features are encoded using Fisher vectors, and classification is achieved by extreme learning machines. We evaluate our method on the extended Cohn-Kanade (CK+) and EmotiW 2015 Challenge databases. We obtain state-of-the-art results on both databases.
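The registration step can be illustrated with a minimal 2D Procrustes alignment, which maps one set of landmarks onto a reference shape using translation, rotation, and uniform scaling. The abstract does not describe the exact variant used, so this is a sketch under that assumption; representing landmarks as complex numbers and the function name are illustrative choices.

```python
def procrustes_align(source, target):
    """Align 2D landmarks `source` to `target` in the least-squares sense.

    Both arguments are equal-length lists of complex numbers (x + y*1j).
    Returns the transformed source landmarks.
    """
    n = len(source)
    src_mean = sum(source) / n
    tgt_mean = sum(target) / n
    # Remove translation by centring both shapes on their centroids.
    src = [z - src_mean for z in source]
    tgt = [z - tgt_mean for z in target]
    # A single complex factor encodes the optimal rotation and scale:
    # a = <src, tgt> / <src, src>, the least-squares solution of a*src ~ tgt.
    denom = sum(abs(z) ** 2 for z in src)
    a = sum(s.conjugate() * t for s, t in zip(src, tgt)) / denom
    return [a * s + tgt_mean for s in src]
```

For face registration, `target` would typically be a mean face shape, so that landmarks from every frame end up in a common coordinate frame before geometric features are computed.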