ISBN: 9781467369640 (print)
The proceedings contain 602 papers. The topics discussed include: going deeper with convolutions; propagated image filtering; web scale photo hash clustering on a single machine; supervised discrete hashing; what do 15,000 object categories tell us about classifying and localizing actions?; landmarks-based kernelized subspace alignment for unsupervised domain adaptation; blur kernel estimation using normalized color-line priors; a light transport model for mitigating multipath interference in time-of-flight sensors; traditional saliency reloaded: a good old model in new shape; automatic construction of robust spherical harmonic subspaces; leveraging stereo matching with learning-based confidence measures; saliency detection via cellular automata; and efficient sparse-to-dense optical flow estimation using a learned basis and layers.
ISBN: 9781509014378 (print)
A tattoo is a soft biometric that indicates discriminative characteristics of a person, such as beliefs and personality. Automatic detection and recognition of tattoo images is a difficult problem. We present deep convolutional neural network-based methods for automatic matching of tattoo images based on the AlexNet and Siamese networks. Furthermore, we show that, rather than using a simple contrastive loss function, a triplet loss function can significantly improve the performance of a tattoo matching system. Extensive experiments on the recently introduced Tatt-C dataset show that our method is able to capture the meaningful structure of tattoos and performs significantly better than many competitive tattoo recognition algorithms.
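The difference between the two loss functions named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the common textbook forms of both losses on plain Python lists, and the function names, the squared Euclidean distance, and the margin value of 1.0 are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same, margin=1.0):
    """Pairwise loss: pull matching pairs together and push
    non-matching pairs at least `margin` apart (assumed margin)."""
    d = euclidean(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Relative loss: the anchor should be closer to the positive
    than to the negative by at least `margin` (squared distances)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos ** 2 - d_neg ** 2 + margin)
```

The contrastive loss constrains each pair in isolation, whereas the triplet loss only constrains the relative ordering of a matching and a non-matching pair with respect to the same anchor.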
ISBN: 9781509014378 (print)
The task of heterogeneous face recognition consists of matching face images that were sensed in different modalities, such as sketches to photographs, thermal images to photographs, or near-infrared images to photographs. In this preliminary work we introduce a novel and generic approach based on Inter-session Variability Modelling to handle this task. The experimental evaluation conducted with two different image modalities showed average rank-1 identification rates of 96.93% and 72.39% for the CUHK-CUFS (sketches) and CASIA NIR-VIS 2.0 (near-infrared) databases, respectively. This work is fully reproducible, and all the source code for this approach is publicly available.
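For readers unfamiliar with the metric reported above, the rank-1 identification rate is the fraction of probe images whose highest-scoring gallery entry has the correct identity. The sketch below shows that computation; the function name and the layout of the score matrix (one row per probe, one column per gallery entry, higher meaning more similar) are assumptions for illustration and are not taken from the paper's released code.

```python
def rank1_identification_rate(scores, probe_ids, gallery_ids):
    """Fraction of probes whose best-scoring gallery entry has the
    same identity. scores[i][j] is the similarity between probe i
    and gallery entry j (higher means more similar)."""
    hits = 0
    for row, true_id in zip(scores, probe_ids):
        # Index of the gallery entry with the highest similarity.
        best = max(range(len(row)), key=row.__getitem__)
        if gallery_ids[best] == true_id:
            hits += 1
    return hits / len(probe_ids)
```

With this convention, a probe counts as a hit only when the single top-ranked gallery entry is correct; lower-ranked correct matches do not contribute.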
ISBN: 9781509014378 (print)
Perceiving distance from two camera images, a task called stereo vision, is fundamental for many applications in robotics and automation. However, algorithms that compute this information at high accuracy have a high computational complexity. One such algorithm, Semi-Global Matching (SGM), performs well in many stereo vision benchmarks while maintaining a manageable computational complexity. Nevertheless, CPU and GPU implementations of this algorithm often fail to achieve real-time processing of camera images, especially in power-constrained embedded environments. This work presents a novel architecture to calculate disparities through SGM. The proposed architecture is highly scalable and applicable to low-power embedded as well as high-performance multi-camera high-resolution applications.
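The core of SGM is a cost-aggregation recurrence evaluated along several path directions. The sketch below is a software illustration of that standard recurrence for a single left-to-right path on one scanline; it is not the hardware architecture proposed in the paper. The function names and the penalty values `p1` and `p2` are assumptions chosen for illustration.

```python
def aggregate_scanline(cost, p1=1, p2=4):
    """Aggregate matching costs along one left-to-right path.

    cost[x][d] is the matching cost of pixel x at disparity d.
    p1 penalises a disparity change of one level and p2 penalises
    larger jumps (both values are illustrative assumptions)."""
    n_disp = len(cost[0])
    agg = [list(cost[0])]  # the first pixel has no predecessor
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d < n_disp - 1:
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(cost[x][d] + min(candidates) - prev_min)
        agg.append(row)
    return agg


def winner_takes_all(agg):
    """Select the disparity with the lowest aggregated cost per pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in agg]
```

A full SGM implementation sums the aggregated costs from several path directions before the winner-takes-all step; this sketch covers one direction only.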
ISBN: 9781509014378 (print)
In this paper, we present a distributed embedded vision system that enables surround scene analysis and vehicle threat estimation. The proposed system analyzes the surroundings of the ego-vehicle using four cameras, each connected to a separate embedded processor. Each processor runs a set of optimized vision-based techniques to detect surrounding vehicles, so that the entire system operates at real-time speeds. This setup has been demonstrated on multiple vehicle testbeds with high levels of robustness under real-world driving conditions and is scalable to additional cameras. Finally, we present a detailed evaluation which shows over 95% accuracy and operation at nearly 15 frames per second.
ISBN: 9781509014378 (print)
Among all biometric traits, face recognition (FR) is the most preferred mode for biometric-based surveillance because it detects subjects passively. FR in surveillance scenarios does not give satisfactory performance because the probes suffer from low contrast, noise, and poor illumination compared to the training samples. Even deep learning, a state-of-the-art technology, fails to perform well in these scenarios. We propose a novel soft-margin based learning method for multiple feature-kernel combinations, followed by feature transformation using domain adaptation, which outperforms many recent state-of-the-art techniques when tested on three real-world surveillance face datasets.
ISBN: 9781509014378 (print)
This work presents an occlusion-aware hand tracker to reliably track both hands of a person using a monocular RGB camera. To demonstrate its robustness, we evaluate the tracker on a challenging, occlusion-ridden naturalistic driving dataset, where hand motions of a driver are to be captured reliably. The proposed framework additionally encodes and learns tracklets corresponding to complex (yet frequently occurring) hand interactions offline, and makes an informed choice during data association. This provides positional information of the left and right hands with no intrusion (through complete or partial occlusions) over long, unconstrained video sequences in an online manner. The tracks thus obtained may find use in domains such as human activity analysis, gesture recognition, and higher-level semantic categorization.
ISBN: 9781509014378 (print)
Heterogeneous face recognition is the problem of identifying a person from a face image acquired with a non-traditional sensor by matching it to a visible gallery. Most approaches to this problem involve modeling the relationship between corresponding images from the visible and sensing domains. This is typically done at the patch level and/or with shallow models, with the aim of preventing overfitting. In this work, rather than modeling local patches or using a simple model, we propose to use a complex, deep model to learn the relationship between the entirety of cross-modal face images. We describe a deep convolutional neural network-based method that leverages a large visible image face dataset to prevent overfitting. We present experimental results on two benchmark datasets showing its effectiveness.
ISBN: 9781509014378 (print)
Improved dense trajectory features have been successfully used in video-based action recognition problems, but their application to face processing is more challenging. In this paper, we propose a novel system that deals with the problem of emotion recognition in real-world videos, using improved dense trajectory, LGBP-TOP, and geometric features. In the proposed system, we detect the face and facial landmarks in each frame of a video using a combination of two recent approaches, and register faces by means of Procrustes analysis. The improved dense trajectory and geometric features are encoded using Fisher vectors, and classification is performed by extreme learning machines. We evaluate our method on the extended Cohn-Kanade (CK+) and EmotiW 2015 Challenge databases. We obtain state-of-the-art results on both databases.
ISBN: 9781509014378 (print)
In this work, we consider the problem of recognizing object manipulation actions. This is a challenging task for real everyday actions, as the same object can be grasped and moved in different ways depending on its functions and the geometric constraints of the task. We propose to leverage grasp and motion-constraint information, using a suitable representation, to recognize and understand action intention with different objects. We also provide an extensive experimental evaluation on the recent Yale Human Grasping dataset, which consists of a large set of 455 manipulation actions. The evaluation involves a) different contemporary multi-class classifiers, and binary classifiers with a one-vs-one multi-class voting scheme, and b) differential comparison results based on subsets of attributes involving grasp and motion-constraint information. Our results clearly demonstrate the usefulness of grasp characteristics and motion constraints for understanding the actions intended with an object.
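The one-vs-one voting scheme mentioned in this abstract can be sketched as follows. This is a generic illustration of the technique, not the authors' evaluation code: the function name, the dictionary of pairwise classifiers, and the tie-breaking rule (the earliest class in the list wins) are all assumptions.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(sample, classes, binary_classifiers):
    """Predict a class by majority vote over pairwise classifiers.

    binary_classifiers[(a, b)] is a callable (hypothetical here) that
    takes a sample and returns either a or b, for every pair (a, b)
    in the order produced by combinations(classes, 2)."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        winner = binary_classifiers[(a, b)](sample)
        votes[winner] += 1
    # Ties are broken in favour of the class listed first (assumption).
    return max(classes, key=lambda c: votes[c])
```

For k classes this scheme needs k(k-1)/2 binary classifiers, each trained only on the samples of its two classes.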