ISBN: 0818684976 (print)
This paper describes a new method for face recognition under drastic changes in the imaging processes through which the facial images are acquired. Unlike conventional methods that use only face features, the present method exploits statistical information about the variations between the face image sets being compared, in addition to the features of the faces themselves. To incorporate both face and perturbation features for recognition, we develop a technique called weak orthogonalization of the two subspaces, which transforms two given overlapping subspaces so that the volume of the intersection of the resulting two subspaces is minimized. Matching operations are performed in the transformed face space, which has thus been weakly orthogonalized against the perturbation space. Experimental results on real pictures of frontal faces from drivers' licenses show that the new algorithm improves recognition performance over conventional methods. We also demonstrate the effectiveness of our method on image sets with changes in viewing geometry.
ISBN: 9781538661000 (digital); 9781538661000 (print)
Material recognition is studied in both the computer vision and vision science fields. In this paper, we investigated how humans observe material images and found that eye fixation information improves the performance of material image classification models. We first collected eye-tracking data from human observers and used it to fine-tune a generative adversarial network for saliency prediction (SalGAN). We then fused the predicted saliency maps with the material images and fed them to CNN models for material classification. The experimental results show that classification accuracy is higher than when the original images alone are used. This indicates that human visual cues can benefit computational models as priors.
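The abstract does not say how the saliency map and the image are fused. As a minimal sketch under that caveat, one plausible choice is to weight each pixel by its saliency. The function name `fuse_with_saliency` and the `floor` parameter are hypothetical and are not taken from the paper.

```python
def fuse_with_saliency(image, saliency, floor=0.2):
    """Weight each pixel of a grayscale image by its saliency value.

    image    : list of rows, each a list of floats in [0, 1]
    saliency : same shape, floats in [0, 1] (e.g. a predicted saliency map)
    floor    : hypothetical minimum weight, so that non-salient pixels
               are dimmed rather than erased (an assumption, not from the paper)
    """
    same_rows = len(image) == len(saliency)
    if not same_rows or any(len(r) != len(s) for r, s in zip(image, saliency)):
        raise ValueError("image and saliency map must have the same shape")

    fused = []
    for img_row, sal_row in zip(image, saliency):
        # Weight runs from `floor` (saliency 0) up to 1.0 (saliency 1).
        fused.append([p * (floor + (1.0 - floor) * s)
                      for p, s in zip(img_row, sal_row)])
    return fused


image = [[1.0, 0.5], [0.2, 0.0]]
saliency = [[1.0, 0.0], [0.5, 1.0]]
fused = fuse_with_saliency(image, saliency)
```

Other fusion schemes, such as appending the saliency map as an extra input channel, would fit the same description; the pixel weighting above is only the simplest to show.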
ISBN: 9781665487399 (digital); 9781665487399 (print)
Understanding the complex relationship between emotions and facial expressions is important for both psychologists and computer scientists. A large body of research in psychology investigates facial expressions, emotions, and how emotions are perceived from facial expressions. As computer scientists look to incorporate this research into automatic emotion perception systems, it is important to understand the nature and limitations of human emotion perception. These principles of emotion science affect the way datasets are created, methods are implemented, and results are interpreted in automated emotion perception. This paper aims to distill and align prior work in automated and human facial emotion perception to facilitate future discussions and research at the intersection of the two disciplines.
ISBN: 0769506623 (print)
A new pattern recognition approach to face recognition is presented that can deal with drastic differences in the appearance of a face. Given a pair of sample sets of facial images with potential correspondences, each drawn from a distinct distribution, the algorithm reliably finds correspondences across those different distributions. Unlike traditional approaches, which model the face images as having a consistent distribution and therefore use the same feature extraction function for both image sets, the new method applies to each sample a function specific to the distribution from which it is drawn. This function is derived by maximizing a newly defined class-separability criterion over the different distributions. Results of face recognition are presented on images including drivers' license pictures. Drastic improvements are shown over algorithms based on the traditional Fisher's discriminant analysis.
ISBN: 0818658258 (print)
A given (overcomplete) discrete oriented pyramid may be converted into a steerable pyramid by interpolation. We present a technique for deriving the optimal interpolation functions (otherwise called steering coefficients). The proposed scheme is demonstrated on a computationally efficient oriented pyramid, which is a variation on the Burt and Adelson pyramid. We apply the generated steerable pyramid to orientation-invariant texture analysis to demonstrate its excellent rotational isotropy. High classification rates and precise rotation identification are demonstrated.
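For orientation, the generic form of steering by interpolation can be written as below. This is the standard textbook formulation, not the optimal interpolation functions derived in the paper, which the abstract does not give. The symbols $R_{\theta_k}$ (response of the $k$-th oriented band), $b_k(\theta)$ (steering coefficients) and $K$ (number of orientations) are notation assumed here for illustration.

```latex
% Generic steering (interpolation) form: the response at an arbitrary
% angle \theta is a weighted sum of the K oriented band responses.
R_{\theta}(x, y) = \sum_{k=1}^{K} b_k(\theta)\, R_{\theta_k}(x, y)

% Textbook special case: first derivative of a Gaussian, K = 2 basis filters.
G_1^{\theta} = \cos(\theta)\, G_1^{0^\circ} + \sin(\theta)\, G_1^{90^\circ}
```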
ISBN: 9781467367592 (print)
Touchless hand gesture recognition systems are becoming important in automotive user interfaces as they improve safety and comfort. Various computer vision algorithms have employed color and depth cameras for hand gesture recognition, but robust classification of gestures from different subjects performed under widely varying lighting conditions is still challenging. We propose an algorithm for drivers' hand gesture recognition from challenging depth and intensity data using 3D convolutional neural networks. Our solution combines information from multiple spatial scales for the final prediction. It also employs spatiotemporal data augmentation for more effective training and to reduce potential overfitting. Our method achieves a correct classification rate of 77.5% on the VIVA challenge dataset.
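The abstract does not state how the predictions from different spatial scales are combined. The sketch below assumes one common option, multiplying the per-class probabilities from each scale and renormalizing. The function name `fuse_scale_predictions` is hypothetical, and the numbers are made up for illustration.

```python
def fuse_scale_predictions(probs_per_scale):
    """Combine class probabilities from several scales by element-wise product.

    probs_per_scale : list of probability vectors, one per spatial scale,
                      all with the same number of classes.
    Returns a renormalized probability vector.
    Assumption: product fusion is an illustrative choice, not confirmed
    by the abstract.
    """
    num_classes = len(probs_per_scale[0])
    fused = [1.0] * num_classes
    for probs in probs_per_scale:
        if len(probs) != num_classes:
            raise ValueError("all scales must predict the same classes")
        fused = [f * p for f, p in zip(fused, probs)]

    total = sum(fused)
    if total == 0:
        raise ValueError("fused probabilities sum to zero")
    return [f / total for f in fused]


# Two hypothetical scales disagreeing on a three-class problem.
high_res = [0.5, 0.25, 0.25]
low_res = [0.2, 0.6, 0.2]
fused = fuse_scale_predictions([high_res, low_res])
predicted_class = fused.index(max(fused))
```

A product rewards classes that every scale finds plausible; averaging the probabilities would be a softer alternative.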
ISBN: 9780769549903 (print)
We present a vision-based method for signer diarization - the task of automatically determining "who signed when?" in a video. This task has motivations and applications similar to those of speaker diarization but has received little attention in the literature. In this paper, we motivate the problem and propose a method for solving it. The method is based on the hypothesis that signers make more movements than their interlocutors. Experiments on four videos (a total of 1.4 hours, each featuring two signers) show the applicability of the method. The best diarization error rate (DER) obtained is 0.16.
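The hypothesis that the active signer moves more than the interlocutor can be illustrated with a minimal sketch. It assumes that a per-frame motion amount is already available for each of the two people (for example from frame differencing), which the abstract does not specify. The names `diarize` and `frame_error_rate` are hypothetical, and the error measure is a simplified frame-level stand-in for the full DER.

```python
def diarize(motion_a, motion_b, window=3):
    """Label each frame 'A' or 'B' by who moved more in a centred window.

    motion_a, motion_b : per-frame motion amounts for the two people
                         (how these are measured is an assumption here)
    window             : hypothetical smoothing window length, in frames
    """
    if len(motion_a) != len(motion_b):
        raise ValueError("motion sequences must have the same length")

    n = len(motion_a)
    half = window // 2
    labels = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        # Compare total motion inside the window around frame t.
        sum_a = sum(motion_a[lo:hi])
        sum_b = sum(motion_b[lo:hi])
        labels.append("A" if sum_a >= sum_b else "B")
    return labels


def frame_error_rate(predicted, reference):
    """Fraction of frames whose predicted label differs from the reference."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    wrong = sum(p != r for p, r in zip(predicted, reference))
    return wrong / len(reference)


# Made-up motion amounts: A signs first, then B.
labels = diarize([5, 6, 5, 1, 0, 1], [1, 0, 1, 5, 6, 5])
error = frame_error_rate(labels, ["A", "A", "B", "B", "B", "B"])
```

The smoothing window keeps a brief pause by the active signer from flipping the label; the standard DER additionally accounts for missed and falsely detected signing time.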
ISBN: 0818658258 (print)
This paper deals with the recovery of 3D information using a single mobile camera in the context of active vision. We propose a general revisited formulation of the structure-from-motion issue, and we determine adequate camera configurations and motions which lead to a robust and accurate estimation of the 3D structure parameters. We apply the visual servoing approach to perform these camera motions. Real-time experiments dealing with the 3D structure estimation of points and cylinders are reported, and demonstrate that this active vision strategy can very significantly improve the estimation accuracy.
ISBN: 9780769549903 (print)
In this paper we present a Flash game that aims to make it easy to generate ground truth for testing object detection algorithms. Flash the Fish is an online game in which the user is shown videos from underwater environments and has to take photos of fish by clicking on them. The initial ground truth is provided by object detection algorithms; subsequently, cluster analysis and user evaluation techniques allow for the generation of ground truth based on a weighted combination of these "photos". Evaluation of the platform and comparison of the obtained results against a hand-drawn ground truth confirmed that reliable ground truth generation is not necessarily a cumbersome task in terms of either effort or time.