ISBN (print): 9781665445092
Facial expression analysis requires a compact, identity-invariant expression representation. In this paper, we model expression as the deviation from identity via a subtraction operation, extracting a continuous and identity-invariant expression embedding. We propose a Deviation Learning Network (DLN) with a pseudo-siamese structure to extract the deviation feature vector. To reduce the optimization difficulty caused by additional fully connected layers, DLN directly applies a high-order polynomial to nonlinearly project the high-dimensional feature onto a low-dimensional manifold. To account for label noise, we add a crowd layer to DLN for robust embedding extraction. To achieve a more compact representation, we also use hierarchical annotation for data augmentation. We evaluate our facial expression embedding on the FEC validation set. The quantitative results show that we achieve state-of-the-art performance in terms of both fine-grained and identity-invariant properties. We further conduct extensive experiments showing that our expression embedding is of high quality for expression recognition, image retrieval, and face manipulation.
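The subtraction idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' DLN: the two linear branches, the dimensions, the L2 normalization, and the names `make_branch` and `deviation_embedding` are all assumptions made for illustration, and the polynomial projection and crowd layer are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_branch(in_dim, out_dim):
    """Toy linear feature extractor (hypothetical stand-in for one network branch)."""
    weights = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)
    return lambda x: x @ weights

def deviation_embedding(x, face_branch, identity_branch):
    """Expression embedding as the deviation of the face feature from the identity feature."""
    deviation = face_branch(x) - identity_branch(x)
    # L2-normalize each row; this normalization is an assumption, not taken from the abstract.
    norm = np.linalg.norm(deviation, axis=-1, keepdims=True)
    return deviation / np.maximum(norm, 1e-12)

# Pseudo-siamese: both branches have the same shape but do not share weights.
face_branch = make_branch(128, 16)
identity_branch = make_branch(128, 16)

batch = rng.standard_normal((4, 128))  # four fake input feature vectors
embedding = deviation_embedding(batch, face_branch, identity_branch)
```

In a real system the identity branch would presumably be a face-recognition model, so that subtracting its output removes identity information and leaves the expression component.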
Millions of deaf and hard-of-hearing people globally rely on sign language as the fundamental tool for communication. However, a huge communication gap exists between the sign language users and the larger community w...
The field of gesture classification and recognition offers ample research opportunities due to the growing deaf and hearing-impaired populations and the advancements in vision-based devices. As recognition of hand gesture...
An approach to real-time monitoring of yoga asanas using deep learning and computer vision. Convolutional neural networks (CNN) and long short-term memory (LSTM) are combined to create a hybrid deep lear...
Computer vision, a field of artificial intelligence and computer science, is revolutionizing a number of industries, including healthcare, automotive, agriculture, security, and entertainment, by enabling robots to ass...
ISBN (print): 9781665448994
Learning-based image compression has drawn increasing attention in recent years. Although impressive progress has been made, the field still lacks a universal encoder optimization method that seeks an efficient representation for different images. In this paper, we develop a universal rate-distortion optimization framework for learning-based compression, which adaptively optimizes the latents and side information together for each image. The proposed framework is independent of network architecture and can be flexibly applied to existing and potential future compression networks. Experimental results demonstrate that we achieve a 6.6% bit rate saving against the latest traditional codec, i.e., VVC, yielding a state-of-the-art compression ratio. Moreover, with the proposed optimization framework, we won first place in the CLIC validation phase at all three bit rates in terms of PSNR.
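As a rough illustration of per-image rate-distortion optimization, the objective can be written in the generic form below. The notation is an assumption and is not taken from the abstract: $x$ is the input image, $\hat{y}$ the latents, $\hat{z}$ the side information, $g$ the decoder, $R$ the estimated bit rate, $D$ a distortion measure such as mean squared error, and $\lambda$ the trade-off weight.

```latex
(\hat{y}^{*}, \hat{z}^{*})
  = \arg\min_{\hat{y},\, \hat{z}} \;
    R(\hat{y}, \hat{z}) + \lambda \, D\bigl(x,\, g(\hat{y})\bigr)
```

Under this reading, the network weights stay fixed and only the latents and side information are refined for each image, which is consistent with the framework being independent of network architecture.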
ISBN (print): 9781665448994
Vehicle Re-Identification (Re-ID) aims to identify the same vehicle across different cameras, and therefore plays an important role in modern traffic management systems. The technical challenges require algorithms that are robust to different views, resolutions, occlusions and illumination conditions. In this paper, we first analyze the main factors hindering Vehicle Re-ID performance. We then present our solutions, specifically targeting the Track 2 dataset of the 5th AI City Challenge, including (1) reducing the domain gap between real and synthetic data, (2) modifying the network by stacking multiple heads with an attention mechanism, and (3) adaptive loss weight adjustment. Our method achieves 61.34% mAP on the private CityFlow test set without using external datasets or pseudo labeling, and outperforms all previous works with 87.1% mAP on the Veri benchmark. The code is available at https://***/cybercore-co-ltd/track2_aicity_2021.
This research introduces a facial recognition-based attendance system that leverages an open-source computer vision library and integrates with a real-time database system. The system comprises essential components, i...
ISBN (print): 9798350388978; 9798350388961
In sign languages, where communication is achieved through hand gestures, facial expressions and body language, signs are the subject of many studies because of the diversity in the positions of different body parts. This diversity is also encountered in emotion detection in Turkish Sign Language (TID), making direct translation of hand gestures inadequate for emotion detection. Accordingly, in this study, sentiment analysis in TID was performed using facial expressions and hand gestures for the first time in the literature. For this purpose, a specialized model for emotion extraction from facial expressions and gesture recognition from hand gestures was fine-tuned with the dataset collected in this study. The results show that facial expressions are more significant than hand gestures for sentiment analysis in TID, and that performance improves further when facial expressions are supported by hand gestures.
ISBN (print): 9781665448994
From a non-central panorama, 3D lines can be recovered by geometric reasoning. However, their sensitivity to noise and the complex geometric modeling required have led to these panoramas being little investigated. In this work, we present a novel approach for 3D layout recovery of indoor environments using single non-central panoramas. We obtain the boundaries of the structural lines of the room from a non-central panorama using deep learning, and exploit the properties of non-central projection systems in a new geometrical processing to recover the scaled layout. We solve the problem for Manhattan environments, handling occlusions, and also for Atlanta environments in a unified method. The experiments performed improve on the state-of-the-art methods for 3D layout recovery from a single panorama. Our approach is the first work to use deep learning with non-central panoramas and to recover the scale of single-panorama layouts.