In this paper, we introduce a novel tool for speech emotion recognition, CA-SER, that borrows self-supervised learning to extract semantic speech representations from a pre-trained wav2vec 2.0 model and combine them w...
In this paper, we consider the problem of the excessive runtime and memory complexity of deep convolutional neural networks for visual emotion recognition. We survey recent compression methods and efficient neural network architectures. We experimentally compare the computational speed and memory consumption, during both training and inference, of methods such as weight matrix decomposition, binarization, and hashing. The most efficient optimization is achieved with matrix decomposition and hashing. Finally, we explore the possibility of distilling knowledge from a large neural network when only a large unlabeled sample of facial images is available.
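The weight matrix decomposition mentioned in this abstract can be illustrated with a minimal sketch. It assumes a truncated SVD of a single dense layer; the abstract does not say which decomposition the authors used. The function name `low_rank_factorize`, the layer size, and the rank are hypothetical and chosen only for illustration.

```python
import numpy as np


def low_rank_factorize(weights, rank):
    """Approximate an (out, in) weight matrix by two thin factors via truncated SVD.

    Assumption: truncated SVD stands in for the decomposition method,
    which the abstract does not specify.
    """
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (out, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, in)
    return a, b


rng = np.random.default_rng(0)

# Hypothetical 256x512 layer built to have exact rank 8, so the
# rank-8 factorization reproduces it up to floating-point error.
weights = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 512))
a, b = low_rank_factorize(weights, rank=8)

x = rng.standard_normal(512)
dense_out = weights @ x      # one large matrix-vector product
factored_out = a @ (b @ x)   # two thin products instead

params_dense = weights.size          # 256 * 512
params_factored = a.size + b.size    # 256 * 8 + 8 * 512
```

Real layers are rarely exactly low-rank, so in practice the rank trades accuracy against the savings in parameters and multiply-adds.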
This article investigates the application of machine learning techniques for predicting corporate default risk. In the credit scoring domain, the class imbalance problem is prevalent, with defaulted cases typically be...
In this paper, we describe the results of the hsemotion team in two tasks of the seventh Affective Behavior analysis in-the-wild (ABAW) competition, namely, multi-task learning for simultaneous prediction of facial ex...
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
This article presents our results for the sixth Affective Behavior analysis in-the-wild (ABAW) competition. To improve the trustworthiness of facial analysis, we study the possibility of using pre-trained deep models that extract reliable emotional features without the need to fine-tune the neural networks for a downstream task. In particular, we introduce several lightweight models based on the MobileViT, MobileFaceNet, EfficientNet, and DDAMFN architectures, trained in multi-task scenarios to recognize facial expressions, valence, and arousal on static photos. These neural networks extract frame-level features that are fed into a simple classifier, e.g., a linear feed-forward neural network, to predict emotion intensity, compound expressions, and valence/arousal. Experimental results for three tasks from the sixth ABAW challenge demonstrate that our approach significantly improves quality metrics on validation sets compared to existing non-ensemble techniques. As a result, our solutions took second place in the compound expression recognition competition.
In this article, the results of our team for the fifth Affective Behavior analysis in-the-wild (ABAW) competition are presented. The usage of the pre-trained convolutional networks from the EmotiEffNet family for fram...
This paper deals with one of the problems of recognizing the emotion from a photo gathered from in-the-wild settings, namely, facial expression recognition. We study various ensemble approaches that combine the lightw...
In this article, the pre-trained convolutional networks from the EmotiEffNet family for frame-level feature extraction are used for downstream emotion analysis tasks from the fifth Affective Behavior analysis in-the-wild (ABAW) competition. In particular, we propose an ensemble of a multi-layered perceptron and the LightAutoML-based classifier. Post-processing that smooths the results across sequential frames is implemented. Experimental results for the large-scale Aff-Wild2 database demonstrate that our model substantially outperforms the baseline facial processing based on VGGFace and ResNet. For example, our macro-averaged F1-scores of facial expression recognition and action unit detection on the testing set are 11-13% greater. Moreover, the concordance correlation coefficients for valence/arousal estimation are up to 30% higher when compared to the baseline.
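Two ideas from this abstract, smoothing predictions across sequential frames and scoring valence/arousal with the concordance correlation coefficient (CCC), can be sketched as follows. The abstract does not say which smoothing filter was used, so the moving average here is an assumption, and `smooth_frames`, `concordance_cc`, and the window size are hypothetical names and values. The CCC uses its standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²).

```python
import numpy as np


def smooth_frames(scores, window=5):
    """Moving-average smoothing of per-frame scores with shape (frames, outputs).

    Assumption: a centered box filter with edge padding; the abstract
    only says that results for sequential frames are smoothed.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so the filter is centered")
    scores = np.asarray(scores, dtype=float)
    pad = window // 2
    # Repeat the first and last frames so the output keeps the input length.
    padded = np.pad(scores, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    columns = [
        np.convolve(padded[:, j], kernel, mode="valid")
        for j in range(scores.shape[1])
    ]
    return np.stack(columns, axis=1)


def concordance_cc(y_true, y_pred):
    """Concordance correlation coefficient between two 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    # Population covariance, matching the population variance from np.var.
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2)


# Toy example: a single-frame spike is spread over its neighbours.
frame_scores = np.array([[0.0], [0.0], [1.0], [0.0], [0.0]])
smoothed = smooth_frames(frame_scores, window=3)

valence = np.array([-1.0, 0.0, 1.0])
ccc_same = concordance_cc(valence, valence)
ccc_flipped = concordance_cc(valence, -valence)
```

Unlike the Pearson correlation, the CCC also penalizes differences in mean and scale between predictions and labels, so a prediction that is merely shifted or rescaled scores below 1.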