There is growing interest in efficient quality assessment algorithms for image super-resolution (SR). However, employing deep learning techniques, especially dual-branch algorithms, to automatical...
ISBN: (digital) 9798350368741
ISBN: (print) 9798350368758
Facial reactions convey crucial emotional information and coordinate interpersonal relationships in human dyadic interactions. While existing Multiple Appropriate Facial Reaction Generation (MAFRG) methods focus on generating multiple reasonable facial reactions, none of these approaches combines 2D and 3D facial behaviour information or accounts for the influence of individuals’ facial identities, leading to inconsistencies in the generated facial reactions and limited capability in capturing subtle variations in facial depth and expression dynamics. This paper proposes a novel Hierarchical Multimodal Decoupling-Fusion (HMDF) framework that decouples 3D facial identity from expression behaviours, eliminating identity-based interference in the reaction generation process; the decoupled features are then integrated with audio-visual features through a cross-attention mechanism. Experiments show that the framework achieves enhanced diversity and synchrony in the generated facial reactions.
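The cross-attention step mentioned in this abstract can be illustrated with a minimal sketch of plain scaled dot-product cross-attention. This is not the HMDF implementation: the abstract gives no architectural details, so the choice of expression features as queries and audio-visual features as keys and values, the toy vectors, and the absence of learned projection matrices are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: n vectors of dimension d (assumed here: expression features)
    keys:    m vectors of dimension d (assumed here: audio-visual features)
    values:  m vectors of any common dimension
    Returns (outputs, weights): n fused vectors and the n x m attention map.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        fused = [sum(wj * v[c] for wj, v in zip(w, values))
                 for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(fused)
    return outputs, weights


# Hypothetical toy features, not data from the paper.
expression_feats = [[1.0, 0.0], [0.0, 1.0]]
audio_visual_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, attn = cross_attention(expression_feats, audio_visual_feats, audio_visual_feats)
```

Each row of `attn` sums to one, so every fused vector is a convex combination of the audio-visual vectors, weighted by how well they match the corresponding query.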
In this multi-task learning study on simultaneous analysis of emotions and their underlying causes in conversational contexts, deep neural network methods were employed to effectively process and train large labeled d...
Enclosed spaces carry a higher potential for spreading the COVID-19 virus, because the virus can be carried in the air and air lingers longer in a closed space. Moreover, closed spaces are widely used: homes, schools, malls, offices, places of worship, and so on. Serious attention must therefore be given to preventing the spread of the virus in such spaces. ADX is an IoT-based tool equipped with UVC rays that can kill viruses, including the COVID-19 virus. The tool can be controlled remotely, manually, or on a timer, so that it can be activated before the closed space is used. This study used the ESP8266 microcontroller to support the development of the IoT-based ADX tool. The results show an IoT ADX system accuracy of 96.10% and an average response time of 1.47 seconds.
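The manual, remote, and timer control modes described in this abstract can be sketched as a small state holder. This is only an illustrative model, not the ADX firmware: the class `UvcController`, the helper `latest_start`, and the 30-minute exposure time are hypothetical, and no ESP8266 or network API is involved.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class UvcController:
    """Hypothetical lamp state shared by manual, remote, and timer control."""
    lamp_on: bool = False
    off_at: Optional[datetime] = None  # set only while a timer is running

    def manual_switch(self, on: bool) -> None:
        """Manual or remote command: switch immediately and cancel any timer."""
        self.lamp_on = on
        self.off_at = None

    def start_timer(self, now: datetime, minutes: float) -> None:
        """Timer mode: switch on now and schedule the switch-off."""
        self.lamp_on = True
        self.off_at = now + timedelta(minutes=minutes)

    def tick(self, now: datetime) -> None:
        """Call periodically; switches the lamp off once the timer has elapsed."""
        if self.lamp_on and self.off_at is not None and now >= self.off_at:
            self.lamp_on = False
            self.off_at = None


def latest_start(room_in_use_at: datetime, exposure_minutes: float) -> datetime:
    """Latest moment to start the lamp so exposure ends before the room is used."""
    return room_in_use_at - timedelta(minutes=exposure_minutes)


# Usage with an assumed 30-minute exposure before an 08:00 booking.
controller = UvcController()
start = latest_start(datetime(2024, 1, 1, 8, 0), 30)
controller.start_timer(start, 30)
controller.tick(start + timedelta(minutes=30))  # timer elapsed, lamp switches off
```

Keeping the timer deadline inside the controller means a manual or remote command can always override a running timer, which matches the three control modes the abstract lists.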
In the field of single object tracking, conventional methods often rely on correlation filters or visual image processing. However, these approaches typically focus solely on extracting target object features and consider only the position information in the current frame. They lack integration of contextual and spatial-temporal information, which can limit tracking *** method takes inspiration from single object detection and tracking techniques. It combines efficient model-based object detection with the spatial position of the final target in the image. This approach offers significant advantages in reducing tracking drift (deviation from the target object) and improving tracking accuracy (predicting bounding boxes that closely follow the target). To extract spatial information, we utilize a three-layer ResNet and Convolutional Block Attention Module (CBAM) in conjunction with a siamese network. This combination effectively captures the spatial characteristics of the target object. Additionally, we employ an adaptive processing head with an internal Long Short-Term Memory (LSTM) structure to capture temporal information. By integrating spatial and temporal cues, our method achieves more robust and accurate *** comprehensive comparison and ablation experiments across multiple datasets, we have demonstrated notable improvements with our method. It tightly connects the predicted bounding box with the tracking target and effectively combines spatial and temporal information for precise object tracking.
Human beings cooperatively navigate rule-constrained environments by adhering to mutually known navigational patterns, which may be represented as directional pathways or road lanes. Inferring these navigational patte...
Among the various technologies applied to indoor localization, WiFi has become a common source of information for determining a pedestrian’s position due to the widespread availability of WiFi access points in indoor environmen...
The ERSOW soccer robot, which participated in the Indonesian Wheeled Robot Soccer Contest, has many abilities such as object detection and classification, control and navigation system, self-localization and mapping, and al...
This paper presents the use of an Adaptive Boosting and Cascade Classifier based method to detect whether a person is wearing a mask across various facial poses. In general, masks are used to protect the nose and mouth to pre...
Graph neural networks (GNNs) have achieved remarkable improvements in various areas, including the fields of medical imaging and network neuroscience, where they have displayed high accuracy in diagnosing challenging neuro...