ISBN:
(print) 9783031667428; 9783031667435
Body movements are an essential part of non-verbal communication as they help to express and interpret human emotions. The potential of Body Emotion Recognition (BER) is immense, as it can provide insights into user preferences, automate real-time exchanges and enable machines to respond to human emotions. BER finds applications in customer service, healthcare, entertainment, emotion-aware robots, and other areas. While facial expression-based techniques are extensively researched, detecting emotions from body movements in the real world presents several challenges, including variations in body posture, occlusions, and background. Recent research has established the efficacy of transformer deep-learning models beyond the language domain for video and image-related problems. A key component of transformers is the self-attention mechanism, which captures relationships among features across different spatial locations, allowing contextual information to be extracted. In this study, we aim to understand the role of body movements in emotion expression and to explore the use of transformer networks for body emotion recognition. Our method proposes a novel linear projection function for the visual transformer, which transforms 2D joint coordinates into a conventional matrix representation. Using an original method of contextual information learning, the developed approach enables more accurate recognition of emotions by establishing unique correlations between an individual's body motions over time. Our results demonstrate that the self-attention mechanism achieves high accuracy in predicting emotions from body movements, surpassing the performance of other recent deep-learning methods. In addition, the impact of dataset size and frame rate on classification performance is analyzed.
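The two ingredients named in this abstract, a linear projection of 2D joint coordinates and self-attention over time, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the flattening step, the single attention head, the random weights, and the sizes (8 frames, 17 joints, 16-dimensional tokens) are all assumptions chosen for illustration.

```python
import numpy as np

def project_joints(frames, weight, bias):
    """Flatten each frame's (J, 2) joint coordinates and map them
    linearly to one d-dimensional token per frame."""
    t, j, c = frames.shape
    flat = frames.reshape(t, j * c)           # (T, 2J)
    return flat @ weight + bias               # (T, d)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention across frames."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (T, T) frame-to-frame scores
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

# Illustrative sizes and random data (assumptions, not from the paper).
rng = np.random.default_rng(0)
T, J, D = 8, 17, 16
frames = rng.normal(size=(T, J, 2))           # 2D joint coordinates per frame
w_proj = rng.normal(scale=0.1, size=(J * 2, D))
b_proj = np.zeros(D)
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(D, D)) for _ in range(3))

tokens = project_joints(frames, w_proj, b_proj)
context, attn = self_attention(tokens, w_q, w_k, w_v)
print(tokens.shape, context.shape, attn.shape)
```

Each row of `attn` sums to one and shows how strongly a given frame attends to every other frame, which is one way correlations between body motions over time could be captured.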
In automatic feeding systems, feeding of characteristic workpieces by mechanical tools causes accuracy and cost difficulties. For this reason, in systems where special workpieces are fed, image processing applications...
Introduction: Esophageal cancer (EC) is a significant global health problem, estimated to rank 7th in incidence and 6th in mortality. Timely diagnosis and treatment are critical for improving patients' outcomes, as over 40% of patients with EC are diagnosed after metastasis. Recent advances in machine learning (ML) techniques, particularly in computer vision, have demonstrated promising applications in medical image processing, assisting clinicians in making more accurate and faster diagnostic decisions. Given the significance of early detection of EC, this systematic review aims to summarize and discuss the current state of research on ML-based methods for the early detection of *** conducted a comprehensive systematic search of five databases (PubMed, Scopus, Web of Science, Wiley, and IEEE) using search terms such as "ML", "Deep Learning (DL)", "Neural Networks (NN)", "Esophagus", "EC" and "Early Detection". After applying inclusion and exclusion criteria, 31 articles were retained for full *** results of this review highlight the potential of ML-based methods in the early detection of EC. The average accuracy of the reviewed methods in the analysis of endoscopic and computed tomography (CT) images of the esophagus was over 89%, indicating a high impact on early detection of EC. Additionally, white light imaging (WLI) images made up the largest share of the clinical images used for ML-based early detection of EC. Among all ML techniques, methods based on convolutional neural networks (CNN) achieved higher accuracy and sensitivity in the early detection of EC compared to other *** findings suggest that ML methods may improve accuracy in the early detection of EC, potentially supporting radiologists, endoscopists, and pathologists in diagnosis and treatment planning. However, the current literature is limited, and more studies are needed to investigate the clinical applications of these met
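The review above compares methods by accuracy and sensitivity. As a reminder of how these two metrics are usually derived from a binary confusion matrix, here is a small sketch. The counts are hypothetical and are not taken from any of the reviewed studies.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all cases (positive and negative) classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def sensitivity(tp, fn):
    """Fraction of true positive cases that the classifier detects (recall)."""
    return tp / (tp + fn)

# Hypothetical counts for illustration only.
tp, fn = 40, 10   # diseased cases: detected vs. missed
tn, fp = 45, 5    # healthy cases: correctly cleared vs. false alarms

acc = accuracy(tp, fp, tn, fn)
sens = sensitivity(tp, fn)
print(f"accuracy={acc:.2f}, sensitivity={sens:.2f}")
```

In an early-detection setting the two can diverge noticeably: a model may score well on accuracy while still missing a meaningful share of true cases, which is why sensitivity is reported alongside it.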
On the battlefield, early detection of armored vehicles can have a positive effect because it allows timely and appropriate reactions. The purpose of this study is to develop the algorithm required in the vehicle control system, using car sensor vision, and to identify the equipment needed to control a military drone based on car sensor vision. Today, the use of wireless networks, especially inter-vehicle wireless networks, in military applications is inevitable. Therefore, in the first step of this research, a new method is proposed to control and steer unmanned vehicles based on car vision. The proposed method uses images recorded by two 180-degree panoramic cameras with horizontal vision. The simulation results of the proposed method show increased accuracy and reduced implementation cost compared to using LIDAR and RADAR technologies. In the second step, a new approach is introduced to identify four common classes of armored vehicles (tanks, personnel carriers, firing tanks, and military vehicles) that are more likely to be present on battlefields. For this purpose, the latest image processing methods, namely deep learning, are used. The simulation results show that the proposed approach detects armored vehicles with high accuracy in a short time. In the third step of this research, a new method is proposed to improve the connectivity of wireless networks. The proposed method uses queueing theory, and the simulation results show its high efficiency. As a result, accurate and fast detection with unique features gives users of the system an advantage.
The non-trivial task of designing an automatic picture content recognition system has been researched for several applications, including human identification, face detection, and face recognition. Digital image proce...
Video frame prediction represents a fundamental challenge in computer vision, necessitating precise modeling of both spatial and temporal dynamics within video sequences. This computational task holds substantial implications across diverse domains, including video compression optimization, robust object tracking systems, and advanced motion forecasting applications. In this investigation, we present a novel hybrid architecture that synthesizes the complementary strengths of Convolutional Long Short-Term Memory (ConvLSTM) networks and three-dimensional Convolutional Neural Networks (3D CNN) for enhanced frame prediction capabilities. Our methodological framework incorporates a ConvLSTM component that fundamentally augments the traditional LSTM architecture through the integration of convolutional operations, thereby facilitating sophisticated modeling of sequential dependencies. Concurrently, the 3D CNN component employs volumetric convolutional layers to extract rich spatio-temporal features from the input sequences. Rigorous empirical evaluation demonstrates the superior performance of the ConvLSTM architecture, which consistently yields reduced validation errors and elevated coefficients of determination. Specifically, the ConvLSTM model achieves a validation Mean Squared Error (MSE) of 0.0237 and an R² value of 0.6951, substantially outperforming the 3D CNN model, which exhibits a validation MSE of 0.0471 and an R² value of 0.3939. These empiri
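The two metrics reported in this abstract, validation MSE and the coefficient of determination R², follow standard definitions: MSE is the mean of the squared errors, and R² is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes both on a few made-up pixel intensities. The values are illustrative only and have no connection to the paper's data.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up normalized pixel intensities (illustration only).
y_true = [0.0, 0.5, 1.0, 0.5]
y_pred = [0.1, 0.4, 0.9, 0.6]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(f"MSE={mse:.4f}, R2={r2:.4f}")
```

A lower MSE and a higher R² both indicate predictions closer to the target frames. R² additionally normalizes by the variance of the targets, which makes it easier to compare across datasets.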
Automatic crack detection is a challenging task that has been researched for decades due to the complexity of civil structures. Cracks on any structure are early signs of the deterioration of the object's surface. Therefore, detection and regular maintenance of cracks are necessary tasks, as the propagation of cracks results in severe damage. Manual inspection is based on the expert's previous knowledge, and it can only be done in areas reachable by humans. On the other hand, autonomous detection of cracks using image-based techniques may reduce human errors and be less time-consuming and more economical than human-based inspection for real-time crack detection, since movable cameras can capture images of non-reachable areas. Several techniques are available for crack detection; however, only image-based crack detection techniques are analyzed in this survey. A detailed study is carried out to define the research problems and advancements in this area. This article analyses pure image processing techniques and learning-based techniques based on their objectives, methods, level of efficiency, level of errors, and type of crack image dataset. In addition, the applications, limitations, and other factors are explained for each technique. Moreover, the presented analysis shows the multiple problems related to cracks, which could help researchers perform further research.
Style transformation on face images has traditionally been a popular research area in the field of computer vision, and its applications are quite extensive. Currently, the more mainstream schemes include Generative A...
ISBN:
(print) 9789819785070; 9789819785087
Artificial Intelligence Generated Content (AIGC) has experienced significant advancements, particularly in the areas of natural language processing and 2D image generation. However, the generation of three-dimensional (3D) content from a single image still poses challenges, particularly when the input image contains complex backgrounds. This limitation hinders the potential applications of AIGC in areas such as human-machine interaction, virtual reality (VR), and architectural design. Despite the progress made so far, existing methods face difficulties when dealing with single images that have intricate backgrounds. Their reconstructed 3D shapes tend to be incomplete or noisy, or to lack parts of the geometric structure. In this paper, we introduce a 3D generation framework for indoor scenes that generates realistic and visually pleasing 3D geometric shapes from a single image, without requiring point clouds, multi-view images, depth, or masks as input. The main idea of our method is clustering-based 3D shape learning and prediction, followed by shape deformation. Since indoor scenes tend to contain more than one object, our framework simultaneously generates multiple objects and predicts the layout with a camera pose, as well as 3D object bounding boxes for holistic 3D scene understanding. We have evaluated the proposed framework on benchmark datasets including ShapeNet, SUN RGB-D and Pix3D, and state-of-the-art performance has been achieved. We have also given examples to illustrate immediate applications in virtual reality.
In the rapidly evolving landscape of computer vision and artificial intelligence, transfer learning has emerged as a powerful tool for efficiently applying pre-trained models to new tasks. This article delves into the...