The battery is one of the most important energy sources. The most popular types of battery for energy storage are lithium batteries. However, due to its high energy storage value and long service life, there are also ...
This systematic literature review explores the application of transformer models in early detection of human depression, encompassing text, audio, and video data modalities. Transformer architectures, notably BERT for text, have proven adept at capturing crucial contextual and linguistic patterns associated with depression. For audio and video data, hybrid approaches that combine transformer models with other architectures are prevalent. Key features considered include eye gaze, head pose, facial muscle movements, and audio characteristics such as MFCC and Log-mel Spectrogram, along with text embeddings. Performance comparisons underscore the superiority of text-based data in consistently delivering the most promising results, followed by audio and video modalities when utilizing transformer models. The fusion of multiple modalities emerges as an effective strategy for enhancing predictive accuracy, with the amalgamation of audio, video, and text data yielding the most precise outcomes. However, it is noteworthy that unimodal approaches also exhibit potential, with text data exhibiting superior performance over audio and video data. Nevertheless, several challenges persist in this research domain, including imbalanced datasets, the limited availability of comprehensive and diverse samples, and the inherent complexities in interpreting visual cues. Addressing these challenges remains imperative for the continued advancement of depression detection using transformer-based models across various modalities.
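The fusion of modalities mentioned above can take many forms, and the abstract does not say which ones the reviewed papers use. As a minimal sketch of one possibility, assuming decision-level (late) fusion, the snippet below averages per-modality probabilities for the depressed class. The function name `late_fusion`, the modality keys, and the weights are all hypothetical and chosen for illustration only.

```python
def late_fusion(probabilities, weights=None):
    """Combine per-modality probabilities into one score by weighted averaging.

    probabilities: dict mapping a modality name to the probability that the
                   subject is depressed, as predicted by that modality's model.
    weights:       optional dict with one non-negative weight per modality;
                   equal weights are assumed when omitted.
    """
    if not probabilities:
        raise ValueError("need at least one modality")
    if weights is None:
        weights = {modality: 1.0 for modality in probabilities}
    total = sum(weights[modality] for modality in probabilities)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(probabilities[m] * weights[m] for m in probabilities) / total


# Hypothetical outputs of three unimodal models for one subject.
scores = {"text": 0.8, "audio": 0.6, "video": 0.4}
fused_equal = late_fusion(scores)
# Trust the text model more, reflecting the review's finding that text performs best.
fused_weighted = late_fusion(scores, {"text": 2.0, "audio": 1.0, "video": 1.0})
```

Weighting the text model more heavily is only one way to reflect the reported ranking of modalities; feature-level fusion inside a shared transformer is an equally plausible reading.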
Major Depressive Disorder (MDD) is a prevalent mental disorder, with estimates reaching 300 million cases worldwide. Currently, its diagnosis relies heavily on subjective assessments based on the experience of medical professionals. Researchers have therefore turned to deep learning models to detect depression. The objective of this review is to gather information on detecting depression from facial expressions in videos using deep learning techniques. Overall, the review found that RNN models achieved a mean absolute error (MAE) of 7.22 on AVEC2014, LSTM models achieved an MAE of 4.83 on DAIC-WOZ, and GRU models achieved an accuracy of 89.77% on DAIC-WOZ. Features such as Facial Action Units (FAU), eye gaze, and landmarks show great potential and need further analysis, for example through feature engineering, to improve results. Aggregation methods, such as mean calculation, are recommended as effective approaches for data processing. This Systematic Literature Review found that facial expressions do have relevant patterns related to MDD.
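Two of the quantities named above are easy to make concrete. The sketch below assumes that each video is represented as a list of per-frame feature vectors (for example, FAU intensities) and that depression severity is a numeric score. The function names and the toy numbers are hypothetical and are not taken from the reviewed papers.

```python
from statistics import fmean


def aggregate_mean(frames):
    """Collapse per-frame feature vectors into one video-level vector
    by averaging each feature across all frames."""
    if not frames:
        raise ValueError("need at least one frame")
    return [fmean(column) for column in zip(*frames)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


# Two frames, two features each (toy values).
video_vector = aggregate_mean([[1.0, 2.0], [3.0, 4.0]])
# Toy severity scores: errors of 2 and 3 average to 2.5.
error = mean_absolute_error([10, 20], [12, 17])
```

A lower MAE means predicted severity scores are closer to the reference scores, so the reported 4.83 and 7.22 are only comparable within the same dataset and scoring scale.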
Cervical cancer is one of the most prevalent medical disorders globally and a leading cause of death. Early detection, particularly through Pap tests, plays a vital role in its prevention. Previous studies have leveraged machine learning and deep learning techniques to classify the medical images obtained from Pap tests. In this study, a Systematic Literature Review methodology was used to examine 15 relevant papers retrieved from Google Scholar queries and filtered in four stages: identification, screening, eligibility, and inclusion. This study addresses two research questions regarding the datasets and deep learning techniques used for classifying Pap smear images in recent years. The performance of the models is analyzed and potential areas for improvement are suggested. The findings reveal that the Herlev University Hospital and SIPaKMeD datasets are the most used. The methodologies used by researchers range from machine learning techniques and transfer learning with Convolutional Neural Networks to state-of-the-art models with novel optimization methods. While there are exciting opportunities in the field, challenges include model generalization and interpretability.
Convolutional Neural Networks (CNNs) have shown promise in accurately discriminating between normal and tumorous brain tissues. The review focuses on the different CNN models, pre-processing methods, data augmentation, and Transfer Learning (TL) strategies used in this research. This Systematic Literature Review (SLR) collected its data from Google Scholar. The results indicate that open-source datasets from Kaggle and Brain MRI Images for Brain Tumor Detection are the most used datasets. However, limited data and class imbalance remain common challenges across datasets. Potential solutions include using larger datasets, oversampling, Generative Adversarial Networks (GANs), federated learning, and Self-Supervised Learning (SSL). Additionally, popular CNN architectures for brain tumor classification rely extensively on pre-trained models such as VGG16, VGG19, DenseNet121, DenseNet201, GoogLeNet, ResNet-50, and Inception-v3. TL strategies are preferred because they allow CNNs to leverage knowledge from large datasets, improving generalization even with limited labeled data.
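Of the remedies listed for class imbalance, oversampling is the simplest to illustrate. The following is a minimal sketch of random oversampling, assuming samples are held in plain Python lists; `random_oversample` and the toy labels are hypothetical, and the reviewed papers may use other variants.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw with replacement to fill the gap to the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy example: one "tumor" scan versus three "normal" scans.
scans = ["scan_a", "scan_b", "scan_c", "scan_d"]
classes = ["tumor", "normal", "normal", "normal"]
balanced_scans, balanced_classes = random_oversample(scans, classes)
counts = Counter(balanced_classes)
```

Oversampling should be applied to the training split only; duplicating samples before the train/test split leaks copies of training images into the test set.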
Incremental dataflow analysis is a conventional technique adopted in syntax-directed editors, popularly used in Integrated Development Environments (IDEs). However, dataflow anomaly detection during program editing in...
Education stands out as one of the most impactful applications of the metaverse, holding immense potential for the future. Within the realm of satellite communication system science education, the integration of immer...
ISBN (digital): 9781665464543
ISBN (print): 9781665464550
Access point (AP) security has become increasingly important as wireless local area networks (WLANs) proliferate in industrial environments. Rogue APs are often used by attackers to conduct man-in-the-middle (MiTM) attacks. They can redirect users to malicious servers or do eavesdropping and manipulation of their ***. In this paper, we propose a novel one-class machine learning model to passively identify rogue APs in industrial environments. The implementation of the model is twofold. First, we passively extract the hardware and software characteristics of the evaluated AP according to its generated messages. This results in a comprehensive feature set that captures both low-level and high-level behaviors of the evaluated AP. Second, we apply a one-class machine learning model to identify APs that significantly deviate from the previously known profile of legitimate APs. The combined evaluation of hardware and software behaviors integrated with an outlier detection scheme to effectively identify rogue APs is the insight of our proposal. We demonstrate the feasibility of our model, achieving an F1 score of 0.89 and a true positive rate of 0.9 in experiments conducted on our new publicly available dataset of 357 unique AP behaviors.
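The abstract does not name the one-class model used. To illustrate the general idea of learning a profile from legitimate APs only and flagging deviations, the sketch below substitutes a deliberately simple per-feature z-score test. It is not the authors' method; the class `ZScoreProfile`, the two features, and the threshold of 3.0 are assumptions made for illustration.

```python
from statistics import fmean, pstdev


class ZScoreProfile:
    """Toy one-class detector: fit on legitimate samples only, then flag any
    sample whose largest per-feature z-score exceeds a threshold."""

    def fit(self, legitimate):
        columns = list(zip(*legitimate))
        self.means = [fmean(column) for column in columns]
        # Fall back to 1.0 for constant features to avoid dividing by zero.
        self.stds = [pstdev(column) or 1.0 for column in columns]
        return self

    def score(self, sample):
        return max(
            abs(value - mean) / std
            for value, mean, std in zip(sample, self.means, self.stds)
        )

    def is_outlier(self, sample, threshold=3.0):
        return self.score(sample) > threshold


# Hypothetical features per AP: [beacon interval in ms, clock skew in ppm].
legitimate_aps = [
    [100.0, 1.0], [102.0, 1.2], [98.0, 0.8], [101.0, 1.1], [99.0, 0.9],
]
profile = ZScoreProfile().fit(legitimate_aps)
known_like = profile.is_outlier([100.5, 1.05])  # close to the profile
suspicious = profile.is_outlier([130.0, 1.0])   # far from the profile
```

The key property shared with the approach described above is that no rogue-AP examples are needed for training; only the decision threshold controls the trade-off between missed rogue APs and false alarms.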
Learning to assemble geometric shapes into a larger target structure is a pivotal task in various practical applications. In this work, we tackle this problem by establishing local correspondences between point clouds of part shapes at both coarse and fine levels. To this end, we introduce Proxy Match Transform (PMT), an approximate high-order feature transform layer that enables reliable matching between mating surfaces of parts while incurring low costs in memory and compute. Building upon PMT, we introduce a new framework, dubbed Proxy Match TransformeR (PMTR), for the geometric assembly task. We evaluate the proposed PMTR on Breaking Bad, a large-scale benchmark dataset for 3D geometric shape assembly, and demonstrate its superior performance and efficiency compared to state-of-the-art methods. Project page: https://***/pmtr. Copyright 2024 by the author(s)
ISBN (digital): 9798350374889
ISBN (print): 9798350374896
Palm vein pattern recognition offers a unique personal identification feature. Unfortunately, these techniques typically require a Near Infrared (NIR) camera sensor to extract the individual's venous pattern, which hinders their wide deployment. This paper proposes a new feasible palm vein verification scheme using a Deep Autoencoder and a Siamese Network, implemented in three steps. First, we capture the individual's palm using a traditional visible spectrum camera sensor and perform preprocessing to correct imprecise positioning, reducing the need for palm-support accessories. Second, we eliminate the NIR sensor requirement by fine-tuning a Deep Autoencoder model to convert images from the visible spectrum to their infrared counterparts. Third, the generated images are processed by a lightweight Siamese network using a contrastive loss function for individual verification. Experiments conducted on a publicly available dataset with over a hundred individuals confirmed the feasibility of our proposal. Our scheme reaches a true-negative rate of up to 0.97, only 0.01 lower than traditional NIR-based approaches. In addition, individual identification can be conducted in less than 6 seconds in a resource-constrained environment thanks to our lightweight model implementation.
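The contrastive loss mentioned in the third step has several common formulations, and the abstract does not state which one is used. The sketch below shows one widely used form on plain Python lists, assuming Euclidean distance between the two embeddings, a label that is true for pairs from the same individual, and a margin of 1.0. The function names and the verification threshold are hypothetical.

```python
import math


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """One common contrastive loss for a pair of embeddings.

    Genuine pairs (same=True) are penalized by their squared distance;
    impostor pairs are penalized only while closer than the margin.
    """
    distance = math.dist(emb_a, emb_b)
    if same:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def verify(emb_a, emb_b, threshold=0.5):
    """Accept the claimed identity when the embeddings are close enough.
    The threshold is an assumed value that would be tuned on validation data."""
    return math.dist(emb_a, emb_b) < threshold


genuine_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True)     # distance 5
impostor_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False)   # beyond margin
impostor_near = contrastive_loss([0.0], [0.5], same=False)            # inside margin
```

Training with this loss pulls embeddings of the same palm together and pushes different palms at least a margin apart, which is what makes a simple distance threshold usable at verification time.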