With the emergence of AI for good, there has been increasing interest in building inclusive AI solutions based on data-driven deep learning for computer vision. Sign Language Recognition (SLR), which has gained attention recently, is an essential component of a sign-to-text translation system to support the deaf and hard-of-hearing population. This paper presents a computer VISIOn data-driven deep learning framework for Sign Language video Recognition (VisioSLR). VisioSLR provides a precise measurement of translating signs for developing an end-to-end computational translation system. The scarcity of sign language datasets hinders the development of accurate recognition models: gathering such a dataset would require an enormous amount of time to collect and annotate videos across different environmental setups and multiple signers, in addition to model training time. We therefore evaluate our framework by fine-tuning the well-known YOLO models, which are built from collections of images and videos unrelated to signs, using a small sign language dataset. Numerical evaluations show that VisioSLR recognizes signs with a mean average precision of 97.4%, 97.1%, and 95.5% and recognition times of 11, 12, and 12 milliseconds on YOLOv8m, YOLOv9m, and YOLOv11m, respectively.
Authors:
Munir, Adnan; Siddiqui, Abdul Jabbar; Hossain, M. Shamim; El-Maleh, Aiman
Computer Engineering Department, Dhahran 31261, Saudi Arabia
KFUPM, SDAIA-KFUPM Joint Research Center for Artificial Intelligence, IRC for Intelligent Secure Systems and Computer Engineering Department, Dhahran 31261, Saudi Arabia
King Saud University, College of Computer and Information Sciences, Research Chair of Pervasive and Mobile Computing, Department of Software Engineering, Riyadh 12372, Saudi Arabia
KFUPM, Computer Engineering Department, Information and Computer Science Department, IRC for Intelligent Secure Systems, Dhahran 31261, Saudi Arabia
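Mean average precision, the detection metric reported in the VisioSLR abstract above, is built on matching predicted boxes to ground-truth boxes by intersection over union (IoU). The following is a minimal sketch of that matching step only. The box format `(x1, y1, x2, y2)`, the function names, and the 0.5 threshold are assumptions for illustration and are not taken from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction counts as correct when its IoU reaches the threshold.

    The 0.5 default is a common convention, assumed here.
    """
    return iou(pred, truth) >= threshold
```

Averaging precision over recall levels and over sign classes would then give a mean average precision; that aggregation is omitted here.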
With the widespread adoption of unmanned aerial vehicles (UAVs) in various applications (e.g., aerial transportation, traffic monitoring), there have been apprehensions regarding the associated risks of employing UAVs...
ISBN (digital): 9798331522216
ISBN (print): 9798331522223
Water leakage in distribution networks is a significant challenge, especially in regions with limited infrastructure like Huancayo, Peru, where losses account for 32.82% of the distributed volume. This study introduces a machine learning-based approach to detect leaks using four algorithms: Autoencoder LSTM, Isolation Forest, One-Class SVM, and K-Nearest Neighbors (KNN). The methodology involved preprocessing historical consumption data (2018–2024) into 12-month temporal sequences per client and evaluating the models based on F1 Score, Precision, and Mean Absolute Error (MAE). Among the algorithms, the Autoencoder LSTM demonstrated superior performance with the highest precision (0.89) and the lowest MAE (0.00402). Its robustness in high-variability contexts enables early and reliable leak detection, providing a cost-effective solution for optimizing water management in resource-constrained environments.
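The preprocessing and evaluation steps described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the sliding-window step, and the binary label convention (1 = leak) are assumptions, and the abstract does not say whether the 12-month sequences overlap.

```python
from typing import Dict, List, Sequence


def monthly_windows(series: Sequence[float], length: int = 12,
                    step: int = 1) -> List[List[float]]:
    """Cut one client's monthly consumption history into fixed-length sequences.

    step=1 gives overlapping windows; step=length gives disjoint years.
    """
    return [list(series[i:i + length])
            for i in range(0, len(series) - length + 1, step)]


def precision_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, and F1 for binary labels (1 = leak, assumed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


def mean_absolute_error(actual: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For an autoencoder, `mean_absolute_error` would typically compare each input window with its reconstruction, and windows with a large error would be flagged as candidate leaks.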
Sperm morphology measurement is vital for diagnosing male infertility, which involves quantification of multiple subcellular parts for each sperm. Instance-aware part segmentation networks have been introduced to addr...
In autonomous driving, accurately predicting the movements of other traffic participants is crucial, as it significantly influences a vehicle’s planning processes. Modern trajectory prediction models strive to interp...
ISBN (digital): 9798331530983
ISBN (print): 9798331530990
Accurate segmentation of the ventricular structures and myocardium from Cardiac Magnetic Resonance (CMR) images is essential to diagnose and manage cardiovascular diseases. This study systematically evaluates the performance of five U-Net variants in cardiac MRI segmentation using the Automated Cardiac Diagnosis Challenge (ACDC) dataset and a hybrid loss function combining Cross-Entropy and Dice losses. Among the variants, the Feature Pyramid U-Net achieved the best performance, with Dice coefficients of 0.9388 (Left Ventricle), 0.8759 (Right Ventricle), and 0.8426 (Myocardium), showcasing its superior ability to capture multi-scale features and segment complex anatomical structures. The comprehensive and standardized evaluation conducted in this study provides valuable insights into the strengths and limitations of these architectures for cardiac segmentation.
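A hybrid of Cross-Entropy and Dice losses, as mentioned above, can be sketched for a single binary mask as follows. This is an illustration under stated assumptions: the equal weighting (`ce_weight=0.5`), the binary single-structure setting, the flattened-list input, and all function names are hypothetical, since the abstract does not give the study's exact formulation.

```python
import math
from typing import Sequence


def dice_coefficient(pred: Sequence[float], target: Sequence[float],
                     eps: float = 1e-7) -> float:
    """Dice = 2|A∩B| / (|A| + |B|) over flattened masks or probabilities."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2 * intersection + eps) / (sum(pred) + sum(target) + eps)


def cross_entropy(probs: Sequence[float], target: Sequence[float],
                  eps: float = 1e-7) -> float:
    """Binary cross-entropy averaged over pixels; eps guards against log(0)."""
    total = sum(t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
                for p, t in zip(probs, target))
    return -total / len(probs)


def hybrid_loss(probs: Sequence[float], target: Sequence[float],
                ce_weight: float = 0.5) -> float:
    """Weighted sum of cross-entropy and (1 - Dice); the weight is assumed."""
    dice_loss = 1.0 - dice_coefficient(probs, target)
    return ce_weight * cross_entropy(probs, target) + (1 - ce_weight) * dice_loss
```

The Dice term rewards overlap with the target region regardless of its size, which is one common motivation for pairing it with a per-pixel loss when structures such as the myocardium occupy few pixels.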
Aligning Large Language Models (LLMs) with human values and away from undesirable behaviors (such as hallucination) has become increasingly important. Recently, steering LLMs towards a desired behavior via activation ...
Organizations are increasingly moving towards the cloud computing paradigm, in which an on-demand access to a pool of shared configurable resources is provided. However, security challenges, which are particularly exa...
ISBN (digital): 9798331535100
ISBN (print): 9798331535117
Requirements engineering commonly employs UML Use Case Diagrams (UCD) to visually capture system interactions and functionality, facilitating clear communication between stakeholders. Recognizing and extracting semantic information from UCDs is essential for applications such as automated requirements extraction and system design validation; it improves software analysis accuracy and streamlines model understanding for both developers and stakeholders. Recent advancements in large language models (LLMs) with visual processing capabilities enable the interpretation of intricate diagrammatic content. This paper evaluates how accurately multi-modal LLMs, specifically GPT-4o and GPT-4o-mini, identify semantic elements within UCDs. We conducted experiments on a new dataset of UCDs and other diagrams collected from online sources. Experimental results show that both models struggled to accurately identify and interpret key UCD elements, often misclassifying or overlooking essential ones.