It is widely accepted that human expressions, accounting for roughly sixty percent of all daily interactions, are among the most authentic forms of communication. Numerous studies are exploring the importance of facial expressions and the development of machine-assisted recognition techniques. Significant progress is being made in facial and expression recognition, largely due to the rapid growth of machine learning and computer vision. A variety of algorithmic approaches exist for detecting and recognizing facial expressions and features. This study investigates optimization algorithms used with convolutional neural networks for facial expression recognition, focusing on the Adam, RMSProp, stochastic gradient descent (SGD), and AdaMax optimizers. It compares the key aspects of each optimizer, including its advantages and disadvantages, and incorporates findings from recent studies that used these optimizers in various applications, highlighting their performance in terms of training time and precision. The aim is to illuminate the process of selecting a suitable optimizer for a specific application by analysing the trade-off between training speed and accuracy. The study also provides a deeper analysis of the role optimizers play in machine learning-based facial expression recognition models, and concludes with a discussion of the technical challenges these optimizers pose and of future improvements toward better results.
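To make the comparison above concrete, the following is a minimal sketch of the textbook update rules for the four optimizers, applied to a one-dimensional toy loss rather than to a CNN. The loss function, learning rates, step counts, and decay constants are illustrative assumptions and are not taken from the study.

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2 (an assumption for illustration);
    # the minimum is at w = 3.
    return 2.0 * (w - 3.0)

def sgd(w, steps=200, lr=0.1):
    # Plain gradient descent step: w <- w - lr * g
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def rmsprop(w, steps=200, lr=0.05, rho=0.9, eps=1e-8):
    # Scales each step by a running average of squared gradients.
    v = 0.0
    for _ in range(steps):
        g = grad(w)
        v = rho * v + (1 - rho) * g * g
        w -= lr * g / (math.sqrt(v) + eps)
    return w

def adam(w, steps=200, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Combines a momentum-like first moment with an RMSProp-like second moment,
    # both bias-corrected.
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

def adamax(w, steps=200, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Adam variant that replaces the second moment with an infinity-norm estimate.
    m = u = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        u = max(b2 * u, abs(g))
        w -= (lr / (1 - b1 ** t)) * m / (u + eps)
    return w

for name, optimizer in [("SGD", sgd), ("RMSProp", rmsprop),
                        ("Adam", adam), ("AdaMax", adamax)]:
    print(name, optimizer(0.0))
```

On a real facial expression model the relative behaviour depends heavily on the architecture, data, and hyperparameters, so this toy run only illustrates how the update rules differ.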
ISBN (digital): 9798331520403
ISBN (print): 9798331520410
Haze degrades the accuracy of computer vision algorithms for railway monitoring applications. This study proposed a dehazing algorithm emphasizing image quality, performed on GPUs and CPUs rather than on IoT or low-power devices, as a CNN-based haze removal model for an edge device. Our optimized CNN model utilized the ADD operation to relieve concatenation between two convolutional layers. The model was tested as a TFLite file, with results showing acceptable SSIM and PSNR image quality assessment values. The average computational time was 210 milliseconds on an Intel Xeon processor, while a Raspberry Pi 4 Model B, operating as an offline edge inference device, achieved a processing time of 280 milliseconds per image.
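One way to see why an element-wise ADD can be cheaper than concatenation is to count the parameters of the convolution that follows the merge. The sketch below does this arithmetic; the channel count (16) and kernel size (3x3) are hypothetical and are not taken from the paper's architecture.

```python
def merged_channels(channels_a, channels_b, mode):
    """Channel count after merging two feature maps."""
    if mode == "concat":
        # Concatenation stacks the feature maps along the channel axis.
        return channels_a + channels_b
    if mode == "add":
        # Element-wise addition needs matching shapes and keeps the channel count.
        if channels_a != channels_b:
            raise ValueError("ADD requires equal channel counts")
        return channels_a
    raise ValueError(f"unknown mode: {mode}")

def conv_params(in_channels, out_channels, kernel=3):
    # Weights plus one bias per output channel for a standard 2D convolution.
    return in_channels * out_channels * kernel * kernel + out_channels

# Hypothetical layer sizes, chosen only for illustration.
concat_cost = conv_params(merged_channels(16, 16, "concat"), 16)
add_cost = conv_params(merged_channels(16, 16, "add"), 16)
print("params after concat:", concat_cost)
print("params after add:   ", add_cost)
```

Under these assumed sizes the convolution after ADD needs roughly half the weights, which is the kind of saving that matters on a device such as a Raspberry Pi.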
In contemporary industrial, robotics, and technical education settings, the efficient detection and sorting of electronic components plays a pivotal role in advancing automation and increasing efficiency. To address this need, we present "ElectroCom61," a comprehensive multi-class object detection dataset encompassing 61 commonly used electronic components. Our dataset, sourced from the electronic components collection at United International University (UIU) in Dhaka, Bangladesh, comprises 2121 meticulously annotated images. We ensured that these images reflect real-world conditions, incorporating varied lighting, backgrounds, distances, and camera angles to bolster the robustness of machine learning models trained on them. We also divided the dataset into training, validation, and test sets to facilitate deep learning model development. Additionally, we conducted minimal pre-processing to optimise model training and performance. "ElectroCom61" stands as a valuable asset for developing cutting-edge electronic component detection systems, with far-reaching applications in both education and industry. Its potential applications span from interactive educational tools to e-waste management systems and streamlined inventory management processes in electronic manufacturing and automation. The code for technical validation of this dataset is available on GitHub: https: //***/faiyazabdullah/ElectroCom61 (c) 2025 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://***/licenses/by/4.0/)
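The abstract states that the 2121 images were divided into training, validation, and test sets but does not give the proportions. The sketch below shows one common way to produce such a split; the 70/20/10 ratio, the fixed seed, and the `img_XXXX` identifiers are assumptions for illustration and may differ from the dataset's actual split.

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle reproducibly, then cut into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

# Hypothetical image identifiers standing in for the 2121 annotated images.
image_ids = [f"img_{i:04d}" for i in range(2121)]
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))
```

For an object detection dataset with 61 classes, a stratified split that balances classes across the three sets may be preferable to this purely random one.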
ISBN (print): 9780769533155
Computational grids have become an important emerging platform for high-performance computing. However, the development of grids and grid applications is still far from mature, mainly because grid-enabled computing environments remain underdeveloped. For that reason, in this paper we propose a toolbox called GrIPLab 1.0 (Grid Image Processing Laboratory), which aims to provide a high-performance image-processing platform in a grid computing environment using the gLite middleware developed in the EGEE project. GrIPLab 1.0 combines vision algorithms (the most common ones and some novel approaches) with which complex distributed vision applications can be modeled as a simple sequence of choices in a user-friendly interface. The main advantage of the presented dynamic Grid toolbox is therefore that it provides novel and comfortable access for scientific software developers and users without prior knowledge of Grid technologies or even of the underlying architecture. In this paper, we discuss the infrastructure, which provides a flexible and useful mechanism for carrying out a series of image-processing operations, and we analyze the advantages of using such a system.
Thermography is an innovative technique that creates a heatmap of different portions of the human body by using infrared (IR) radiation where color scales vary for various temperature regions. This work's scope ai...
ISBN (digital): 9798331539696
ISBN (print): 9798331539702
Diabetic Macular Edema (DME) is a disease of the retina and a major cause of vision problems; left undiagnosed, it can lead to blindness. Early detection of DME can prevent vision loss and may reduce diabetes-related problems such as cardiovascular issues. This study therefore presents a method for detecting DME from Optical Coherence Tomography (OCT) and fundus images using transfer learning. The proposed method has four stages: pre-processing, augmentation, segmentation with DeepLabV3+, and binary classification of DME. It uses two publicly available datasets: Messidor-2, which contains 1744 fundus images, and the Retinal OCT images dataset, which contains 84,495 images separated into four categories (CNV, DME, DRUSEN, NORMAL). The Convolutional Neural Network (CNN) architectures ResNet50 and VGG-19 are used to detect DME; CNNs have been used extensively in medical image analysis and classification. Using the well-known ResNet50 architecture for classification of each dataset, the proposed model yielded an accuracy of 98.79%, an F1 score of 99%, a precision of 98.43%, and a recall of 98.89%. With VGG-19, the proposed model achieved an accuracy of 98.81%, an F1 score of 98.94%, a precision of 98.1%, and a recall of 98.73%. When the two models were compared, VGG-19 gave the best accuracy.
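The four metrics reported above are all derived from the counts in a binary confusion matrix. The sketch below shows the standard definitions; the counts passed in at the end are made-up placeholders, not the study's results, and the function assumes at least one predicted positive and one actual positive so that no division by zero occurs.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics from true/false positive/negative counts."""
    precision = tp / (tp + fp)           # of predicted DME cases, how many were DME
    recall = tp / (tp + fn)              # of actual DME cases, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Placeholder counts for illustration only.
metrics = classification_metrics(tp=90, fp=10, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because accuracy alone can hide poor performance on the minority class, reporting precision, recall, and F1 alongside it, as the abstract does, gives a fuller picture.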
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Despite significant progress in recent years, Large Vision-Language Models (LVLMs) have proven vulnerable to adversarial examples. There is therefore an urgent need for an effective adversarial attack that can identify the deficiencies of LVLMs in security-sensitive applications. However, existing LVLM attackers generally optimize adversarial samples against a specific textual prompt with a specific LVLM, so the samples tend to overfit the target prompt/network and hardly remain malicious once they are transferred to attack a different prompt/model. To this end, we propose a novel Imperceptible Transfer Attack (ITA) against LVLMs that generates prompt/model-agnostic adversarial samples, enhancing adversarial transferability while further improving imperceptibility. Specifically, we learn to apply appropriate visual transformations to image inputs, creating diverse input patterns by selecting the optimal combination of operations from a pool of candidates and thereby improving adversarial transferability. We formulate the selection of optimal transformation combinations as an adversarial learning problem and employ a gradient approximation strategy with noise budget constraints to generate imperceptible, transferable samples effectively. Extensive experiments on three LVLMs and two widely used datasets with three tasks demonstrate the superior performance of our ITA.
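A noise budget constraint is commonly enforced by projecting the perturbed image back into a small neighbourhood of the clean image after each update. The sketch below shows that projection for an L-infinity budget on pixel values in [0, 1]; the choice of norm, the value of `eps`, and the function name `project_to_budget` are assumptions for illustration, since the abstract does not specify how ITA implements its constraint.

```python
def project_to_budget(clean, adversarial, eps):
    """Clip each pixel's perturbation to [-eps, eps], then to the valid range."""
    projected = []
    for c, a in zip(clean, adversarial):
        delta = max(-eps, min(eps, a - c))            # enforce the noise budget
        projected.append(min(1.0, max(0.0, c + delta)))  # keep a valid pixel value
    return projected

# Toy flattened "image" with three pixels (hypothetical values).
clean_pixels = [0.5, 0.5, 0.0]
perturbed_pixels = [0.9, 0.45, -0.2]
bounded = project_to_budget(clean_pixels, perturbed_pixels, eps=0.1)
print(bounded)
```

A smaller `eps` makes the perturbation harder to see but usually weakens the attack, which is the imperceptibility versus effectiveness trade-off the abstract refers to.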
ISBN (digital): 9798331512248
ISBN (print): 9798331512255
Lung cancer is one of the main causes of cancer-related deaths, and increasing survival rates requires early detection. This study investigates the use of sophisticated machine learning (ML) algorithms to improve the identification of lung cancer from chest X-rays and CT images. The state-of-the-art methods we use include Vision Transformers (ViT), Reinforcement Learning (RL), Generative Adversarial Networks (GANs), Meta-Learning, and Ensemble Learning. By capturing long-range dependencies in medical images, Vision Transformers (ViT) can improve accuracy by up to 85–96% compared to typical CNN models, especially in feature extraction and classification. Optimizing diagnostic workflows with Reinforcement Learning (RL) yields a 90% improvement in decision-making efficiency and adaptive learning capabilities, which greatly enhances real-time image analysis. Training datasets can be augmented with Generative Adversarial Networks (GANs), which create realistic synthetic images and increase model generalization by 90–97%; this is crucial where data is scarce or unbalanced. For uncommon or underrepresented cancer cases, meta-learning improves classification accuracy by 90% by allowing models to learn from sparsely labeled data. Ensemble learning reduces bias and variance by combining many models using approaches such as XGBoost, Bagging, and Stacking, increasing overall accuracy by 80–95%. Compared to conventional methods, key performance indicators including precision, recall, and F1 score show significant gains, with sensitivity rising by up to 90% and specificity improving by 85%. These cutting-edge algorithms greatly improve the detection of lung cancer, enabling quicker and more precise diagnoses and assisting clinicians in making decisions that benefit patients.
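The idea of combining several models can be illustrated with hard majority voting, which is simpler than the XGBoost, Bagging, and Stacking approaches named above and is shown here only as a stand-in. The three threshold "models" and the scalar input score are hypothetical; real base models would be trained classifiers operating on images.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(predictions)
    return max(counts, key=counts.get)

def ensemble_predict(models, sample):
    # Ask every base model for a label, then combine the answers by voting.
    return majority_vote([model(sample) for model in models])

# Hypothetical base models: each thresholds a scalar suspicion score differently.
models = [
    lambda score: "nodule" if score > 0.3 else "normal",
    lambda score: "nodule" if score > 0.5 else "normal",
    lambda score: "nodule" if score > 0.7 else "normal",
]

print(ensemble_predict(models, 0.6))
print(ensemble_predict(models, 0.4))
```

Voting helps most when the base models make different kinds of errors; if they all fail on the same cases, the ensemble inherits those failures.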
ISBN (digital): 9798350356236
ISBN (print): 9798350356243
Image classification is one of the fundamental tasks in digital image processing and computer vision. Achieving high accuracy is essential, but training deep neural networks for it requires substantial computational resources and a large dataset. Transfer learning is one solution that mitigates these challenges by leveraging models pre-trained on large datasets. This paper demonstrates the effectiveness of transfer learning for enhancing image classification. A comprehensive survey of recent advances investigates transfer learning strategies including feature extraction, fine-tuning, and domain adaptation. Experimental results show that transfer learning improves classification accuracy with less computational complexity. The paper also discusses practical considerations, challenges, and future research directions for applying transfer learning to image classification.
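The feature-extraction strategy mentioned above keeps the pre-trained backbone frozen and trains only a small classification head on top of its outputs. The toy sketch below mimics that structure with a fixed feature function and a perceptron head; the feature function, the data, and the learning rate are all hypothetical stand-ins, since a real setup would use a pre-trained CNN and image data.

```python
def pretrained_features(x):
    # Stand-in for a frozen pre-trained backbone (hypothetical): it is never updated.
    return [x[0] + x[1], x[0] - x[1]]

def predict(x, weights, bias):
    f = pretrained_features(x)
    score = weights[0] * f[0] + weights[1] * f[1] + bias
    return 1 if score > 0 else 0

def train_head(samples, labels, epochs=20, lr=1.0):
    """Train only the linear head with the perceptron rule; the backbone stays fixed."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(x, weights, bias)
            f = pretrained_features(x)
            weights = [w + lr * error * fi for w, fi in zip(weights, f)]
            bias += lr * error
    return weights, bias

# Toy two-class data (hypothetical): class 1 when the two inputs sum to more than 2.
samples = [(0, 0), (1, 0), (0, 1), (2, 2), (3, 1), (1, 3)]
labels = [0, 0, 0, 1, 1, 1]
weights, bias = train_head(samples, labels)
print([predict(x, weights, bias) for x in samples])
```

Fine-tuning differs in that some or all backbone parameters are also updated, usually with a small learning rate, which costs more computation but can adapt the features to the new domain.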