ISBN: (print) 9781510666931; 9781510666948
Remote sensing scene classification has been extensively studied for its critical roles in geological survey, oil exploration, traffic management, earthquake prediction, wildfire monitoring, and intelligence monitoring. In the past, machine learning (ML) methods for this task mainly used backbones pretrained with supervised learning (SL). Because Masked Image Modeling (MIM), a self-supervised learning (SSL) technique, has been shown to be a better way of learning visual feature representations, it presents a new opportunity for improving ML performance on scene classification. This research explores the potential of MIM-pretrained backbones on four well-known classification datasets: Merced, AID, NWPU-RESISC45, and Optimal-31. Compared to published benchmarks, we show that MIM-pretrained Vision Transformer (ViT) backbones outperform other alternatives (by up to 18% in top-1 accuracy) and that the MIM technique learns better feature representations than its supervised learning counterparts (by up to 5% in top-1 accuracy). Moreover, we show that general-purpose MIM-pretrained ViTs can achieve performance competitive with the specially designed yet complicated Transformer for Remote Sensing (TRS) framework. Our experimental results also provide a performance baseline for future studies.
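As a rough illustration of the masking step that MIM pretraining relies on, the sketch below hides a random subset of image patches and scores a reconstruction only on the hidden ones. It is not the authors' pipeline: the function names, the 75% mask ratio, and the 224x224 image with 16x16 patches are all assumptions chosen for illustration.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=0):
    """Return a list of booleans, True where a patch is hidden from the encoder.

    mask_ratio=0.75 is an assumed value, not one taken from the abstract.
    """
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def masked_mse(pred, target, mask):
    """Mean squared error over masked patches only.

    Each patch is a list of floats; visible patches do not contribute.
    """
    total, count = 0.0, 0
    for p, t, m in zip(pred, target, mask):
        if m:
            total += sum((a - b) ** 2 for a, b in zip(p, t))
            count += len(p)
    return total / count if count else 0.0


# Assumed geometry: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
mask = random_patch_mask(196)
```

Restricting the loss to masked patches is what forces the backbone to infer missing content from context, which is the property the abstract credits for the stronger feature representations.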
ISBN: (print) 9789819780303
The proceedings contain 128 papers. The special focus in this conference is on Data Science, Machine Learning and Applications. The topics include: Digitization of Monuments – An Impact on the Tourist Experience with Special Reference to Hampi; Resume Parser Using Machine Learning; IOT Based Smart Hydroponics System; Comparative Study of Machine Learning and Deep Learning Techniques for Cancer Disease Detection; High Thruput Modulation Approaches Used in Next Generation WiF’s Under Multi-impairments Environments with MATLAB Codes; Skin Disease Detection; Root Vegetable Crop Recommendation System Based on Soil Properties and Environmental Factors; Deep Learning Model Development for an Automatic Healthcare Edge Computing Application; Empathetic Conversations in Mental Health: Fine-Tuning LLMs for Supportive AI Interactions; Exploring Block Chain Technology with Applications, and Future Prospects; A Comprehensive Review of Soft Computing Enabled Techniques for IoT Security: State-of-the-Art and Challenges Ahead; Performance Analysis of Machine Learning Algorithms on Imbalanced Datasets Using SMOTE Technique; An AI Based Nutrient Tracking and Analysis System; Power Saving Mechanism for Street Lights System Using IoT; Automatic Login System Using ATTINY85 IC; Forecasting Stock Prices: A Comparative Analysis of Machine Learning, Deep Learning, and Statistical Approaches; Smart Vision Bot; Robots in Logistics: Apprehension of Current Status and Future Trends in Indian Warehouses; Smart Healthcare: Enhancing Patient Well-Being with IoT; Detection of B-ALL Using CNN Model and Deep Learning; A Comprehensive Analysis for Advancements and Challenges in Deep Learning Models for Image Processing; A Comprehensive Survey on Enhancing Patient Care Through Deep Learning and IoT-Enabled Healthcare Innovations; Attention-Based Image Caption Generation.
Transformer-based Deep Neural Network architectures have gained tremendous interest due to their effectiveness in various applications across the Natural Language Processing (NLP) and Computer Vision (CV) domains. These models are the de facto choice in several language tasks, such as Sentiment Analysis and Text Summarization, replacing Long Short-Term Memory (LSTM) models. Vision Transformers (ViTs) have shown better model performance than traditional Convolutional Neural Networks (CNNs) in vision applications while requiring significantly fewer parameters and less training time. Designing a neural architecture for a given task and dataset is extremely challenging, as it requires expertise in several interdisciplinary areas such as signal processing, image processing, optimization and allied fields. Neural Architecture Search (NAS) is a promising technique for automating the architectural design process of a Neural Network in a data-driven way using Machine Learning (ML) methods. The search method explores several architectures without requiring significant human effort, and the searched models outperform manually built networks. In this paper, we review Neural Architecture Search techniques targeting the Transformer model and its family of architectures, such as Bidirectional Encoder Representations from Transformers (BERT) and Vision Transformers. We provide an in-depth literature review of approximately 50 state-of-the-art Neural Architecture Search methods and explore future directions in this fast-evolving class of problems.
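To make the idea of searching over Transformer architectures concrete, here is a minimal sketch of random search, the simplest NAS baseline, under a parameter budget. It is not one of the surveyed methods: the search space values, the `evaluate` callback, and the function names are hypothetical, and the parameter estimate counts only the attention and MLP weight matrices of each encoder layer.

```python
import random

# Hypothetical search space; the candidate values are illustrative only.
SEARCH_SPACE = {
    "depth": [4, 6, 8, 12],
    "embed_dim": [128, 192, 256, 384],
    "num_heads": [2, 4, 8],
    "mlp_ratio": [2, 4],
}


def approx_params(arch):
    """Estimate encoder weights: 4*d^2 for attention plus 2*d*(r*d) for the MLP, per layer.

    Biases, layer norms, and embeddings are ignored.
    """
    d = arch["embed_dim"]
    per_layer = 4 * d * d + 2 * d * (arch["mlp_ratio"] * d)
    return arch["depth"] * per_layer


def random_search(evaluate, max_params, trials=50, seed=0):
    """Sample architectures at random and keep the best one within the budget.

    `evaluate` is a caller-supplied scoring function (for example, validation
    accuracy after a short training run); higher is better.
    """
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(trials):
        arch = {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
        if arch["embed_dim"] % arch["num_heads"] != 0:
            continue  # heads must split the embedding evenly
        if approx_params(arch) > max_params:
            continue  # over the parameter budget
        score = evaluate(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score


# Toy proxy score so the sketch runs without training anything.
best_arch, best_score = random_search(lambda a: a["depth"], max_params=5_000_000)
```

In practice the expensive part is `evaluate`; many NAS methods differ mainly in how they reduce or approximate that cost.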
The exploration of sentiments through facial expressions is a captivating domain with applications across security, healthcare, and human–computer interaction, where understanding sentiments is primarily about interp...
ISBN: (print) 9798350372977; 9798350372984
This paper aims to design and implement an ML-based approach to learn from NeuroAqua, the AI- and IoT-based aquaponics system set up in our previous research in both a lab setting and at the larger-scale Ouroboros Aquaponics Farm (Half Moon Bay, CA), to enhance system stability and efficiency. Using the data gathered from the wireless sensors, a structured database was formed to store the aquaponics environmental conditions, water quality, nutrient components, and plant images. We used the ML model to find the factors with the largest impact on plant growth and their optimal levels. First, computer vision with image processing was applied to develop automatic plant growth monitoring and to measure plant growth rate, the target variable for ML, more accurately and automatically. Then feature engineering on the input variables was performed to enhance model performance and accuracy for a smaller dataset. ML algorithms including Linear Regression, Bagging Regressor, Decision Tree, Random Forest, XGBoost and Artificial Neural Network were applied and evaluated based on key performance metrics. The findings show that XGBoost outperformed the other models with 91.6% accuracy and also had the lowest MAE. Random Forest came second with 90.9% accuracy, followed by Bagging Regressor with 88.5% accuracy. Lastly, according to the feature importance analysis conducted on the best model, XGBoost, Nitrogen had the largest impact on plant growth, followed by Nitrate, Nitrite, Light, and Phosphorus. Hence the initial results recommend closely monitoring these top factors together with plant growth in NeuroAqua's monitoring applications.
In the course of modernization of camera-based imaging and image analysis for accelerator hardware and beam control at the ELSA facility, a distributed image processing approach was implemented, called FGrabbit. We ut...
The practical deployment of machine vision presents particular challenges for resource-constrained edge devices. With a clear need to execute multiple tasks with variable workloads, there is a need for a robust approach that can dynamically adapt at runtime and maintain the maximum quality of service (QoS) within the available resource constraints. A lightweight approach that monitors the runtime workload constraints and leverages accuracy-throughput trade-offs on a graphics processing unit (GPU) is presented. It includes optimisation techniques that identify the configurations for each task in terms of optimal accuracy, energy and memory, and that manage the transparent switching between configurations. Using a neural network architecture search that statically generates a range of implementations targeting a resource-precision trade-off, we explore the detection of the optimal parameters for the required QoS under specific memory and energy constraints. For an accuracy loss of 1%, we demonstrate that a 1.6x higher frame processing rate can be achieved on GPU, with further improvements possible at further relaxed accuracy. To further improve the switching between configurations, we enhance the proposed mechanism by employing central processing units (CPUs) for offloading some of the executed frames, which helps to improve the frame rate by a further 0.9%.
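The runtime selection described above can be pictured as choosing, from a set of pre-generated implementations, the most accurate one that still meets the current frame-rate, memory and energy limits. The sketch below is only an illustration of that idea, not the paper's mechanism: the `Config` fields, the `select_config` function, and every number in `CONFIGS` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    accuracy: float   # top-1 accuracy in percent (made-up value)
    fps: float        # frames processed per second (made-up value)
    memory_mb: float  # peak memory footprint (made-up value)
    energy_j: float   # energy per frame in joules (made-up value)


def select_config(configs, min_fps, max_memory_mb, max_energy_j):
    """Return the most accurate configuration that satisfies all constraints.

    Returns None when no configuration is feasible.
    """
    feasible = [
        c for c in configs
        if c.fps >= min_fps
        and c.memory_mb <= max_memory_mb
        and c.energy_j <= max_energy_j
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.accuracy)


# Hypothetical implementations of one task, ordered from most to least accurate.
CONFIGS = [
    Config("full", accuracy=76.0, fps=30.0, memory_mb=900.0, energy_j=1.2),
    Config("reduced", accuracy=75.0, fps=45.0, memory_mb=600.0, energy_j=0.8),
    Config("tiny", accuracy=70.0, fps=90.0, memory_mb=250.0, energy_j=0.4),
]

# A heavier workload raises the required frame rate, so a cheaper config is chosen.
chosen = select_config(CONFIGS, min_fps=40.0, max_memory_mb=1000.0, max_energy_j=2.0)
```

Calling `select_config` again whenever the monitored workload changes would give the kind of switching between configurations that the abstract refers to.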
Diabetic retinopathy (DR) is a complication of diabetes mellitus which, if not treated early, may result in complete loss of vision, even without any prior symptoms. DR is caused by a high level of glucose in the blood, which causes alterations in the microvasculature of the retina. However, early screening of diabetic patients through retinal fundus imaging, along with proper diagnosis and treatment, can control the prevalence of DR complications. Manual inspection of pathological changes in retinal fundus images is an extremely challenging and tedious task. Therefore, a computer-aided diagnosis (CAD) system is an efficient and effective method for early detection of DR and can greatly assist ophthalmologists. A CAD system encompasses DR detection and severity grading, which includes detection, classification, localization and segmentation of lesions from the fundus images. Significant contributions have been made in DR severity grading by conventional image processing approaches using hand-engineered features and traditional machine-learning (ML) techniques. In recent years, the significant development of deep learning (DL) methods, driven by advances in hardware computation power and efficient learning algorithms, has triumphed over traditional ML methods in DR detection and grading tasks. Many researchers have employed established as well as customized DL models on different DR image repositories and reported their findings. In this paper, we conduct a detailed review of the recent state-of-the-art contributions in the field of DL-based DR classification by explaining their methodologies and highlighting their advantages and limitations. A detailed comparative study based on certain statistical parameters has also been conducted to quantitatively evaluate the methods, models and preprocessing techniques. In addition, the challenges in designing an efficient, accurate and robust deep-learning model for DR classification are explored in detail to help t
Drill pipe joint’s thread quality directly affects the machining performance and the drill pipe’s service life. Machine vision can quickly detect thread parameters to determine the thread processing quality, but thi...
Robotic harvesting of fruits and vegetables is an advanced technology that leverages Robotics, Artificial Intelligence, and Machine Vision to harvest the fruits autonomously from plants or trees. This technology aims ...