ViT-SENet-Tom: machine learning-based novel hybrid squeeze–excitation network and vision transformer framework for tomato fruits classification

Authors: Swapno, S M Masfequier Rahman; Nobel, S. M. Nuruzzaman; Islam, Md Babul; Bhattacharya, Pronaya; Mattar, Ebrahim A.

Affiliations: Department of CSE, Bangladesh University of Business and Technology, Dhaka 1216, Bangladesh; Department of Computer Modeling Electronic and System Engineering, UNICAL, Rende, Italy; Department of Computer Science and Engineering, Amity School of Engineering and Technology and Research and Innovation Cell, Amity University, West Bengal, Kolkata, India; Robotics - Cybernetic, College of Engineering, University of Bahrain, Manama, Bahrain

Publication: Neural Computing and Applications (Neural Comput. Appl.)

Year, volume, issue: 2025, Vol. 37, No. 9

Pages: 6583-6600

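Subject classification: 08 [Engineering]; 0901 [Agronomy - Crop Science]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be conferred)]

Funding: No funding support is available

Topic: Fruits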

Abstract: Tomatoes are an essential fruit in many countries because of their high demand, so maintaining their freshness is important. One of the primary challenges in today's culinary landscape is accurately identifying healthy tomatoes while effectively eliminating damaged or rejected ones. Existing approaches employ various strategies for categorizing tomato fruit, but they often suffer from inaccuracies, slow detection, and suboptimal performance. Motivated by this gap, we propose a novel machine learning (ML) framework, ViT-SENet-Tom, a hybrid vision transformer (ViT) model with a squeeze-and-excitation network (SENet) block for fast, accurate, and efficient tomato fruit classification. The framework works on three tomato classes: ripe, unripe, and reject. In developing the proposed model, we used advanced and newly designed layers and functions. This integration created a more complex and sophisticated neural network, significantly enhancing efficiency and contributing to the model's novelty. Our chosen dataset was initially small, so we applied augmentation techniques to increase its size, which made our system more reliable, efficient, and effective. The hybrid ViT-SENet framework employs encoders and self-attention networks with squeeze-and-excitation channel functions to allow precise, robust, fast, and efficient tomato classification. In simulation, the framework achieves a training accuracy of 99.87% and a validation accuracy of 93.87%, indicating precise classification of tomatoes. In addition, this work tests accuracy using fivefold cross-validation; the highest accuracy, observed at fold 5, is 99.90%. These testing results demonstrate the efficacy of the proposed framework in real-deployment scenarios. The implementation has the potential to provide enhanced and more sustainable food security and safety in the future. © The Author(s) 2025.
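
The abstract names squeeze-and-excitation channel functions but gives no layer details. As a rough illustration only, the sketch below shows the general squeeze-and-excitation pattern (global average pooling per channel, a two-layer gating network, then channel-wise rescaling) in plain Python. It is not the authors' implementation: the function name `se_block`, the weight matrices `w1` and `w2`, the bottleneck size, and the `[channel][row][col]` list layout are all assumptions made for this example.

```python
import math


def se_block(feature_map, w1, w2):
    """Illustrative squeeze-and-excitation over a [channel][row][col] nested list.

    w1 has shape (hidden, channels) and w2 has shape (channels, hidden).
    Both are hypothetical weights; the abstract does not specify them.
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck dense layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w1
    ]
    # Excitation, step 2: expand to one gate per channel; sigmoid keeps it in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

In a hybrid model of the kind the abstract describes, a block like this would typically reweight feature channels before or alongside the transformer encoder; where exactly the authors place it is not stated in this record.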
