Steel Surface Defect Detection Using Learnable Memory Vision Transformer

Authors: Syed Tasnimul Karim Ayon, Farhan Md. Siraj, Jia Uddin

Affiliations: Department of Computer Science and Engineering, BRAC University, Dhaka 1212, Bangladesh; Department of AI and Big Data, Endicott College, Woosong University, Daejeon 34606, Republic of Korea

Publication: Computers, Materials & Continua

Year, volume, issue: 2025, Vol. 82, No. 1

Pages: 499-520

Subject classification: 08 [Engineering]; 0802 [Engineering: Mechanical Engineering]; 080203 [Engineering: Mechanical Design and Theory]

Funding: Funded by Woosong University Academic Research 2024

Keywords: Learnable Memory Vision Transformer (LMViT); Convolutional Neural Networks (CNN); metal surface defect detection; deep learning; computer vision; image classification; learnable memory; gradient clipping; label smoothing; t-SNE visualization

Abstract: This study investigates the application of Learnable Memory Vision Transformers (LMViT) for detecting metal surface flaws, comparing their performance with traditional CNNs, specifically ResNet18 and ResNet50, as well as other transformer-based models including Token to Token ViT, ViT without memory, and Parallel *** a widely used steel surface defect dataset, the research applies data augmentation and t-distributed stochastic neighbor embedding (t-SNE) to enhance feature extraction and *** techniques mitigated overfitting, stabilized training, and improved generalization *** LMViT model achieved a test accuracy of 97.22%, significantly outperforming ResNet18 (88.89%) and ResNet50 (88.90%), as well as the Token to Token ViT (88.46%), ViT without memory (87.18%), and Parallel ViT (91.03%). Furthermore, LMViT exhibited superior training and validation performance, attaining a validation accuracy of 98.2% compared to 91.0% for ResNet18, 96.0% for ResNet50, and 89.12%, 87.51%, and 91.21% for Token to Token ViT, ViT without memory, and Parallel ViT, *** findings highlight the LMViT’s ability to capture long-range dependencies in images, an area where CNNs struggle due to their reliance on local receptive fields and hierarchical feature *** additional transformer-based models also demonstrate improved performance in capturing complex features over CNNs, with LMViT excelling particularly at detecting subtle and complex defects, which is critical for maintaining product quality and operational efficiency in industrial *** instance, the LMViT model successfully identified fine scratches and minor surface irregularities that CNNs often *** study not only demonstrates LMViT’s potential for real-world defect detection but also underscores the promise of other transformer-based architectures like Token to Token ViT, ViT without memory, and Parallel ViT in industrial scenarios where complex spatial relationships are *** research m
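
The keywords for this record list label smoothing and gradient clipping, and the abstract credits the training techniques with reducing overfitting and stabilizing training. As a rough illustration only, and not the authors' implementation, the two ideas can be sketched in plain Python. The function names, the smoothing factor of 0.1, the class count, and the clipping threshold are all assumptions chosen for the example.

```python
import math


def smooth_labels(target, num_classes, eps=0.1):
    """Smoothed target distribution: eps is spread uniformly over all
    classes and the remaining (1 - eps) is added to the true class.
    eps=0.1 is an illustrative value, not one taken from the paper."""
    base = eps / num_classes
    dist = [base] * num_classes
    dist[target] += 1.0 - eps
    return dist


def cross_entropy(logits, target_dist):
    """Cross-entropy between softmax(logits) and a target distribution,
    computed with the log-sum-exp trick for numerical stability."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(p * (x - log_z) for p, x in zip(target_dist, logits))


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so its L2 norm is at most max_norm.
    Gradients already within the limit are returned unchanged."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


# Hypothetical example: 4 classes, true class index 2.
targets = smooth_labels(2, 4, eps=0.1)
loss = cross_entropy([0.5, -1.0, 2.0, 0.1], targets)
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

In a framework such as PyTorch, these ideas correspond to the `label_smoothing` argument of `torch.nn.CrossEntropyLoss` and to `torch.nn.utils.clip_grad_norm_`. This record does not state which implementation the paper used.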
