This study investigates the application of Learnable Memory Vision Transformers (LMViT) for detecting metal surface flaws, comparing their performance with traditional CNNs, specifically ResNet18 and ResNet50, as well as other transformer-based models including Token to Token ViT, ViT without memory, and Parallel *** a widely-used steel surface defect dataset, the research applies data augmentation and t-distributed stochastic neighbor embedding (t-SNE) to enhance feature extraction and *** techniques mitigated overfitting, stabilized training, and improved generalization *** LMViT model achieved a test accuracy of 97.22%, significantly outperforming ResNet18 (88.89%) and ResNet50 (88.90%), as well as the Token to Token ViT (88.46%), ViT without memory (87.18%), and Parallel ViT (91.03%). Furthermore, LMViT exhibited superior training and validation performance, attaining a validation accuracy of 98.2% compared to 91.0% for ResNet18, 96.0% for ResNet50, and 89.12%, 87.51%, and 91.21% for Token to Token ViT, ViT without memory, and Parallel ViT, *** findings highlight the LMViT’s ability to capture long-range dependencies in images, an area where CNNs struggle due to their reliance on local receptive fields and hierarchical feature *** additional transformer-based models also demonstrate improved performance in capturing complex features over CNNs, with LMViT excelling particularly at detecting subtle and complex defects, which is critical for maintaining product quality and operational efficiency in industrial *** instance, the LMViT model successfully identified fine scratches and minor surface irregularities that CNNs often *** study not only demonstrates LMViT’s potential for real-world defect detection but also underscores the promise of other transformer-based architectures like Token to Token ViT, ViT without memory, and Parallel ViT in industrial scenarios where complex spatial relationships are *** research m
Artificial intelligence (AI) breakthroughs have created new opportunities in the field of medical diagnostics, especially for the early identification of respiratory conditions like Chronic Obstructive Pulmonary Disea...
Graph neural networks (GNNs) have shown outstanding performance in graph node classification. However, as a deep learning model, GNNs can be influenced by adversarial attacks, such as graph injection attacks or graph m...
The Sign Language Recognition System has been designed to capture video input, process it to detect hand gestures, and translate these gestures into readable text. The project consists of several key components and st...
The maintenance and enhancement of dynamic soil characteristics are the primary focus of soil management in agriculture to increase crop productivity. Higher productivity may result from efficient soil control of reso...
Story video-text alignment, a core task in computational story understanding, aims to align video clips with corresponding sentences in their descriptions. However, progress on the task has been held back by the scarc...
Cloud computing solutions are becoming more and more popular as a way for organizations to improve productivity, save costs, and simplify procedures. The advantage of cloud services is that they enable users to store ...
Deep learning has gained tremendous success in various fields, but training deep neural networks (DNNs) is very compute-intensive, which has led to numerous deep learning frameworks that aim to offer better usability and higher performance to deep learning practitioners. TensorFlow and PyTorch are the two most popular frameworks. TensorFlow is more promising within the industry context, while PyTorch is more appealing in academia. However, the two frameworks differ considerably owing to their opposite design philosophies: static vs. dynamic computation graphs. TensorFlow is regarded as being more performance-friendly, as it has more opportunities to perform optimizations with a full view of the computation graph. However, there are also claims that PyTorch is sometimes faster than TensorFlow, which confuses end-users choosing between them. In this paper, we carry out an analytical and experimental analysis to unravel the mystery of how TensorFlow and PyTorch compare in single-GPU training speed. To ensure that our investigation is as comprehensive as possible, we carefully select seven popular neural networks, which cover computer vision, speech recognition, and natural language processing (NLP). The contributions of this work are two-fold. First, we conduct detailed benchmarking experiments on TensorFlow and PyTorch and analyze the reasons for their performance difference. This work provides guidance for end-users choosing between these two frameworks. Second, we identify some key factors that affect performance, which can direct end-users to write their models more efficiently.
Data valuation quantifies the contribution of each data point to the performance of a machine learning model. Existing works typically define the value of data by its improvement of the validation performance of the trained model. However, this approach can be impractical to apply in collaborative machine learning and data marketplaces, since it is difficult for the parties/buyers to agree on a common validation dataset or determine the exact validation distribution a priori. To address this, we propose a distributionally robust data valuation approach to perform data valuation without known/fixed validation distributions. Our approach defines the value of data by its improvement of the distributionally robust generalization error (DRGE), thus providing a worst-case performance guarantee without a known/fixed validation distribution. However, since computing DRGE directly is infeasible, we propose using model deviation as a proxy for the marginal improvement of DRGE (for kernel regression and neural networks) to compute data values. Furthermore, we identify a notion of uniqueness where low uniqueness characterizes low-value data. We empirically demonstrate that our approach outperforms existing data valuation approaches in data selection and data removal tasks on real-world datasets (e.g., housing price prediction, diabetes hospitalization prediction). Copyright 2024 by the author(s)
India being an agricultural country, food quality tracking is a major challenge faced by common farmers across the country. This research presents an innovative integration of Convolutional Neural Networks (CNNs) to a...