Keyhole tungsten inert gas (keyhole TIG) welding is known for its high efficiency, and it calls for a real-time defect detection method that integrates deep learning and enhanced vision techniques. This study employs a multi-layer deep neural network trained on an extensive welding image dataset. Neural networks can capture complex nonlinear relationships through multi-layer transformations without manual feature selection. Conversely, the nonlinear modeling ability of support vector machines (SVMs) is limited by manually selected kernel functions and parameters, resulting in poor performance when distinguishing burn-through images from good-weld images. SVMs handle only lower-level features such as porosity and excel only in detecting simple edges and shapes. By contrast, neural networks excel at processing deep feature maps of molten pools and can encode deep defects that are often confused in keyhole TIG welding. Applied to a four-class classification task on weld pool images, the neural network distinguishes the weld states of good weld, burn-through, partial penetration, and undercut. Experimental results demonstrate high accuracy and real-time performance. A comprehensive dataset, prepared through meticulous preprocessing and augmentation, ensures reliable results. This method provides an effective solution for quality control and defect prevention in the keyhole TIG welding process.
Crop diseases significantly threaten global agricultural productivity and food security, leading to economic losses and increased pesticide use, which pollutes soil and water and disrupts ecological balance. Mustard and mung bean crops are particularly affected by various diseases and pests such as Alternaria blight, aphids, charcoal rot, bruchids, and mosaic. Timely and accurate identification of these diseases and pests is crucial for effective crop management. This research tackles disease classification in mustard and mung bean crops by employing transfer learning, a MobileNetV3-based CNN model, and a System-on-Chip (SoC) computing platform. The processing system and processing logic of the SoC enhance computing flexibility. The Xilinx Deep Learning Processor Unit (DPU) intellectual property (IP) accelerates disease classification by a factor of 24 compared with software counterparts. At the same time, our proposed design enhances the throughput by around 29% and reduces the power consumption by around 19%. MobileNetV3 achieves classification accuracies of 96.14% on the mung bean dataset and 93.25% on the mustard dataset, surpassing other state-of-the-art methods. A vital aspect of this research is the development of a user-friendly mobile application for image capture, communication with the SoC, and result display, making disease and pest detection more convenient and accessible. The SoC-based system is versatile and can be extended to classify various crop varieties beyond mung bean and mustard without hardware modifications.
The efficiency of intelligent sugarcane harvesters depends on how effectively the sugarcane is identified and located during harvesting. In practice, accurately extracting valid features of sugarcane amid dense and interwoven stalks is a challenging task. To address this issue, we propose a hybrid deep learning approach that efficiently extracts sugarcane stem contours and internal stem node features in a complex harvesting environment. First, this study combines the MobileNetV3 and U-Net networks to segment overall images that contain information about the external contours of the sugarcane stem. Then, the extracted overall profile images are optimized using a variety of image processing techniques to meet the requirements of harvesting. Lastly, an improved YOLOX model is used to identify the internal stem node features of sugarcane from the optimized overall images. The experimental results on a real sugarcane dataset show that the proposed external sugarcane stem segmentation model achieves a high mean intersection over union (MIoU) of 91.68% with an average segmentation time of just 0.025 seconds. Moreover, the proposed model for internal stem node recognition achieves an average precision (AP) of 96.19% with an average detection time of 0.026 seconds. Additionally, this study compares the proposed models with image segmentation models such as PSPNet and DeepLabv3+ and with object detection models such as YOLOv5 and YOLOv7. The experimental results show that the sugarcane feature extraction models proposed in this article all exhibit high accuracy and robustness.
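The MIoU figure quoted above is the per-class intersection over union averaged across classes. As a minimal sketch of that metric (not the authors' evaluation code), the following uses flattened integer label masks; the function name `mean_iou`, the two-class setup, and the toy masks are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over classes that occur in either mask."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened masks: 0 = background, 1 = stem (illustrative values only)
pred = [0, 1, 1, 0, 1, 0]
target = [0, 1, 1, 0, 0, 0]
miou = mean_iou(pred, target, num_classes=2)
print(f"MIoU = {miou:.4f}")
```

Here the background class has IoU 3/4 and the stem class 2/3, so the mean is about 0.708.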
ISBN (print): 9781510673878; 9781510673861
With the advent of deep learning, there has been an ever-growing list of applications to which Deep Convolutional Neural Networks (DCNNs) can be applied. The field of Multi-Task Learning (MTL) attempts to optimize many-task systems, improving performance through optimization algorithms and structural changes to these networks. However, we have found that current MTL optimization algorithms often impose burdensome computation overheads, require meticulously labeled datasets, and do not adapt to tasks with significantly different loss distributions. We propose a new MTL optimization algorithm: Batch Swapping with Multiple Optimizers (BSMO). We utilize single-task labeled data to train a multi-task hard parameter sharing (HPS) network by swapping tasks at the batch level. This dramatically increases the flexibility and scalability of training on an HPS network by allowing for per-task datasets and augmentation pipelines. We demonstrate the efficacy of BSMO versus current SOTA algorithms by benchmarking across contemporary benchmarks and networks.
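The abstract does not say how tasks are scheduled, so the following is only a sketch of one plausible reading of batch-level task swapping: a round-robin generator that draws each batch from a single task's own dataset. The function `swap_batches`, the task names, and the placeholder batches are all hypothetical and are not taken from BSMO.

```python
from itertools import cycle


def swap_batches(task_batches, steps):
    """Yield (task_name, batch) pairs, switching task on every batch.

    Round-robin order is an assumption; the abstract only states that
    tasks are swapped at the batch level.
    """
    names = list(task_batches)
    iters = {name: cycle(task_batches[name]) for name in names}
    for step in range(steps):
        name = names[step % len(names)]
        yield name, next(iters[name])


# Hypothetical per-task datasets, each with its own single-task labels
task_batches = {"seg": [["s0"], ["s1"]], "depth": [["d0"]]}
schedule = list(swap_batches(task_batches, steps=4))
for name, batch in schedule:
    print(name, batch)
```

In a training loop, the yielded task name would select that task's head, loss, and (under the "multiple optimizers" reading, also an assumption) its own optimizer, while the shared backbone receives updates from every batch.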
Object detection is a key technology for marine exploration. Detection performance is often unsatisfactory because of factors such as biodiversity and overlapping shadows in the underwater environment. Therefore, a new underwater object detection algorithm called RCF-YOLO is proposed. First, a coordinate enhancement (CE) attention module is designed. Depth-separable convolutions are used to extract the location information of the channel and combine it with spatial information to improve the model's ability to infer global features. Second, the neck is redesigned following the BiFPN concept, which enhances feature interaction capabilities and optimizes the inference structure. The convolutional operation in the neck path is improved to enhance cross-scale connections, effectively integrating shallow and deep features and achieving a good balance between efficiency and accuracy. Finally, the receptive field convolution (RFAConv) is introduced to solve the parameter sharing problem in complex convolution processing, making the model more flexible in adjusting the convolution kernel weights and more effective at capturing the information in the image. The proposed model was evaluated in several sets of experiments on the URPC, DUO, and ROUD datasets. With a decrease in both the number of parameters and the computational complexity, the accuracy reached 85.3%, 87.9%, and 84.9%, respectively. The experimental results show that the RCF-YOLO model performs well in the underwater detection task.
The distribution and geometric morphology of defects within RFC are important factors affecting its strength properties and rupture morphology. However, the excessive size of the aggregates commonly used for RFC makes in-depth indoor experimental studies difficult. Based on an improved U-Net and image processing technology, this research establishes an integrated model for the identification, classification, and extraction of defects inside RFC, quantitatively counts and analyzes the acquired defect distribution and geometric morphology characteristics, and establishes a defect characteristic distribution function that can be used for the numerical reconstruction of defects. To accelerate U-Net training with pretrained weights, VGG-16 with the fully connected layers removed is used in place of the encoder part of the U-Net. The integrated model can automatically identify, classify, and extract multiple types of defects at the same time, and the established distribution function of defect characteristics provides a data basis and new ideas for building three-dimensional numerical models of RFC that contain real defects.
Deformation monitoring of Gas-Insulated Transmission Lines (GILs) is critical for the early detection of structural issues and for ensuring safe power transmission. In this study, we introduce a rapid monocular measurement method that leverages deep learning for real-time monitoring. A YOLOv10 model is developed for automatically identifying regions of interest (ROIs) that may exhibit deformations. Within these ROIs, grayscale data is used to dynamically set thresholds for FAST corner detection, while the Shi-Tomasi algorithm filters redundant corners to extract unique feature points for precise tracking. Subsequent subpixel refinement further enhances measurement accuracy. To correct image tilt, ArUco markers are employed for geometric correction and to compute a scaling factor based on their known edge lengths, thereby reducing errors caused by non-perpendicular camera angles. Simulated experiments validate our approach, demonstrating that combining refined ArUco marker coordinates with manually annotated features significantly improves detection accuracy. Our method achieves a mean absolute error of no more than 1.337 mm and a processing speed of approximately 0.024 s per frame, meeting the precision and efficiency requirements for GIL deformation monitoring. This integrated approach offers a robust solution for long-term, real-time monitoring of GIL deformations, with promising potential for practical applications in power transmission systems.
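The scaling factor mentioned above can be illustrated with simple arithmetic: a marker's known physical edge length divided by its edge length in pixels gives a millimetre-per-pixel ratio. The sketch below assumes the marker corners have already been detected and tilt-corrected; `mm_per_pixel`, the corner coordinates, and the 50 mm edge are hypothetical values, not the paper's.

```python
import math


def mm_per_pixel(corners_px, edge_mm):
    """Scale factor from the mean pixel length of the marker's four edges."""
    edges = [
        math.dist(corners_px[i], corners_px[(i + 1) % 4]) for i in range(4)
    ]
    return edge_mm / (sum(edges) / 4)


# Hypothetical corners of a 50 mm marker, ordered around the square
corners = [(100.0, 100.0), (200.0, 100.0), (200.0, 200.0), (100.0, 200.0)]
scale = mm_per_pixel(corners, edge_mm=50.0)

# A tracked feature point that moved 3 px in the corrected image
displacement_mm = 3.0 * scale
print(f"{scale} mm/px, displacement {displacement_mm} mm")
```

Averaging the four edges is a design choice made here to damp corner localization noise; a single edge would also work in principle.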
Accurate fish segmentation in underwater videos is challenging due to low visibility, variable lighting, and dynamic backgrounds, making fully-supervised methods that require manual annotation impractical for many applications. This paper introduces a novel self-supervised learning approach for fish segmentation using deep learning. Our model, trained without manual annotation, learns robust and generalizable representations by aligning features across augmented views and enforcing spatial-temporal consistency. We demonstrate its effectiveness on three challenging underwater video datasets: DeepFish, Seagrass, and YouTube-VOS, surpassing existing self-supervised methods and achieving segmentation accuracy comparable to fully-supervised methods without the need for costly annotations. Trained on DeepFish, our model exhibits strong generalization, achieving high segmentation accuracy on the unseen Seagrass and YouTube-VOS datasets. Furthermore, our model is computationally efficient due to its parallel processing and efficient anchor sampling technique, making it suitable for real-time applications and potential deployment on edge devices. We present quantitative results using the Jaccard index and Dice coefficient, as well as qualitative comparisons, showcasing the accuracy, robustness, and efficiency of our approach for advancing underwater video analysis.
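For readers unfamiliar with the two reported metrics, here is a minimal sketch of the Jaccard index and Dice coefficient on binary masks. The function name, the flattened-list representation, and the toy masks are assumptions for illustration; treating two empty masks as a perfect match is also a convention chosen here, not one stated in the abstract.

```python
def jaccard_and_dice(pred, target):
    """Return (Jaccard index, Dice coefficient) for two binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    p_sum = sum(1 for p in pred if p)
    t_sum = sum(1 for t in target if t)
    union = p_sum + t_sum - inter
    if union == 0:  # both masks empty: treated as a perfect match
        return 1.0, 1.0
    return inter / union, 2 * inter / (p_sum + t_sum)


# Toy flattened masks: 1 = fish pixel, 0 = background (illustrative only)
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
jaccard, dice = jaccard_and_dice(pred, target)
print(f"Jaccard = {jaccard:.3f}, Dice = {dice:.3f}")
```

The two scores are monotonically related by Dice = 2J / (1 + J), so they rank methods identically on a single image even though their values differ.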
Tomatoes are the most valuable vegetable worldwide, yet they suffer from leaf diseases, which affect long-term tomato protection. To protect tomato plants from leaf diseases, it is essential to apply appropriate control measures through early and accurate categorization of the diseases. Recently, automated deep learning-based methods, including convolutional neural networks (CNNs), have provided accurate and timely classification of tomato leaf diseases. However, CNNs primarily capture local context features within a limited receptive field, which makes them effective mainly for images with uniform backgrounds. To handle images with complex backgrounds, both local and global context features are needed for accurate classification, which requires hybridizing the CNN architecture with other deep learning modules. This work proposes the TrioConvTomatoNet-BiLSTM framework, a hybridization of a CNN architecture named TrioConvTomatoNet with a sequence module, bidirectional long short-term memory (BiLSTM). The proposed framework integrates both local and global context features for the precise classification of images with complex backgrounds. As a result, it achieves accuracies of 99.65%, 98.83%, and 99.20% in classifying tomato leaf disease images with non-uniform, synthetic, and real-time complex backgrounds, respectively, in comparison with the TrioConvTomatoNet and TrioConvTomatoNet-LSTM frameworks. Moreover, it requires fewer training parameters and attains higher accuracy than other existing hybrid approaches, which demonstrates its superiority, robustness, and practical applicability. These features highlight the potential of the proposed framework in the emerging field of smart agriculture by enabling smartphone-based classification of tomato leaf diseases in real-life scenarios.
ISBN (print): 9781510673878; 9781510673861
Modern wafer inspection systems in Integrated Circuit (IC) manufacturing utilize deep neural networks. Training such networks requires a very large number of wafer maps, that is, patterns of defective or faulty dies on a wafer. The number of defective wafer maps available on a production line is often limited. To obtain enough defective wafer maps for training deep neural networks, generative models can be used to generate realistic synthesized defective wafer maps. This paper compares three generative models that are commonly used for generating synthesized images: Generative Adversarial Network (GAN), Variational Auto-Encoder (VAE), and CycleGAN, which is a variant of GAN. The comparison is carried out on the public domain wafer map dataset WM-811K. The quality of the generated wafer map images is evaluated by computing five metrics: peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), inception score (IS), Fréchet inception distance (FID), and kernel inception distance (KID). Furthermore, the computational efficiency of these generative networks is examined in terms of their deployment in a real-time inspection system.
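Of the five quality metrics listed, PSNR is the simplest to state: it is 10·log10(MAX² / MSE), where MSE is the mean squared pixel error against a reference image. The sketch below assumes 8-bit images flattened to lists; the function name and the pixel values are illustrative and unrelated to WM-811K.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length images."""
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:  # identical images: PSNR is unbounded
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 8-bit pixel values (illustrative only)
reference = [0, 255, 255, 0]
generated = [0, 255, 250, 5]
value = psnr(reference, generated)
print(f"PSNR = {value:.2f} dB")
```

Higher values indicate a generated image closer to the reference; the other listed metrics (SSIM, IS, FID, KID) need structural or learned features and are not covered by this sketch.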