The surface quality of aluminium alloy castings is crucial to quality control. To address the challenges of limited samples and extensive computation in deep learning-based surface defect detection for aluminium alloy castings, this paper proposes a surface defect detection method based on data enhancement and the Casting real-time DEtection TRansformer. First, to tackle the small sample sizes and uneven distribution of surface defect data sets for aluminium alloy castings, an ECA-MetaAconC deep convolutional generative adversarial network is proposed to generate defects for under-represented classes, and the image augmentation (IMGAUG) library is employed for sample enhancement. Second, building upon the real-time DEtection TRansformer (RT-DETR), a lightweight partial-rep convolution is designed to decrease the network's parameter count. Simultaneously, the deformable attention module and the DRBC3 module are introduced to enhance the neck network, improving the model's capability to capture information and its detection performance. Compared to RT-DETR, this method reduces the number of model parameters by 38.7%, increases mAP by 1.5%, and achieves a frame rate 1.58 times higher than the original model. The experimental results demonstrate that this method can effectively and accurately detect surface defects in aluminium alloy castings, satisfying industrial requirements.
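As an illustration of the sample-enhancement step, the sketch below applies an imgaug augmentation pipeline to a batch of defect images; the specific operators (flip, rotation, blur, noise) and their ranges are assumptions for demonstration, not the configuration used in the paper.

import numpy as np
import imgaug.augmenters as iaa

# Hypothetical augmentation pipeline for casting-defect images;
# the chosen operators are illustrative, not the paper's exact settings.
seq = iaa.Sequential([
    iaa.Fliplr(0.5),                                   # horizontal flip with probability 0.5
    iaa.Affine(rotate=(-15, 15)),                      # small random rotations
    iaa.GaussianBlur(sigma=(0.0, 1.0)),                # mild blur
    iaa.AdditiveGaussianNoise(scale=(0, 0.03 * 255)),  # sensor-like noise
])

# images: batch of defect images as uint8 arrays of shape (N, H, W, C)
images = np.random.randint(0, 255, size=(8, 256, 256, 3), dtype=np.uint8)
augmented = seq(images=images)
print(augmented.shape)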
Robotic systems employed in tasks such as navigation, target tracking, security, and surveillance often use camera gimbal systems to enhance their monitoring and security capabilities. These camera gimbal systems undergo fast to-and-fro rotational motion to surveil the extended field of view (FOV). A high steering rate (rotation angle per second) of the gimbal is essential to revisit a given scene as fast as possible, which results in significant motion blur in the captured video frames. Real-time motion deblurring is essential in surveillance robots since the subsequent image-processing tasks demand immediate availability of blur-free images. Existing deep learning (DL) based motion deblurring methods either lack real-time performance due to network complexity or suffer from poor deblurring quality for large motion blurs. In this work, we propose a Gyro-guided Network for real-time motion deblurring (GRNet) which makes effective use of existing prior information to improve deblurring without increasing the complexity of the network. The steering rate of the gimbal is taken as a prior for data generation. A contrastive learning scheme is introduced for the network to learn the amount of blur in an image by utilizing the knowledge of blur content in images during training. A sharp reference image is additionally given to GRNet as input to guide the deblurring process, and the most relevant features from the reference image are selected using a cross-attention module. Our method works in real time at 30 fps. We also introduce the first Gimbal Yaw motion real-wOrld (GYRO) dataset of infrared (IR) and color images with significant motion blur, along with the inertial measurements of camera rotation, captured by a gimbal-based imaging setup in which the gimbal undergoes rotational yaw motion. Both qualitative and quantitative evaluations on the proposed GYRO dataset demonstrate the practical utility of our method.
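To illustrate how a sharp reference image can guide deblurring, here is a minimal PyTorch sketch of a cross-attention block in which features of the blurred frame query features of the reference; the module name, head count, and layout are assumptions, not the authors' GRNet implementation.

import torch
import torch.nn as nn

class ReferenceCrossAttention(nn.Module):
    """Hypothetical cross-attention block: blurred-frame features act as
    queries, sharp-reference features provide keys/values. A sketch of the
    idea in the abstract, not the authors' GRNet code."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, blur_feat, ref_feat):
        # blur_feat, ref_feat: (B, C, H, W) feature maps
        b, c, h, w = blur_feat.shape
        q = blur_feat.flatten(2).transpose(1, 2)   # (B, HW, C) queries
        kv = ref_feat.flatten(2).transpose(1, 2)   # (B, HW, C) keys/values
        out, _ = self.attn(q, kv, kv)              # attend to reference features
        out = self.norm(out + q)                   # residual connection + norm
        return out.transpose(1, 2).reshape(b, c, h, w)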
ISBN (digital): 9798350361643
ISBN (print): 9798350361650
To meet the planning and navigation needs of lung disease surgery and to relieve the time-consuming, labor-intensive manual labeling required for lung CT arterial segmentation, a 3D U-Net neural network incorporating the CBAM attention mechanism is proposed for fast extraction of lung arteries and veins. Building on the 2D U-Net architecture, the processed images are upgraded from two dimensions to three (3D U-Net), a CBAM attention module is added at the end of each convolution layer before pooling or up-convolution, and parameters are updated using the focal loss function combined with stochastic gradient descent. Three-fold cross-validation was performed on 95 lung CT scans, together with ablation experiments. Experimental validation showed that the model with the CBAM attention module and focal loss achieved significant improvement in segmentation metrics: for arteries and veins, respectively, accuracy reached 99.8% and 99.8%, sensitivity 84.9% and 87.2%, Dice coefficient 82.5% and 84.8%, and precision 81.1% and 83.5%.
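As an illustration of the loss used for parameter updates, a minimal binary focal loss in PyTorch is sketched below; the alpha and gamma values are the common defaults rather than the settings reported in the paper.

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss sketch (Lin et al.); alpha/gamma are the usual
    defaults, not necessarily the authors' choices.
    logits, targets: tensors of the same shape (e.g. voxel-wise masks)."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)          # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()   # down-weights easy voxels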
With the growing popularity of fitness, the demand for real-time action recognition and feedback is increasing. Current research faces challenges in handling complex actions, real-time processing, and system integration. To address these issues, we propose a novel fitness action recognition model that integrates ResNet, Transformer, and transfer learning techniques. Specifically, ResNet is used for image feature extraction, the Transformer handles time-series data processing, and transfer learning accelerates the model's adaptation to new data. We evaluated our model on the NTU RGB+D action recognition dataset, achieving 48.5 ms latency, 29.1 fps throughput, and 93.7% accuracy, significantly outperforming other models. Our model achieved an accuracy improvement of 5% over existing methods, demonstrating significant potential for real-time fitness monitoring. By incorporating IoT technology, our system enables real-time data processing and action recognition, making it well suited for smart fitness monitoring. Although the model has high complexity and memory usage, its efficiency and accuracy demonstrate its potential for widespread adoption. Future work will focus on optimizing the model structure and training methods to enhance applicability in resource-constrained environments, ensuring broader usability and efficiency in various real-world applications.
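A minimal PyTorch sketch of the ResNet-plus-Transformer idea follows: a pretrained ResNet embeds each frame and a Transformer encoder models the temporal sequence. The backbone choice (ResNet-18), layer sizes, and pooling are assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn
from torchvision.models import resnet18

class ActionRecognizer(nn.Module):
    """Sketch: per-frame ResNet features + Transformer temporal encoder."""
    def __init__(self, num_classes, d_model=512, nhead=8, num_layers=2):
        super().__init__()
        backbone = resnet18(weights="IMAGENET1K_V1")                # transfer learning
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])   # drop the fc head
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, clips):                          # clips: (B, T, C, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).flatten(1)   # (B*T, 512)
        feats = feats.view(b, t, -1)
        feats = self.temporal(feats)                   # temporal modelling
        return self.head(feats.mean(dim=1))            # clip-level prediction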
ISBN (print): 9781510673854; 9781510673847
In the era of rapidly expanding image data, the demand for improved image compression algorithms has grown significantly, particularly with the integration of deep learning approaches into traditional image processing tasks. However, many of the existing solutions in this domain are burdened by computational complexity, rendering them unsuitable for real-time deployment on standard devices, as they often necessitate complex systems and substantial energy consumption. This work addresses the growing paradigm of edge computing for real-time applications by introducing a novel on-edge device solution. This approach aims to strike a balance between efficiency and accuracy, adhering to the practical constraints of real-world deployment. By demonstrating the proposed solution's performance on readily available devices, we provide tangible evidence of its applicability and viability in real-world scenarios. This advance contributes to the ongoing dialogue about the need for accessible and efficient image compression algorithms that can be deployed in real-time applications on edge devices, bridging the gap between the demanding computational requirements of deep learning and the practical limitations of everyday hardware. As data continues to surge, solutions like this become ever more critical in ensuring effective image compression, in line with the shift toward on-edge computing within AI. This research paves the way for improved image processing in real-time applications while conserving computational resources and energy.
The integration of artificial intelligence (AI) and deep learning heralds a transformative era in pattern recognition and computer vision, notably in image style transfer. We introduce the hierarchical dynamic multi-attention cycle generative adversarial network (HDMA-CGAN), an innovative deep learning architecture poised to redefine image style transfer capabilities. HDMA-CGAN employs a novel multi-attention mechanism and color optimization strategies, enabling precise style replication with improved fidelity and vibrancy. Our model surpasses existing benchmarks in image quality, validated by leading metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and Fréchet inception distance (FID). Although HDMA-CGAN advances the state of the art, it requires substantial computational resources and faces challenges with very high-resolution images. Future work could explore optimizing the model's efficiency for real-time applications and extending its application to video content. This work enhances the tools available for visual content creation and digital media enhancement, leveraging advanced pattern recognition and AI techniques to significantly impact computer vision and image processing.
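For reference, the snippet below computes two of the cited image-quality metrics (PSNR and SSIM) with scikit-image on placeholder arrays; FID additionally requires an Inception feature extractor and is omitted. The arrays and data range here are purely illustrative.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder uint8 RGB images standing in for a reference and a stylized output.
reference = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
stylized = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)

psnr = peak_signal_noise_ratio(reference, stylized, data_range=255)
ssim = structural_similarity(reference, stylized, channel_axis=-1, data_range=255)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.3f}")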
In recent years, the Internet of Things (IoT) has gradually developed applications such as collecting sensory data and building intelligent services, which has led to an explosion in mobile data traffic. Meanwhile, with the rapid development of artificial intelligence, semantic communication has attracted great attention as a new communication paradigm. For IoT devices, however, processing image information efficiently in real time is an essential task for the rapid transmission of semantic information. With the increase of model parameters in deep learning methods, the model inference time on sensor devices continues to grow. In contrast, the Pulse Coupled Neural Network (PCNN) has fewer parameters, making it more suitable for real-time scene tasks such as image segmentation, which lays the foundation for real-time, effective, and accurate image segmentation. However, the parameters of the PCNN are determined by trial and error, which limits its applications. To overcome this limitation, an Improved Pulse Coupled Neural Network (IPCNN) model is proposed in this paper. The IPCNN constructs a connection between the static properties of the input image and the dynamic properties of the neurons, and all its parameters are set adaptively, which avoids the inconvenience of manual setting in traditional methods and improves the adaptability of the parameters to different types of images. The segmentation results demonstrate the validity and efficiency of the proposed self-adaptive parameter setting method of the IPCNN on gray images and natural images from the Matlab and Berkeley Segmentation datasets. The IPCNN method achieves better segmentation results without training, providing a new solution for the real-time transmission of image semantic information.
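To make the PCNN dynamics concrete, the following NumPy sketch runs a simplified pulse-coupled iteration with fixed, hand-chosen parameters; the IPCNN described above instead derives such parameters adaptively from image statistics, and the exact update equations vary between PCNN variants.

import numpy as np
from scipy.ndimage import convolve

def pcnn_segment(image, iterations=10, beta=0.3, alpha_f=0.1, alpha_e=0.3, v_e=20.0):
    """Minimal simplified PCNN iteration for illustration only."""
    s = image.astype(np.float64) / max(float(image.max()), 1.0)   # normalised stimulus
    kernel = np.array([[0.5, 1.0, 0.5],
                       [1.0, 0.0, 1.0],
                       [0.5, 1.0, 0.5]])          # linking weights to neighbours
    f = np.zeros_like(s)                          # feeding channel
    y = np.zeros_like(s)                          # neuron firing map
    e = np.ones_like(s)                           # dynamic threshold
    fired = np.zeros_like(s, dtype=bool)
    for _ in range(iterations):
        link = convolve(y, kernel, mode="constant")
        f = np.exp(-alpha_f) * f + s              # feeding decay + stimulus
        u = f * (1.0 + beta * link)               # internal activity
        y = (u > e).astype(np.float64)            # pulse output
        e = np.exp(-alpha_e) * e + v_e * y        # threshold decay and reset
        fired |= y.astype(bool)
    return fired                                  # binary segmentation mask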
Recently, deep learning methodologies have achieved significant advancements in mineral automatic sorting and anomaly detection. However, the limited features of minerals transported in the form of small particles pose significant challenges to accurate detection. To address this challenge, we propose an enhanced mineral particle detection algorithm based on the YOLOv8s model. Initially, a C2f-SRU block is introduced to enable the feature extraction network to more effectively process spatially redundant information. Additionally, we design the GFF module with the aim of enhancing information propagation between non-adjacent scale features, thereby enabling deep networks to more fully leverage spatial positional information from shallower networks. Finally, we adopt the Wise-IoU loss function to optimize the detection performance of the model and re-design the positions of the prediction heads to achieve precise detection of small-scale targets. The experimental results substantiate the effectiveness of the algorithm, with YOLO-Global achieving an mAP@.5 of 95.8%. In comparison to the original YOLOv8s, the improved model exhibits a 2.5% increase in mAP and achieves a model inference speed of 81 fps, meeting the requirements for real-time processing and accuracy.
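For context, a baseline YOLOv8s training sketch using the public ultralytics API is shown below; the paper's C2f-SRU block, GFF module, Wise-IoU loss, and relocated prediction heads are custom modifications not included in this stock configuration, and the dataset YAML name is hypothetical.

from ultralytics import YOLO

# Stock YOLOv8s training; the paper's custom modules would require a
# modified model YAML and a patched loss, which are not reproduced here.
model = YOLO("yolov8s.yaml")          # build the baseline YOLOv8s architecture
model.train(
    data="mineral_particles.yaml",    # hypothetical dataset config
    epochs=100,
    imgsz=640,
    batch=16,
)
metrics = model.val()                 # reports mAP@.5 and mAP@.5:.95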
Near-field multiple-input multiple-output (MIMO) radar imaging systems have recently gained significant attention. These systems generally reconstruct the three-dimensional (3D) complex-valued reflectivity distribution of the scene from sparse measurements. Consequently, imaging quality relies heavily on the image reconstruction approach. Existing analytical reconstruction approaches suffer from either high computational cost or low image quality. In this paper, we develop novel non-iterative deep learning-based reconstruction methods for real-time near-field MIMO imaging. The goal is to achieve high image quality with low computational cost in compressive settings. The developed approaches have two stages. In the first approach, a physics-based initial stage performs an adjoint operation to back-project the measurements to the image space, and a deep neural network (DNN)-based second stage converts the 3D back-projected measurements into a magnitude-only reflectivity image. Since scene reflectivities often have random phase, the DNN directly processes the magnitude of the adjoint result. As the DNN, a 3D U-Net is used to jointly exploit range and cross-range correlations. To comparatively evaluate the significance of exploiting physics in a learning-based approach, two additional approaches that replace the physics-based first stage with fully connected layers are also developed as purely learning-based methods. The performance is further analyzed by changing the DNN architecture of the second stage to include complex-valued processing (instead of magnitude-only processing), 2D convolution kernels (instead of 3D), and a ResNet architecture (instead of U-Net). Moreover, we develop a synthesizer to generate a large-scale dataset for training the neural networks with 3D extended targets. We illustrate the performance through experimental data and extensive simulations. The results show the effectiveness of the developed physics-based learned reconstruction approach compared to commonly used approaches.
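A schematic PyTorch sketch of the two-stage idea follows: a fixed adjoint operator back-projects the complex measurements, and a learned network refines the magnitude image. The dense forward-operator placeholder and generic refiner module are assumptions standing in for the actual MIMO imaging operator and the 3D U-Net of the paper.

import torch
import torch.nn as nn

class PhysicsThenDNN(nn.Module):
    """Two-stage sketch: adjoint back-projection followed by a learned
    magnitude-to-reflectivity mapping."""
    def __init__(self, forward_op: torch.Tensor, refiner: nn.Module):
        super().__init__()
        # forward_op: (num_meas, num_voxels) complex matrix modelling y = A x
        self.register_buffer("A", forward_op)
        self.refiner = refiner                      # placeholder for the 3D U-Net

    def forward(self, y):                           # y: (B, num_meas), complex
        x_adj = y @ self.A.conj()                   # adjoint A^H y -> image space
        mag = x_adj.abs()                           # random-phase scenes: keep magnitude
        return self.refiner(mag)                    # learned second stage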
Disease detection in agricultural crops plays a pivotal role in ensuring food security and sustainable farming practices. Deep learning models, known for their capabilities in image analysis, often demand extensive image d...