Robotic systems employed in tasks such as navigation, target tracking, security, and surveillance often use camera gimbal systems to enhance their monitoring and security capabilities. These gimbal systems undergo fast to-and-fro rotational motion to surveil an extended field of view (FOV). A high steering rate (rotation angle per second) of the gimbal is essential to revisit a given scene as fast as possible, which results in significant motion blur in the captured video frames. Real-time motion deblurring is essential in surveillance robots, since subsequent image-processing tasks demand immediate availability of blur-free images. Existing deep-learning (DL) based motion deblurring methods either lack real-time performance due to network complexity or suffer from poor deblurring quality for large motion blurs. In this work, we propose a Gyro-guided Network for real-time motion deblurring (GRNet), which makes effective use of existing prior information to improve deblurring without increasing the complexity of the network. The steering rate of the gimbal is taken as a prior for data generation. A contrastive learning scheme is introduced for the network to learn the amount of blur in an image by utilizing knowledge of the blur content of images during training. A sharp reference image is additionally given to GRNet as input to guide the deblurring process, and the most relevant features from the reference image are selected using a cross-attention module. Our method runs in real time at 30 fps. As a first, we propose a Gimbal Yaw motion real-wOrld (GYRO) dataset of infrared (IR) as well as color images with significant motion blur, along with the inertial measurements of camera rotation, captured by a gimbal-based imaging setup in which the gimbal undergoes rotational yaw motion. Both qualitative and quantitative evaluations on our proposed GYRO dataset demonstrate the practical utility of our method.
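The reference-guided selection step described in this abstract can be illustrated with a minimal scaled dot-product cross-attention over feature vectors: queries come from the blurred frame, keys and values from the sharp reference, so each blurred-image feature is replaced by a weighted combination of the most relevant reference features. This is a generic sketch of the mechanism, not GRNet's actual module; all names and shapes here are illustrative assumptions.

```python
import numpy as np

def cross_attention(blurred_feats, ref_feats):
    """Hypothetical sketch of cross-attention feature selection.

    blurred_feats: (N, d) query features from the blurred frame
    ref_feats:     (M, d) key/value features from the sharp reference
    Returns (N, d) features attended over the reference.
    """
    d = blurred_feats.shape[1]
    scores = blurred_feats @ ref_feats.T / np.sqrt(d)   # (N, M) similarities
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)       # softmax over reference features
    return weights @ ref_feats                          # convex combination of references
```

In a real network the queries, keys, and values would each pass through learned linear projections first; the softmax-weighted sum is the part that "selects the most relevant features".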
ISBN: (Print) 9781728157450
In this rather informal paper and talk, I will discuss my own experiences, feelings, and evolution as an image-processing and digital-video educator trying to navigate the deep-learning revolution. I will discuss my own ups and downs of trying to deal with extremely rapid technological change, and how I reacted to, and dealt with, the consequent dramatic shifts in the relevance of the topics I have taught for three decades. I arranged the discussion in terms of the stages, over time, of my progression through these sea changes.
Tiny deep-learning models offer many advantages in various applications. From the perspective of statistical machine learning theory, the contribution of this paper is to complement the research advances and results obtained so far in real-time 3D object recognition. We propose a tiny deep-learning model named Complementary Spatial Transformer Network (CSTN) for real-time 3D object recognition. It turns out that CSTN's operation and analysis are much simplified in a target-space setting. We make algorithmic enhancements to perform CSTN computations faster and keep the learning part of CSTN minimal in size. Finally, we provide experimental verification of the results on the publicly available point cloud datasets ModelNet40 and ShapeNetCore, with our model achieving a 1.65-2x higher DPS (detections per second) rate on GPU hardware for 3D object recognition compared to state-of-the-art networks. The CSTN architecture requires only 10-35% of the trainable parameters of state-of-the-art networks, making it easier to deploy on edge AI devices.
A human motion monitoring method based on a thermal radiation imaging system and target detection technology is developed. The heat distribution of the human body in motion is captured by thermal imaging, and real-time recognition and analysis of human motion is realized by combining image processing with a target detection algorithm. A complete thermal radiation optical image monitoring system is designed in this study. A high-sensitivity thermal imaging camera captures thermal radiation images of the human body during motion, and these images are then transmitted to the data acquisition unit for preliminary collation and storage. The image-processing module preprocesses the acquired thermal images, and the preprocessed images are fed into the object detection algorithm, which is based on a deep-learning framework and can recognize and classify different movements of the human body. The thermal radiation image monitoring system can accurately capture thermal images of the human body in different motion states and identify the athlete's movement type in real time through the target detection algorithm. The system captures action details well and can identify the beginning, progress, and end stages of an action. Compared with traditional monitoring methods, the thermal radiation image monitoring system has clear advantages in data accuracy and real-time performance. The method not only provides high-precision movement recognition but also offers the advantages of non-contact, real-time monitoring, greatly improving the efficiency and accuracy of sports training monitoring.
Fused Deposition Modeling (FDM), a 3D printing technique popular for rapidly fabricating polymeric prototypes as well as functional components with gradient structures such as scaffolds, still faces significant hurdles in quality control and defect management. To overcome these limitations, a comprehensive approach is proposed that integrates advanced deep-learning models with an Internet of Things (IoT) based quality control system. The research proposes a framework using the Data-efficient image Transformer (DeiT) model, engineered to identify and classify three high-impact FDM defects: warping, layer delamination, and gaps in raster lines. The model has been fine-tuned on a curated dataset of original images, enhanced through pre-processing techniques. The DeiT model combined with a proposed Weighted Classification Accuracy (WCA) approach achieves an accuracy of 99.3%. Furthermore, the response time of the entire system is measured at 0.1121 s, providing real-time monitoring and control. The research represents a significant step towards intelligent and optimized manufacturing systems in the context of Industry 4.0, addressing current challenges in FDM printing while paving the way for more autonomous and efficient 3D printing processes in the future.
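The abstract names a Weighted Classification Accuracy (WCA) metric but does not publish its formula. A plausible reading is per-sample correctness weighted by a severity weight attached to the true defect class; the sketch below implements that interpretation, and the weighting scheme is entirely an assumption.

```python
import numpy as np

def weighted_classification_accuracy(y_true, y_pred, class_weights):
    """Hypothetical WCA: accuracy where each sample counts with the
    weight of its true class (e.g. severe defects weigh more).
    class_weights: dict mapping class label -> positive weight.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    w = np.array([class_weights[c] for c in y_true], dtype=float)
    correct = (y_true == y_pred).astype(float)
    return float((w * correct).sum() / w.sum())
```

With uniform weights this reduces to plain accuracy; unequal weights let a missed high-severity defect (say, delamination) penalize the score more than a missed cosmetic one.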
Featured Application This work can be applied to enhance the robustness of image filtering systems in large-scale content platforms, specifically for detecting unauthorized images and their transformed versions, preventing the dissemination of manipulated images. Image filtering systems have become essential in large-scale content platforms to prevent the dissemination of unauthorized data. While extensive research has focused on identifying images based on categories or visual similarity, the filtering problem addressed in this study presents distinct challenges. Specifically, it involves a predefined set of filtering images and requires real-time detection of whether a distributed image is derived from an unauthorized source. Although three major approaches (bitmap-based, image-processing-based, and deep-learning-based techniques) have been explored, no comprehensive comparison has been conducted. To bridge this gap, we formalize the concept of image equivalence and introduce performance metrics tailored for fair evaluation. Through extensive experiments, we derive the following key findings. First, bitmap-based methods are practically viable in real-world scenarios, offering reasonable detection rates and fast search speeds even under resource constraints. Second, despite their success in tasks such as image classification, deep-learning-based methods underperform in our problem domain, highlighting the need for customized models and architectures. Third, image-processing-based techniques demonstrate superior performance across all key metrics, including execution time and detection rates. These findings provide valuable insights into designing efficient image filtering systems for diverse content platforms, particularly for detecting unauthorized images and their transformations effectively.
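A classic image-processing-based technique for the kind of transform-robust equivalence test described here is a perceptual hash such as dHash, which encodes adjacent-pixel gradient signs and compares images by Hamming distance. The sketch below is a generic illustration of the idea, not the paper's method; the naive subsampling stands in for a proper resize.

```python
import numpy as np

def dhash(img, hash_size=8):
    """Difference hash of a 2-D grayscale array: one bit per
    horizontally adjacent pixel pair (left < right) on a small grid."""
    h, w = img.shape
    # naive nearest-neighbor subsample to (hash_size, hash_size + 1)
    rows = np.arange(hash_size) * h // hash_size
    cols = np.arange(hash_size + 1) * w // (hash_size + 1)
    small = img[np.ix_(rows, cols)].astype(int)
    return (small[:, 1:] > small[:, :-1]).flatten()   # hash_size**2 bits

def hamming(a, b):
    """Number of differing bits; small distance => likely-equivalent images."""
    return int(np.count_nonzero(a != b))
```

Because the hash records only relative brightness ordering, a uniform brightness shift leaves it unchanged, which is exactly the robustness to mild transformations that filtering systems need.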
In recent years, the proliferation of deepfake images has posed a substantial threat to media credibility, security, and privacy. Contemporary detection techniques, predominantly reliant on deep-learning algorithms, fail to identify the nuanced pixel-level discrepancies inherent in deepfake material. This study introduces PlasmoVision, an innovative quantum-enhanced plasmonic imaging technology that incorporates AI-driven deep learning for highly sensitive real-time deepfake detection. Deepfakes alter digital images and videos to produce very persuasive fraudulent content, rendering traditional detection methods ineffective. Plasmonic surface resonance technology, in conjunction with quantum dots, has the capacity to capture intricate image features that can disclose such alterations. Integrating deep learning into this detection system improves the accuracy and speed of analysis. The PlasmoVision technology employs quantum-dot-enhanced plasmonic arrays to detect sub-pixel-level resonance shifts resulting from light interaction with the image surface. The optical signals are analyzed with a sophisticated convolutional neural network (CNN) that categorizes images according to the plasmonic resonance data. The AI model is trained on a varied dataset of genuine and deepfake photos, attaining an ideal equilibrium between detection sensitivity and speed. Real-time image analysis is accomplished by rapid plasmonic scanning and AI-driven classification. The proposed device attained an accuracy rate of 98.6% in identifying deepfakes within a test dataset, exhibiting a false positive rate of 1.2% and a false negative rate of 0.5%. The quantum-enhanced plasmonic system identified pixel abnormalities with a sensitivity of up to 10 nm, markedly surpassing conventional deepfake detection technologies. PlasmoVision's real-time analysis capacity decreased processing time by 35% relative to traditional approaches, rendering it exceptionally appropriate for extensive and real-ti…
Public health initiatives must rely on evidence-based decision-making to have the greatest impact. Machine learning algorithms are created to gather, store, process, and analyze data to provide knowledge and guide decisions. A crucial part of any surveillance system is image analysis, which has recently attracted the interest of the computer vision and machine learning communities. This study uses a variety of machine learning and image-processing approaches to detect and forecast malarial illness. In our research, we found that deep-learning techniques hold potential as innovative tools with broad applicability for malaria detection, benefiting physicians by assisting in the diagnosis of the condition. We investigate the common limitations of deep learning in such systems, including the requirement for data preparation, training overhead, real-time execution, and explainability, and identify future research directions addressing these constraints.
This paper proposes a lightweight deep-learning (DL) framework for real-time, accurate weld feature extraction from noisy images containing light, smoke, or splash. Leveraging the two-dimensional human pose estimation paradigm, the framework follows a top-down architecture for accurate weld feature point localization. This study develops a semi-automatic annotation technique that dramatically reduces annotation cost. We then design a lightweight yet faster You Only Look Once version 8 (YOLOv8) detector to rapidly detect the weld feature region in the presence of strong noise. To avoid reliance on high-resolution feature maps and achieve sub-pixel localization accuracy, a heatmap-free approach decomposes the feature point detection task into subtasks of horizontal and vertical coordinate classification. Comparison with mainstream DL-based weld recognition methods validates the superiority of the proposed method in real-time feature extraction accuracy and robustness.
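The heatmap-free decomposition mentioned above (in the style of SimCC-like pose estimators) predicts two independent 1-D distributions, one over horizontal bins and one over vertical bins, instead of a 2-D heatmap; using more bins than pixels yields sub-pixel readout. The decoding step can be sketched as below; this is a generic illustration of the idea, not the paper's exact head.

```python
import numpy as np

def decode_coordinates(x_logits, y_logits, bin_width=1.0):
    """Decode a feature point (x, y) from two 1-D classification heads.
    bin_width < 1 pixel gives sub-pixel resolution; soft-argmax over the
    bin distribution refines the estimate further.
    """
    def softargmax(logits):
        p = np.exp(logits - logits.max())   # stable softmax
        p /= p.sum()
        return float((p * np.arange(len(p))).sum())
    return softargmax(x_logits) * bin_width, softargmax(y_logits) * bin_width
```

Splitting a 2-D localization into two 1-D classifications keeps the output small (2K bins instead of a K x K map), which is what removes the dependence on high-resolution feature maps.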
Near-field multiple-input multiple-output (MIMO) radar imaging systems have recently gained significant attention. These systems generally reconstruct the three-dimensional (3D) complex-valued reflectivity distribution of the scene using sparse measurements; consequently, imaging quality relies heavily on the image reconstruction approach. Existing analytical reconstruction approaches suffer from either high computational cost or low image quality. In this paper, we develop novel non-iterative deep-learning-based reconstruction methods for real-time near-field MIMO imaging. The goal is to achieve high image quality with low computational cost in compressive settings. The developed approaches have two stages. In the first approach, a physics-based initial stage performs the adjoint operation to back-project the measurements into image space, and a deep neural network (DNN)-based second stage converts the 3D back-projected measurements into a magnitude-only reflectivity image. Since scene reflectivities often have random phase, the DNN directly processes the magnitude of the adjoint result. As the DNN, a 3D U-Net is used to jointly exploit range and cross-range correlations. To comparatively evaluate the significance of exploiting physics in a learning-based approach, two additional approaches that replace the physics-based first stage with fully connected layers are also developed as purely learning-based methods. The performance is also analyzed by changing the DNN architecture of the second stage to include complex-valued processing (instead of magnitude-only processing), 2D convolution kernels (instead of 3D), and a ResNet architecture (instead of U-Net). Moreover, we develop a synthesizer to generate a large-scale dataset for training the neural networks with 3D extended targets. We illustrate the performance through experimental data and extensive simulations. The results show the effectiveness of the developed physics-based learned reconstruction approach compared to commonly used ap…
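The physics-based first stage described above can be written in one line for a generic linear forward model: back-project the measurements y via the adjoint A^H of the forward operator A, then keep only the magnitude, since the scene phase is random. The sketch below uses an arbitrary dense matrix as a stand-in for the paper's MIMO imaging operator; in practice A would be a structured, FFT-accelerated operator and the magnitude image would then be fed to the 3D U-Net.

```python
import numpy as np

def adjoint_backprojection(measurements, A):
    """Physics-based initial stage (sketch): A^H y back-projects the
    complex measurements into image space; the magnitude discards the
    uninformative random phase before the learned second stage.
    measurements: (m,) complex vector y
    A:            (m, n) complex forward operator (illustrative stand-in)
    """
    backprojected = A.conj().T @ measurements   # adjoint operation A^H y
    return np.abs(backprojected)                # magnitude-only image for the DNN
```

Keeping this adjoint step outside the network is what makes the approach "physics-based": the DNN only has to clean up a roughly focused image rather than learn the measurement geometry from scratch.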