This research introduces a targetless method for measuring the displacement of reinforced soil retaining walls, combining an optimized AI deep learning network with smart monitoring technologies. Conventional displacement measurement techniques often rely on physical targets, which can introduce inaccuracies and complicate the real-time collection of large volumes of data over the internet. Our approach eliminates the need for these targets by using an AI deep learning framework that processes high-dimensional sensor data on a digital platform to detect and quantify displacements accurately. By optimizing the network architecture, we enhance the model's ability to learn complex patterns associated with soil-structure interactions, supported by AI knowledge management. Field experiments validate the efficacy of our method, demonstrating significant improvements in measurement precision and responsiveness. The findings indicate that this targetless technique not only streamlines the monitoring process but also provides critical insights into the dynamic behavior of AI-based field surveys under varying environmental and load conditions. This advancement has substantial implications for the design, safety, and maintenance of geotechnical infrastructure.
Shadows significantly hinder computer vision tasks in outdoor environments, particularly in field robotics, where varying lighting conditions complicate object detection and localization. We present FieldNet, a novel deep learning framework for real-time shadow removal, optimized for resource-constrained hardware. FieldNet introduces a probabilistic enhancement module and a novel loss function to address the challenges of inconsistent shadow boundary supervision and artefact generation, achieving enhanced accuracy and simplicity without requiring shadow masks during inference. Trained on a dataset of 10,000 natural images augmented with synthetic shadows, FieldNet outperforms state-of-the-art methods on benchmark datasets (ISTD, ISTD+, SRD), with up to 9x speed improvements (66 FPS on Nvidia 2080Ti) and superior shadow removal quality (PSNR: 38.67, SSIM: 0.991). Real-world case studies in precision agriculture robotics demonstrate the practical impact of FieldNet in enhancing weed detection accuracy. These advancements establish FieldNet as a robust, efficient solution for real-time vision tasks in field robotics and beyond.
Underwater imaging techniques have long been a focus of computer vision research. Underwater imaging frequently suffers from poor image quality and slow restoration speed, which hinders human underwater exploration. To enhance the quality and improve the real-time performance of underwater image restoration, this paper proposes a lightweight underwater color image restoration network based on multiscale depthwise separable convolution (MDSCN). First, the algorithm tackles the problems of difficult convergence and slow training by improving the AdamW optimizer. Then, we propose a multiscale depthwise separable convolution module with RGB channels, which allows efficient extraction of image features based on the propagation properties of light underwater. The MDSCN effectively improves both the processing speed and the restoration quality of underwater images. Experiments and analysis show that our algorithm outperforms traditional image processing methods and recent deep learning approaches in terms of visual effects and objective evaluation metrics. Furthermore, our algorithm is also faster than existing deep learning methods, which demonstrates its generalizability and practicality. This research is informative for the field of underwater computer vision. The dataset, trained weight files, and code are publicly available at https://***/raining-li/underwater-image-processing/tree/master.
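To illustrate why depthwise separable convolution suits a lightweight network, the following sketch counts the weights of a standard convolution and of a depthwise separable one, then sums several kernel sizes as a stand-in for a multiscale module. The channel counts (32 in, 64 out) and the kernel sizes (3, 5, 7) are assumptions chosen for illustration; the abstract does not state the actual MDSCN configuration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution followed by a 1 x 1
    pointwise convolution (bias ignored)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def multiscale_params(c_in, c_out, kernel_sizes):
    """Total weights of parallel depthwise separable branches,
    one branch per kernel size (hypothetical multiscale layout)."""
    return sum(depthwise_separable_params(c_in, c_out, k) for k in kernel_sizes)


if __name__ == "__main__":
    # Hypothetical sizes, not taken from the paper.
    c_in, c_out = 32, 64
    print("standard 3x3:       ", standard_conv_params(c_in, c_out, 3))
    print("separable 3x3:      ", depthwise_separable_params(c_in, c_out, 3))
    print("multiscale (3,5,7): ", multiscale_params(c_in, c_out, (3, 5, 7)))
```

Under these assumed sizes, three parallel separable branches together still use fewer weights than a single standard 3 x 3 convolution, which is the kind of saving a lightweight design relies on.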
Rock segmentation on Mars is particularly critical for rover navigation, obstacle avoidance, and scientific target detection. We propose a lightweight network for real-time semantic segmentation of Martian rocks (RockNet). First, we propose the cross-dimension channel attention (CDCA) model to replace traditional downsampling and upsampling operations; it adjusts the weight of each channel so that channels carrying more useful information receive more weight. Second, we modify the short-term dense concatenate model by adopting dilated convolution to learn features with a larger receptive field, and we use skip connections to reduce network degradation. Finally, we propose a feature fusion module (FFM) to fully fuse features from different levels. With only 0.86M parameters, our model achieves 82.37% mIoU at 105.7 FPS on the TWMARS dataset.
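For readers unfamiliar with the mIoU figure quoted above, here is a minimal sketch of how mean intersection over union is commonly computed from a pixel-level confusion matrix. The two-class matrix (background and rock) and its counts are invented for illustration and are not results from the paper.

```python
def mean_iou(confusion):
    """Mean intersection over union.

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j. Classes that never appear in
    either the ground truth or the prediction are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    if not ious:
        raise ValueError("confusion matrix contains no pixels")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Hypothetical counts: rows are true classes, columns are predictions.
    confusion = [
        [50, 10],  # background
        [5, 35],   # rock
    ]
    print(f"mIoU = {mean_iou(confusion):.4f}")
```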
Optical flow estimation has advanced considerably over the last decade. Several methods have been developed to achieve accurate and robust motion estimation under diverse and complex scenarios. From traditional methods to deep learning approaches, researchers aim to enhance accuracy and adaptability. This paper introduces a novel hybrid approach that integrates deep optical flow techniques with deep segmentation and builds upon our previously proposed method for optical flow estimation. We leverage segmentation to separate distinct regions optimally, while our conventional method improves precision, especially for simple and small movements. We run our proposed method, 'TrasFlow', on the Jetson Xavier NX development kit. We validate the effectiveness of our joint training program through evaluation studies adapted for optical flow estimation on different datasets. Our proposed method, TraSFlow, achieves a significant accuracy improvement over baseline models, with an end-point error (EPE) of 4.76 on the Sintel final pass and an Fl-all score of 8.23% on KITTI 2015, while maintaining real-time performance of 32 frames per second on embedded systems. These results highlight TrasFlow's accuracy in various scenarios and demonstrate its superiority over many state-of-the-art methods.
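The two accuracy figures above can be made concrete with a small sketch. It computes the average end-point error and an outlier percentage in the style commonly used for KITTI 2015, where a pixel counts as an outlier when its error exceeds both 3 pixels and 5% of the ground-truth flow magnitude. The thresholds reflect that common definition as I understand it, and the flow vectors are invented; nothing here reproduces the paper's evaluation code.

```python
import math


def endpoint_errors(pred, gt):
    """Per-pixel Euclidean distance between predicted and true flow.

    pred and gt are equal-length lists of (u, v) flow vectors.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    return [math.hypot(pu - gu, pv - gv) for (pu, pv), (gu, gv) in zip(pred, gt)]


def average_epe(pred, gt):
    """Average end-point error over all pixels."""
    errors = endpoint_errors(pred, gt)
    return sum(errors) / len(errors)


def outlier_rate(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Fraction of pixels whose error exceeds both thresholds
    (assumed KITTI-style outlier definition)."""
    errors = endpoint_errors(pred, gt)
    outliers = sum(
        1
        for err, (gu, gv) in zip(errors, gt)
        if err > abs_thresh and err > rel_thresh * math.hypot(gu, gv)
    )
    return outliers / len(errors)


if __name__ == "__main__":
    # Hypothetical flow vectors for two pixels.
    pred = [(3.0, 4.0), (1.0, 1.0)]
    gt = [(0.0, 0.0), (1.0, 1.0)]
    print("EPE:", average_epe(pred, gt))
    print("outlier rate:", outlier_rate(pred, gt))
```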
This study presents a comprehensive framework for vehicle fault diagnosis using engine sound signals, leveraging deep learning models and a multi-view approach. Traditional methods for vehicle fault diagnosis often rely on the expertise of mechanics or on diagnostic tools, which can be costly and time-consuming and may not always provide accurate results. To address these limitations, we propose CarFaultNet, a multi-view model that processes both scalograms and spectrograms simultaneously to capture complementary information from these time-frequency representations. Our approach incorporates transfer learning with pretrained convolutional neural networks, including AlexNet, GoogLeNet, ShuffleNet, SqueezeNet, and MobileNet v2, as well as CarFaultNet, which combines two MobileNet networks. The results demonstrate that CarFaultNet outperforms traditional machine learning methods and single-view deep learning models, achieving a precision of 95.32%, recall of 94.83%, F1-score of 94.99%, and accuracy of 95.00%. Class activation mapping visualizations provide valuable insights into the model's decision-making process, highlighting the regions of the input images that are most influential for the classification of different vehicle faults. By leveraging a large, diverse dataset encompassing various vehicle models and real-world operating conditions, our approach addresses the drawbacks of previous studies and demonstrates the potential of deep learning for practical and effective vehicle fault diagnosis.
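As a reminder of how the four reported metrics relate, the sketch below computes per-class precision, recall, and F1, their macro averages, and overall accuracy for a multi-class classifier. Macro averaging is an assumption, since the abstract does not say how the class-level scores were aggregated, and the fault labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (per_class, macro, accuracy).

    per_class maps each label to (precision, recall, f1);
    macro is the unweighted mean of those three values over labels.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    per_class = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = (precision, recall, f1)
    macro = tuple(
        sum(scores[i] for scores in per_class.values()) / len(labels)
        for i in range(3)
    )
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return per_class, macro, accuracy


if __name__ == "__main__":
    # Hypothetical fault labels, not classes from the paper's dataset.
    y_true = ["knock", "knock", "belt", "belt", "normal", "normal"]
    y_pred = ["knock", "belt", "belt", "belt", "normal", "normal"]
    per_class, macro, accuracy = classification_metrics(y_true, y_pred)
    print("macro precision, recall, F1:", macro)
    print("accuracy:", accuracy)
```

Note that a macro-averaged F1 need not equal the harmonic mean of the macro-averaged precision and recall, so the three reported numbers can differ slightly from what a single harmonic-mean calculation would suggest.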
With the increasing complexity of modern football tactics, the intelligent and accurate real-time analysis of tactical changes during matches has become an important research direction. Traditional manual tactical analysis methods are inefficient and susceptible to subjective bias. Therefore, using computer vision and deep learning technologies for tactical image recognition and analysis in football matches has gradually become a research hotspot. Convolutional Neural Networks (CNNs), as a powerful image processing tool, have been widely applied in video analysis and player detection. However, multi-target motion prediction and tracking management in dynamic football match scenes still face significant challenges. Existing research mainly focuses on static image analysis or simple player tracking, but the high-frequency image updates, player interactions, and occlusion issues in football matches complicate multi-target tracking. While some deep learning-based methods for multi-target detection and tracking have made progress, challenges remain, such as handling high-density player targets and improving motion trajectory prediction accuracy. To address these shortcomings, this study proposes two core techniques based on CNNs: first, multi-target motion prediction, which accurately forecasts players' future positions based on historical motion data; second, multi-target tracking management, which uses deep learning to track and manage each player's movement trajectory in real time. Through these two techniques, this research aims to improve the real-time performance and accuracy of tactical analysis in football matches, providing coaches and analysts with more scientific and efficient tactical decision-making support.
Real-time instance segmentation in urban environments remains a critical challenge for autonomous driving systems, where occluded objects, cluttered backgrounds, and dynamic scales demand both high accuracy and computational efficiency. Traditional methods often sacrifice precision for speed or vice versa, failing to address the dual demands of urban scene understanding. Motivated by the need to bridge this gap, we propose PSC-YOLO, a lightweight framework driven by two core design principles: (1) enhancing multi-scale feature learning to resolve occlusion ambiguities and (2) enabling real-time interaction without compromising segmentation quality. Simultaneously, inspired by the adaptability of the Segment Anything Model (SAM), we streamline its mask decoding architecturally, enabling efficient pixel-level reasoning crucial for real-time urban perception. Experiments on urban road datasets demonstrate that PSC-YOLO outperforms YOLOv8n-seg by 2.0% in mask average precision while operating at 91 FPS, 4x faster than FastSAM. This work prioritizes the intrinsic requirements of urban perception systems: balancing precision for safety-critical tasks and speed for real-time decision-making, thereby advancing deployable solutions for autonomous vehicles and smart city infrastructure.
Real-time object detection is a crucial task that is now commonly performed with image processing and deep learning techniques. Because several high-performance edge computing devices are available, selecting the best-fit device for a particular problem is difficult when cost, performance, and weight must all be kept in mind. Performing this task in real time also raises several challenges, such as limited resources in terms of power and mobility. We provide insight into the computational power of these devices, measured in frames per second (FPS), by deploying object detection models on them. This paper helps in selecting an appropriate combination of device and object detection model for real-time applications. Raspberry Pi 3 (RPi3), Raspberry Pi 4 (RPi4), Intel Neural Compute Stick 2 (NCS2), and Nvidia Jetson NANO are popular devices with high computational power used for real-time applications. The paper explains the memory constraints of these devices along with the deployment of different You Only Look Once (YOLO) and Single-Shot Detector (SSD) object detection models. A deep learning inference optimiser, TensorRT, is used on the NANO to achieve high object detection throughput. The precision, recall, and F1 score achieved by each tested model are presented. In our experiments, RPi4+NCS2 offered the best overall blend of speed, portability, and user-friendliness.
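A minimal sketch of how FPS might be measured when comparing device and model combinations is given below. The function name `measure_fps`, the warm-up count, and the dummy detector are all hypothetical; a real benchmark would call the deployed model's inference routine on camera frames and would typically average over many more frames.

```python
import time


def measure_fps(infer, frames, warmup=2):
    """Average frames per second of `infer` over `frames`.

    infer  -- callable that processes one frame (stand-in for a detector)
    frames -- sequence of input frames
    warmup -- number of untimed calls made first, so one-off start-up
              costs do not distort the measurement
    """
    if not frames:
        raise ValueError("frames must not be empty")
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise RuntimeError("elapsed time too small to measure")
    return len(frames) / elapsed


if __name__ == "__main__":
    # Dummy detector that just burns a little CPU time per frame.
    def dummy_detector(frame):
        return sum(range(10_000))

    frames = list(range(50))  # placeholder frames
    print(f"{measure_fps(dummy_detector, frames):.1f} FPS")
```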
Conventional spectral image demosaicing algorithms rely on pixels' spatial or spectral correlations for reconstruction. Due to the missing data in the multispectral filter array (MSFA), the estimation of spatial or spectral correlations is inaccurate, leading to poor reconstruction results, and these algorithms are time-consuming. Deep learning-based spectral image demosaicing methods directly learn the nonlinear mapping between 2D spectral mosaic images and 3D multispectral images. However, these learning-based methods focus only on learning the mapping in the spatial domain and neglect valuable image information in the frequency domain, resulting in limited reconstruction quality. To address these issues, this paper proposes a novel lightweight spectral image demosaicing method based on joint spatial and frequency domain information learning. First, a novel parameter-free spectral image initialization strategy based on the Fourier transform is proposed, which yields better initialized spectral images and eases subsequent spectral image reconstruction. Furthermore, an efficient spatial-frequency transformer network is proposed, which jointly learns the spatial correlations and the frequency domain characteristics. Compared to existing learning-based spectral image demosaicing methods, the proposed method significantly reduces the number of model parameters and the computational complexity. Extensive experiments on simulated and real-world data show that the proposed method notably outperforms existing spectral image demosaicing methods.