Search results - Inner Mongolia University Library

Optimal matching measurement of AI based field surveys using deep learning network and smart monitoring

SMART STRUCTURES AND SYSTEMS, 2025, vol. 35, no. 1, pp. 39-51

Authors: Cho, Ying-Chiang; Hung, C. C. Affiliations: Minnan Normal Univ Sch Phys & Informat Engn Zhangzhou Fujian Peoples R China; Fuzhou Univ Int Studies & Trade Sch Big Data Fuzhou Fujian Peoples R China

This research introduces an innovative method for targetless displacement measurement of reinforced soil retaining walls, employing an optimal AI deep learning network in conjunction with advanced smart monitoring technologies. Conventional displacement measurement techniques often rely on physical targets, which can introduce inaccuracies and complicate real-time internet big data collection. Our approach eliminates the need for these targets by utilizing an AI deep learning framework that processes high-dimensional sensor data to accurately detect and quantify displacements on a digital platform. By optimizing the AI deep learning network architecture, we enhance the model's ability to learn complex patterns associated with soil-structure interactions with AI knowledge management. Field experiments validate the efficacy of our method, demonstrating significant improvements in measurement precision and responsiveness. The findings indicate that this targetless technique not only streamlines the monitoring process but also provides critical insights into the dynamic behavior of AI based field surveys under varying environmental and load conditions. This advancement has substantial implications for the design, safety, and maintenance of geotechnical infrastructure.

Keywords: AI knowledge management; computer-aided; internet big data simulation; convolutional neural networks; deep learning neural network; digital image processing; image matching; remote sensing and monitoring; vision technology

FieldNet: Efficient real-time shadow removal for enhanced vision in field robotics

EXPERT SYSTEMS WITH APPLICATIONS, 2025, vol. 279

Authors: Saleh, Alzayat; Olsen, Alex; Wood, Jake; Philippa, Bronson; Azghadi, Mostafa Rahimi. Affiliations: James Cook Univ Coll Sci & Engn Townsville Qld Australia; AutoWeed Pty Ltd Townsville Qld Australia; James Cook Univ ARC Res Hub Super charging Trop Aquaculture Genet Townsville Qld Australia

Shadows significantly hinder computer vision tasks in outdoor environments, particularly in field robotics, where varying lighting conditions complicate object detection and localization. We present FieldNet, a novel deep learning framework for real-time shadow removal, optimized for resource-constrained hardware. FieldNet introduces a probabilistic enhancement module and a novel loss function to address challenges of inconsistent shadow boundary supervision and artefact generation, achieving enhanced accuracy and simplicity without requiring shadow masks during inference. Trained on a dataset of 10,000 natural images augmented with synthetic shadows, FieldNet outperforms state-of-the-art methods on benchmark datasets (ISTD, ISTD+, SRD), with up to 9x speed improvements (66 FPS on Nvidia 2080Ti) and superior shadow removal quality (PSNR: 38.67, SSIM: 0.991). Real-world case studies in precision agriculture robotics demonstrate the practical impact of FieldNet in enhancing weed detection accuracy. These advancements establish FieldNet as a robust, efficient solution for real-time vision tasks in field robotics and beyond.

Keywords: Shadow removal; Unpaired data; Real-time image processing; Deep learning; Field robotics

MDSCN: multiscale depthwise separable convolutional network for underwater graphics restoration

VISUAL COMPUTER, 2025, vol. 41, no. 3, pp. 1999-2010

Authors: Li, Shiyu; Liu, Zehao; Gao, Meijing; Bai, Yang; Yin, Haozheng. Affiliations: Yanshan Univ Coll Informat Sci & Engn Key Lab Special Fiber & Fiber Sensor Hebei Prov Qinhuangdao 066004 Hebei Peoples R China; Beijing Inst Technol Coll Informat & Elect Beijing 100081 Peoples R China; Beijing Inst Technol Tangshan Res Inst Tangshan 063000 Peoples R China

Underwater imaging techniques have been a focus of research in computer vision. Underwater imaging frequently encounters the challenges of poor image quality and slow restoration speed, thereby hindering human underwater exploration endeavors. To enhance the quality and improve the real-time performance of underwater image restoration, the paper proposes a lightweight underwater color image restoration network based on multiscale depthwise separable convolution. First, the algorithm tackles the problems of difficult convergence and slow training by improving the AdamW optimizer. Then, we propose a multiscale depthwise separable convolution module with RGB channels, which allows efficient extraction of image features based on the underwater light propagation properties. The MDSCN can effectively improve the processing speed and recovery effect of underwater images. Through experimentation and analysis, our algorithm outperforms traditional image processing methods and recent deep learning approaches in terms of visual effects and objective evaluation metrics. Furthermore, our algorithm also performs better than existing deep learning methods in processing speed, which demonstrates excellent generalizability and practicality. The research in the article is highly informative for the field of underwater computer vision. The dataset, training weight files, and code are publicly available at https://***/raining-li/underwater-image-processing/tree/master.

Keywords: Computer graphics; Underwater image restoration; Optimizer; Multiscale convolution; Depth separable convolution
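
The MDSCN abstract above attributes its light weight to depthwise separable convolution. As a rough illustration of why that factorization saves parameters, the sketch below counts weights for a standard convolution and for a depthwise plus pointwise pair. The function names and the channel and kernel sizes are hypothetical and are not taken from the paper; biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(c_in: int, c_out: int, k: int) -> float:
    """How many times fewer weights the separable form needs."""
    return standard_conv_params(c_in, c_out, k) / depthwise_separable_params(c_in, c_out, k)
```

For example, with the assumed sizes of 64 input channels, 128 output channels, and a 3 x 3 kernel, the standard layer has 73,728 weights and the separable pair has 8,768. A multiscale module would presumably apply this idea with several kernel sizes in parallel, but the exact design is described only in the paper itself.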

Rocknet: lightweight network for real-time segmentation of Martian rocks

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2025, vol. 22, no. 1, pp. 1-11

Authors: Wei, Pengfei; Sun, Zezhou; Tian, He. Affiliations: Jilin Univ Sch Mech & Aerosp Engn Changchun 130025 Peoples R China; Beijing Inst Spacecraft Syst Engn Beijing 100094 Peoples R China; Beijing Key Lab Intelligent Space Robot Syst Techn Beijing 100094 Peoples R China

Rock segmentation on the Martian surface is particularly critical for rover navigation, obstacle avoidance, and scientific target detection. We propose a lightweight network for real-time semantic segmentation of Martian rocks (RockNet). First, we propose the cross-dimension channel attention (CDCA) model to replace traditional downsample and upsample operations, which gives more weight to the channels with more useful information by adjusting the weight of each channel. Second, we modify the short-term dense concatenate model: we adopt dilated convolution to learn features with a larger receptive field, and a skip connection structure reduces the degradation of the network. Finally, we propose a feature fusion module (FFM) to fully fuse different levels of features. With only 0.86M parameters, our model achieves 82.37% mIoU and a running speed of 105.7 FPS on the TWMARS dataset.

Keywords: Rock segmentation; Deep learning; Lightweight network; Encoder-decoder
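
The RockNet abstract above reports its accuracy as mIoU (mean intersection over union). The sketch below shows one common way to compute that metric on flattened label lists. It is an illustrative assumption, not the evaluation code used by the authors: the function name is hypothetical, and classes absent from both prediction and ground truth are simply skipped.

```python
def mean_iou(pred: list[int], target: list[int], num_classes: int) -> float:
    """Mean intersection over union across classes, for flattened label maps.

    Classes that appear in neither pred nor target are skipped so that
    they do not distort the average.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)
```

With two classes (background 0, rock 1), a prediction of [0, 0, 1, 1] against a ground truth of [0, 1, 1, 1] gives per-class IoUs of 1/2 and 2/3, so the mean is 7/12.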

TraSFlow: learning traditional optical flow proposal and segmentation for optical flow estimation improvement

SIGNAL IMAGE AND VIDEO PROCESSING, 2025, vol. 19, no. 6, pp. 1-9

Authors: Ammar, Anis; Ghozzi, Rim; Souani, Chokri. Affiliations: Sousse Univ Higher Inst Appl Sci & Technol Sousse Tunisia; Fac Sci Monastir Elect & Microelect Lab Monastir Tunisia

Optical flow estimation has evolved widely in the last decade. Several methods have been developed to achieve accurate and robust motion estimation under diverse and complex scenarios. From traditional methods to deep learning approaches, researchers aim to enhance accuracy and adaptability. This paper introduces a novel hybrid approach that integrates deep optical flow techniques and deep segmentation and builds upon our previously proposed method for optical flow estimation. Consequently, we leverage segmentation for the optimal separation of distinct regions, and our conventional method ensures improved precision, especially in the case of simple and small movements. Our proposed method, TraSFlow, is executed on the Jetson Xavier NX development kit. We validate the effectiveness of our joint training program through evaluation studies adapted for the estimation of optical flow on different datasets. TraSFlow achieves significant accuracy improvement over baseline models, with an end-point error (EPE) of 4.76 on the Sintel final pass and an Fl-all score of 8.23% on KITTI 2015, while maintaining real-time performance of 32 frames per second on embedded systems. These results outperform the baseline model, highlighting TraSFlow's accuracy in various scenarios and demonstrating its superiority over many state-of-the-art methods.

Keywords: Motion estimation; Optical flow; Segmentation; Real-time system; DeepFlow
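
The TraSFlow abstract above quotes an end-point error (EPE). As a minimal sketch of how that metric is usually defined, the function below averages the Euclidean distance between predicted and ground-truth flow vectors. The function name and the list-of-tuples input format are assumptions chosen for illustration; benchmark toolkits operate on dense flow fields and may mask invalid pixels.

```python
import math


def end_point_error(pred: list[tuple[float, float]],
                    gt: list[tuple[float, float]]) -> float:
    """Average Euclidean distance between predicted and ground-truth
    flow vectors, each given as a (u, v) pair per pixel."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and equally long")
    total = sum(math.hypot(pu - gu, pv - gv)
                for (pu, pv), (gu, gv) in zip(pred, gt))
    return total / len(pred)
```

For two pixels with predicted flows (3, 4) and (0, 0) and ground-truth flows of (0, 0) for both, the per-pixel errors are 5 and 0, so the EPE is 2.5.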

Enhancing vehicle fault diagnosis through multi-view sound analysis: integrating scalograms and spectrograms in a deep learning framework

SIGNAL IMAGE AND VIDEO PROCESSING, 2025, vol. 19, no. 1, pp. 1-19

Authors: Akbalik, Ferit; Yildiz, Abdulnasir; Ertugrul, Omer Faruk; Zan, Hasan. Affiliations: Batman Univ Social Sci Vocat Sch Batman Turkiye; Dicle Univ Dept Elect & Elect Engn Diyarbakir Turkiye; Batman Univ Dept Elect & Elect Engn Batman Turkiye; Mardin Artuklu Univ Dept Comp Engn Mardin Turkiye

This study presents a comprehensive framework for vehicle fault diagnosis using engine sound signals, leveraging deep learning models and a multi-view approach. Traditional methods for vehicle fault diagnosis often rely on the expertise of mechanics or diagnostic tools, which can be costly, time-consuming, and may not always provide accurate results. To address these limitations, we propose CarFaultNet, a multi-view model that processes both scalograms and spectrograms simultaneously to capture complementary information from these time-frequency representations. Our approach incorporates transfer learning with pretrained convolutional neural networks, including AlexNet, GoogLeNet, ShuffleNet, SqueezeNet, and MobileNet v2, as well as CarFaultNet, which combines two MobileNet networks. The results demonstrate that CarFaultNet outperforms traditional machine learning methods and single-view deep learning models, achieving a precision of 95.32%, recall of 94.83%, F1-score of 94.99%, and accuracy of 95.00%. Class activation mapping visualizations provide valuable insights into the model's decision-making process, highlighting the regions of the input images that are most influential for the classification of different vehicle faults. By leveraging a large, diverse dataset encompassing various vehicle models and real-world operating conditions, our approach addresses the drawbacks of previous studies and demonstrates the potential of deep learning for practical and effective vehicle fault diagnosis.

Keywords: Vehicle fault detection; Pretrained models; MobileNet; Engine sound; Scalogram; Spectrogram; Class activation mapping

Dynamic Tactical Image Recognition and Analysis in Football Matches Using Convolutional Neural Networks

TRAITEMENT DU SIGNAL, 2025, vol. 42, no. 1, pp. 583-592

Authors: Xie, Qi. Affiliations: Baoji Univ Arts & Sci Sch Phys Educ Baoji 721000 Peoples R China

With the increasing complexity of modern football tactics, how to intelligently and accurately analyze tactical changes in real time during matches has become an important research direction. Traditional manual tactical analysis methods are inefficient and susceptible to subjective bias. Therefore, using computer vision and deep learning technologies for tactical image recognition and analysis in football matches has gradually become a research hotspot. Convolutional Neural Networks (CNNs), as a powerful image processing tool, have been widely applied in video analysis and player detection. However, multi-target motion prediction and tracking management in dynamic football match scenes still face significant challenges. Existing research mainly focuses on static image analysis or simple player tracking, but the high-frequency image updates, player interactions, and occlusion issues in football matches complicate multi-target tracking. While some deep learning-based methods for multi-target detection and tracking have made progress, challenges remain, such as handling high-density player targets and improving motion trajectory prediction accuracy. To address these shortcomings, this study proposes two core techniques based on CNNs: first, multi-target motion prediction, which forecasts players' future positions based on historical motion data; second, multi-target tracking management, which uses deep learning to track and manage each player's movement trajectory in real time. Through these two techniques, this research aims to improve the real-time performance and accuracy of tactical analysis in football matches, providing coaches and analysts with more scientific and efficient tactical decision-making support.

Keywords: CNN; football matches; dynamic tactical image; multi-target motion prediction; multi-target tracking management; computer vision

PSC-YOLO: a lightweight model for urban road instance segmentation

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2025, vol. 22, no. 2, pp. 1-13

Authors: Gu, Xiaolin; Zhang, Guofeng. Affiliations: Dalian Jiaotong Univ Dalian Peoples R China

Real-time instance segmentation in urban environments remains a critical challenge for autonomous driving systems, where occluded objects, cluttered backgrounds, and dynamic scales demand both high accuracy and computational efficiency. Traditional methods often sacrifice precision for speed or vice versa, failing to address the dual demands of urban scene understanding. Motivated by the need to bridge this gap, we propose PSC-YOLO, a lightweight framework driven by two core design principles: (1) enhancing multi-scale feature learning to resolve occlusion ambiguities and (2) enabling real-time interaction without compromising segmentation quality. Simultaneously, inspired by the adaptability of the Segment Anything Model (SAM), we streamline its mask decoding via architectural changes, enabling efficient pixel-level reasoning crucial for real-time urban perception. Experiments on urban road datasets demonstrate that PSC-YOLO outperforms YOLOv8n-seg by 2.0% in mask average precision while operating at 91 FPS, 4× faster than FastSAM. This work prioritizes the intrinsic requirements of urban perception systems: balancing precision for safety-critical tasks and speed for real-time decision-making, thereby advancing deployable solutions for autonomous vehicles and smart city infrastructure.

Keywords: Automatic driving; Deep learning; Instance segmentation; Interactive segmentation; YOLOv8

Catalysing assistive solutions by deploying light-weight deep learning model on edge devices

JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2025, vol. 37, no. 3, pp. 465-486

Authors: Manjari, Kanak; Verma, Madhushi; Singal, Gaurav; Chamola, Vinay. Affiliations: Bennett Univ Dept Comp Sci Engn Greater Noida India; NSUT Dept Comp Sci & Engn Delhi India; BITS Dept Elect & Elect Engn Pilani India

Nowadays, real-time object detection, which is a crucial task, is performed through image processing and deep learning techniques. As several high-performance computing edge devices are available, selecting the best-fit device for a particular problem is a tough task that must take into account the cost, performance, and weight of the device. One faces several challenges while performing this task in real time, such as a lack of resources in terms of power and mobility. We have provided an insight into the computation power of devices in terms of Frames per Second (FPS) by deploying object detection models on them. This paper will provide insight into selecting the appropriate combination of device and object detection model for real-time applications. Raspberry Pi 3 (RPi3), Raspberry Pi 4 (RPi4), Intel Neural Compute Stick 2 (NCS2), and Nvidia Jetson NANO are popular devices with high computation power used for real-time applications. The memory constraints of the devices are explained along with the deployment of the two object detection models considered in this paper, You Only Look Once (YOLO) and Single-Shot Detector (SSD). A deep learning inference optimiser, TensorRT, has been used on the NANO to achieve high object detection throughput. The precision, recall, and F1 score achieved on deploying each tested model have been presented. After observing the devices during experimentation, RPi4+NCS2 showed the best performance across the blend of factors, i.e., speed, portability, and user-friendliness.

Keywords: Assistive technology; edge devices; raspberry; neural compute stick; jetson NANO

Joint Spatial and Frequency Domain Learning for Lightweight Spectral Image Demosaicing

IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, vol. 34, pp. 1119-1132

Authors: Wu, Fangfang; Huang, Tao; Xu, Junwei; Cao, Xun; Dong, Weisheng; Dong, Le; Shi, Guangming. Affiliations: Xidian Univ Sch Comp Sci & Technol Xian 710071 Peoples R China; Xidian Univ Sch Artificial Intelligence Xian 710071 Peoples R China; Yealink Xiamen 361009 Peoples R China; Nanjing Univ Sch Elect Sci & Engn Nanjing 210093 Peoples R China

Conventional spectral image demosaicing algorithms rely on pixels' spatial or spectral correlations for reconstruction. Due to the missing data in the multispectral filter array (MSFA), the estimation of spatial or spectral correlations is inaccurate, leading to poor reconstruction results, and these algorithms are time-consuming. Deep learning-based spectral image demosaicing methods directly learn the nonlinear mapping relationship between 2D spectral mosaic images and 3D multispectral images. However, these learning-based methods focused only on learning the mapping relationship in the spatial domain, but neglected valuable image information in the frequency domain, resulting in limited reconstruction quality. To address the above issues, this paper proposes a novel lightweight spectral image demosaicing method based on joint spatial and frequency domain information learning. First, a novel parameter-free spectral image initialization strategy based on the Fourier transform is proposed, which leads to better initialized spectral images and eases the difficulty of subsequent spectral image reconstruction. Furthermore, an efficient spatial-frequency transformer network is proposed, which jointly learns the spatial correlations and the frequency domain characteristics. Compared to existing learning-based spectral image demosaicing methods, the proposed method significantly reduces the number of model parameters and computational complexity. Extensive experiments on simulated and real-world data show that the proposed method notably outperforms existing spectral image demosaicing methods.

Keywords: Frequency-domain analysis; Imaging; Image reconstruction; Transformers; Feature extraction; Correlation; Fourier transforms; Interpolation; Three-dimensional displays; Optimization; Spectral image demosaicing; multispectral filter array (MSFA); deep learning; spatial domain; frequency domain; transformer
