Crack detection is an important measure in the field of structural health monitoring. However, visual crack detection is labor-intensive, time-consuming, inefficient, and expensive. Although image-based detection and processing provides an efficient way to detect structural cracks, its accuracy depends on image quality. For engineering structures, especially bridges, changing light conditions and the differing surface characteristics of structural components pose a major challenge to traditional crack detection methods. In this paper, a novel crack detection method based on convolutional neural networks is proposed. The method is developed in the following stages: initial automated crack classification is carried out using MobileNetV3, the classified crack images are then accurately segmented at the semantic level by an improved DeepLabv3+ network, and finally real crack images are used for verification. To verify the proposed method, several conventional deep learning networks are trained and compared. The improved DeepLabv3+ integrates MobileNetV3 as its feature extraction backbone and incorporates the convolutional block attention module, achieving an average intersection over union of 87.79% and an average pixel accuracy of 93.87% on public and real data sets. Compared with traditional models such as VGG16, the proposed method shortens training time by more than 80% while maintaining high detection accuracy. In addition, its compact parameter configuration and moderate model size make it particularly suitable for deployment on mobile detection devices.
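The abstract does not include implementation details, but the backbone-plus-attention arrangement it describes (MobileNetV3 features refined by a convolutional block attention module before a DeepLabv3+-style decoder) can be sketched roughly as below. This is a minimal PyTorch sketch assuming torchvision's MobileNetV3-Large; the layer sizes, reduction ratio, and the point where CBAM is attached are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch (PyTorch), assuming torchvision's MobileNetV3-Large as the backbone.
# CBAM hyperparameters and placement are illustrative, not the paper's code.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_large

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over channel-wise average and max maps
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)                       # channel attention
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        x = x * torch.sigmoid(self.spatial(torch.cat([avg_map, max_map], 1)))  # spatial attention
        return x

class MobileNetV3CBAMBackbone(nn.Module):
    """MobileNetV3 features followed by CBAM, feeding a DeepLabv3+-style decoder (not shown)."""
    def __init__(self):
        super().__init__()
        self.features = mobilenet_v3_large(weights=None).features  # final stage has 960 channels
        self.cbam = CBAM(960)

    def forward(self, x):
        return self.cbam(self.features(x))

if __name__ == "__main__":
    feats = MobileNetV3CBAMBackbone()(torch.randn(1, 3, 512, 512))
    print(feats.shape)  # torch.Size([1, 960, 16, 16])
```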
This study addresses the critical need for effective mental stress monitoring, linked to severe health issues like depression and heart disease. We introduce a robust method using in-ear photoplethysmogram (PPG) signals for detecting and classifying stress levels. The objective of this study is to develop a precise stress monitoring technique using advanced signal processing and deep learning. Raw PPG data were collected from 15 subjects undergoing stress-inducing activities in a controlled setting. The data underwent preprocessing and were transformed into image-like time-frequency representations. We employed vision transformer (ViT) models for classification, which were fine-tuned and compared against other state-of-the-art deep learning models. The ViT classifier significantly outperformed existing models, achieving an average accuracy of 97.78% and an F1-score of 97.79%. While the dataset is relatively small, these results suggest a promising direction for stress monitoring by illustrating the potential of combining in-ear PPG signals with ViT models. The study indicates the efficacy of this novel approach for accurate mental stress diagnosis, which could have significant implications for mental health applications. Future work will focus on validating these findings with a larger sample size and exploring the integration of this technology into wearable devices for real-world stress monitoring.
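As a rough illustration of the pipeline described above (raw PPG window, image-like time-frequency representation, fine-tuned ViT classifier), the sketch below uses a short-time spectrogram and torchvision's ViT-B/16 as stand-ins. The sampling rate, window length, class count, and model choice are assumptions, not the study's actual settings.

```python
# Minimal sketch, assuming a fixed-length PPG window and torchvision's ViT-B/16.
# Sampling rate, window length, and class count are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from scipy.signal import spectrogram
from torchvision.models import vit_b_16

FS = 100          # assumed PPG sampling rate (Hz)
NUM_CLASSES = 3   # e.g. relaxed / moderate stress / high stress (assumed)

def ppg_to_image(ppg_window: np.ndarray, size: int = 224) -> torch.Tensor:
    """Turn a 1-D PPG segment into a 3-channel image-like time-frequency map."""
    _, _, sxx = spectrogram(ppg_window, fs=FS, nperseg=FS, noverlap=FS // 2)
    sxx = np.log1p(sxx)                                    # compress dynamic range
    sxx = (sxx - sxx.min()) / (sxx.max() - sxx.min() + 1e-8)
    img = torch.from_numpy(sxx).float()[None, None]        # (1, 1, F, T)
    img = torch.nn.functional.interpolate(img, size=(size, size), mode="bilinear")
    return img.repeat(1, 3, 1, 1)                          # replicate to 3 channels

model = vit_b_16(weights=None)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)  # new classifier head

window = np.random.randn(FS * 30)           # 30 s of synthetic PPG for illustration
logits = model(ppg_to_image(window))
print(logits.shape)                          # torch.Size([1, 3])
```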
Cloud-edge technology enables near-real-time optimization of production lines in group-distributed manufacturing systems. Offloading some tasks to the cloud and processing the remaining tasks on the edge side can improve the efficiency of production optimization. However, due to the complexity of the manufacturing environment and its various constraints, an effective offloading strategy is crucial for reducing computing delays and minimizing transmission overhead in large-scale optimization problems. This paper proposes a mixed-integer programming model and a deep reinforcement learning (DRL) framework based on a Transformer to address the cloud-edge offloading problem. The DRL framework consists of an encoder and a decoder, both designed using the Transformer architecture. Task offloading decisions are translated into two options: cloud offloading or edge retention. The encoder extracts relevant features for each option, and the decoder generates the probability of selecting each option based on the encoded information. Extensive computational experiments demonstrate the effectiveness of the proposed framework in solving the task offloading problem with time windows, achieving near-real-time optimization of production lines within competitive computational time.
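A heavily simplified sketch of such a Transformer-based offloading policy is shown below: per-task features are encoded by a Transformer encoder and a linear head (standing in for the paper's Transformer decoder) produces, for each task, the probability of cloud offloading versus edge retention. Feature dimensions, layer counts, and the policy-gradient usage hint are assumptions, not the paper's design.

```python
# Minimal sketch (PyTorch) of a Transformer-based offloading policy: per-task features are
# encoded and a head outputs the probability of "offload to cloud" vs "keep on edge".
import torch
import torch.nn as nn

class OffloadingPolicy(nn.Module):
    def __init__(self, task_feat_dim=8, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(task_feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 2)       # logits for {edge, cloud} per task

    def forward(self, tasks):                   # tasks: (batch, n_tasks, task_feat_dim)
        h = self.encoder(self.embed(tasks))
        return torch.distributions.Categorical(logits=self.head(h))

policy = OffloadingPolicy()
tasks = torch.randn(1, 10, 8)                   # 10 tasks with 8 features each (e.g. size, deadline)
dist = policy(tasks)
actions = dist.sample()                         # 0 = keep on edge, 1 = offload to cloud
log_prob = dist.log_prob(actions).sum()         # would feed a policy-gradient DRL update
print(actions)
```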
This paper explores the optimization of light field deconvolution, a key process in image processing that reconstructs a 3D object space or a 2D refocus plane from a light field. Despite the critical role of deconvolution in light field technology, existing methods are often slow, computationally intensive, and unsuitable for real-time processing. Existing algorithms such as the Richardson-Lucy approach, while groundbreaking, still suffer from performance limitations due to their iterative nature and high computational costs. Central to our approach is the strategic selection of influential pixels within the point-spread-function, reducing redundant computations by focusing only on pixels that contribute a significant portion of the point-spread-function's total intensity. In addition, we explore the potential to directly invert the image formation model, bypassing iterative computations and further accelerating the deconvolution process. Our findings reveal notable improvements in computational efficiency, with some of our methods achieving real-time performance. The reconstruction quality, measured using metrics such as the mean squared error, remained comparable to existing approaches, indicating a favorable balance between speed and reconstruction quality. (c) 2025 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.
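The pixel-selection idea can be illustrated with a small NumPy sketch: keep only the brightest point-spread-function entries whose cumulative intensity reaches a chosen fraction of the total, and zero the remainder before deconvolution. The 99% fraction and the synthetic Gaussian PSF are illustrative assumptions, not the paper's parameters.

```python
# Minimal sketch of the pixel-selection idea: keep only the PSF entries whose cumulative
# intensity reaches a chosen fraction of the total, zeroing the rest before deconvolution.
import numpy as np

def truncate_psf(psf: np.ndarray, energy_fraction: float = 0.99) -> np.ndarray:
    """Zero out low-intensity PSF pixels, keeping the smallest set covering `energy_fraction`."""
    flat = psf.ravel()
    order = np.argsort(flat)[::-1]                # brightest pixels first
    cumulative = np.cumsum(flat[order])
    keep = cumulative <= energy_fraction * flat.sum()
    keep[np.argmin(keep)] = True                  # include the pixel that crosses the threshold
    mask = np.zeros_like(flat, dtype=bool)
    mask[order[keep]] = True
    return np.where(mask.reshape(psf.shape), psf, 0.0)

# Example: a synthetic Gaussian PSF loses most of its near-zero tail pixels.
y, x = np.mgrid[-32:33, -32:33]
psf = np.exp(-(x**2 + y**2) / (2 * 5.0**2))
sparse_psf = truncate_psf(psf)
print(np.count_nonzero(psf), "->", np.count_nonzero(sparse_psf), "nonzero PSF pixels")
```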
In this paper, we discuss the classification of images captured by a machine camera while assembling components. To crop out specific points of interest, we employ image processing. Additionally, we utilize deep learning techniques, specifically convolutional neural networks, to identify the type of equipment being assembled. This approach allows us to determine and record specific parts within a device. The main challenge of this project is to achieve both high accuracy and the shortest possible prediction time.
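A minimal sketch of this kind of pipeline is shown below: a fixed point of interest is cropped from the camera frame and the crop is classified by a small convolutional network. The ROI coordinates, network architecture, and class count are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch: crop a point of interest from the camera frame, then classify the crop
# with a small CNN. ROI location and the network are illustrative assumptions.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.net(x)

def crop_roi(frame: torch.Tensor, y: int, x: int, h: int, w: int) -> torch.Tensor:
    """Crop a point of interest (y, x, h, w) from a (3, H, W) frame."""
    return frame[:, y:y + h, x:x + w]

frame = torch.rand(3, 480, 640)                     # stand-in for a machine-camera image
roi = crop_roi(frame, y=100, x=200, h=128, w=128)   # hypothetical fixture location
logits = SmallCNN()(roi.unsqueeze(0))
print(logits.argmax(dim=1))                          # predicted equipment type
```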
Speeded up robust feature (SURF) is one of the most popular feature-based algorithms for image matching. Compared to emerging deep learning-based image matching algorithms, SURF is much faster with comparable accuracy, and it remains one of the dominant algorithms adopted in the majority of real-time applications. With the increasing popularity of video-based computer vision applications, image matching between an image and different frames of a video stream is required. Traditional algorithms can fail on live video because spatiotemporal differences between frames cause significant fluctuation in the results. In this study, we propose a self-adaptive methodology to improve the stability and precision of image-video matching. The proposed methodology dynamically adjusts the threshold used in feature point extraction, controlling the number of extracted feature points based on the content of the previous frame. Minimum ratio of distance (MROD) matching is integrated to preclude false matches while keeping abundant sample sizes. Finally, multiple homography matrices (H-matrices) are estimated using progressive sample consensus (PROSAC) with various reprojection errors, and the model with the lowest mean squared error (MSE) is selected for image-to-video-frame matching. The experimental results show that the self-adaptive SURF offers more accurate and stable results while balancing single-frame processing time in image-video matching.
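A rough OpenCV sketch of the self-adaptive loop is given below: the SURF Hessian threshold is nudged each frame toward a target keypoint count, matches are filtered with a distance-ratio test, and homographies fitted at several reprojection thresholds are compared by reprojection MSE. It assumes opencv-contrib-python with SURF enabled and an OpenCV build that exposes cv2.USAC_PROSAC (4.5+; cv2.RANSAC is a fallback); the target count, gain, and thresholds are assumptions, not the paper's values.

```python
# Minimal sketch of the self-adaptive idea with OpenCV; requires opencv-contrib-python
# with SURF enabled. Target counts, gains, and reprojection errors are assumptions.
import cv2
import numpy as np

def adapt_threshold(threshold, n_keypoints, target=1000, gain=0.1):
    """Raise the Hessian threshold when too many keypoints were found, lower it when too few."""
    return max(50.0, threshold * (1.0 + gain * (n_keypoints - target) / target))

def match_frame(ref_img, frame, threshold):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=threshold)
    kp1, des1 = surf.detectAndCompute(ref_img, None)
    kp2, des2 = surf.detectAndCompute(frame, None)
    pairs = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    good = [p[0] for p in pairs if len(p) == 2 and p[0].distance < 0.7 * p[1].distance]  # ratio test
    if len(good) < 4:
        return None, adapt_threshold(threshold, len(kp2))
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    best_H, best_mse = None, np.inf
    for reproj in (1.0, 3.0, 5.0):                           # try several reprojection errors
        H, _ = cv2.findHomography(src, dst, cv2.USAC_PROSAC, reproj)
        if H is None:
            continue
        mse = float(np.mean((cv2.perspectiveTransform(src, H) - dst) ** 2))
        if mse < best_mse:
            best_H, best_mse = H, mse
    return best_H, adapt_threshold(threshold, len(kp2))
```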
Specular highlights pose a significant challenge in light field microscopy (LFM), leading to information loss and inaccurate observations, especially on reflective surfaces. Existing methods for specular highlight removal often suffer from high computational complexity, limited applicability, and extended processing times. To address these limitations, this paper introduces an adaptive hybrid illumination scheme that combines multiple polarized light sources with a deep learning-based control system to dynamically modulate illumination and eliminate specular highlights. This method effectively removes highlight reflections and provides uniform illumination without complex optical setups or mechanical components. By leveraging various polarization angles and precise electronic control through a neural network, the system dynamically adjusts lighting in real time, achieving uniform illumination and superior image quality. Experimental results show that the proposed method effectively eliminates specular highlights, significantly improves 3D reconstruction accuracy, and reduces processing time to less than 0.4 s (at least twice as fast as traditional approaches). This system offers a promising solution for applications requiring high-speed, high-precision imaging, such as biological analysis, industrial inspection, and materials research, providing an efficient and effective alternative for specular reflection removal in LFM.
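Because the paper's control system drives physical polarized light sources, any software sketch here is necessarily hypothetical. The one below only illustrates the control idea: summarize near-saturated (highlight) pixels on a coarse grid and let a small network map those statistics to per-source intensity settings. The thresholds, grid size, number of sources, and controller architecture are all assumptions.

```python
# Heavily hypothetical sketch of the control idea: highlight statistics from the current
# frame drive a small network that outputs intensities for the polarized light sources.
import torch
import torch.nn as nn

N_SOURCES = 4          # assumed number of polarized light sources
GRID = 4               # image summarized on a 4x4 grid of highlight ratios

def highlight_statistics(gray: torch.Tensor, thresh: float = 0.95) -> torch.Tensor:
    """Fraction of near-saturated pixels in each cell of a GRID x GRID partition."""
    mask = (gray > thresh).float()[None, None]                 # (1, 1, H, W)
    pooled = nn.functional.adaptive_avg_pool2d(mask, GRID)     # mean of a binary mask = ratio
    return pooled.flatten(1)                                   # (1, GRID*GRID)

controller = nn.Sequential(                                    # maps statistics -> source intensities
    nn.Linear(GRID * GRID, 32), nn.ReLU(),
    nn.Linear(32, N_SOURCES), nn.Sigmoid(),                    # intensities in [0, 1]
)

frame = torch.rand(480, 640)                                   # stand-in grayscale microscope frame
settings = controller(highlight_statistics(frame))
print(settings)                                                # values that would drive the hardware
```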
Recent years have seen increased interest in object detection-based applications for fire detection in digital images and videos from edge devices. The environment's complexity and variability often lead to interference from factors such as fire and smoke characteristics, background noise, and camera settings like angle, sharpness, and exposure, which hampers the effectiveness of fire detection applications. Limited image data for fire and smoke scenes further challenges model accuracy and robustness, resulting in high false detection and missed detection rates. To address the need for efficient detection and adaptability to various environments, this paper focuses on (1) proposing a cloud-edge collaborative architecture for real-time fire and smoke detection, incorporating an iterative transfer learning strategy based on user feedback to enhance adaptability; and (2) improving the detection capabilities of the base model YOLOv8 by enhancing the data augmentation method and introducing the coordinate attention mechanism to improve global feature extraction. The improved algorithm shows a 2-point accuracy increase. After three iterations of transfer learning in the production environment, accuracy improves from 93.3% to 96.4%, and mAP0.5:0.95 increases by nearly 5 points. This approach effectively addresses false detection issues in fire and smoke detection systems, demonstrating practical applicability.
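A minimal sketch of a coordinate attention block of the kind added to the YOLOv8 backbone is shown below; the reduction ratio, activation, and insertion point are assumptions, not the paper's exact configuration.

```python
# Minimal sketch of coordinate attention: channel attention encoded separately along the
# height and width directions, then applied multiplicatively to the feature map.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        pooled_h = x.mean(dim=3, keepdim=True)                       # (b, c, h, 1): pool along width
        pooled_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (b, c, w, 1): pool along height
        y = self.act(self.bn(self.conv1(torch.cat([pooled_h, pooled_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                        # (b, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))    # (b, c, 1, w)
        return x * a_h * a_w

feat = torch.randn(1, 256, 40, 40)                                   # a YOLO-style feature map
print(CoordinateAttention(256)(feat).shape)                           # torch.Size([1, 256, 40, 40])
```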
ISBN (Print): 9798350377040; 9798350377033
This study proposes a multimodal biomedical signal recognition algorithm that fuses image data and biosignal data to improve the accuracy and real-time performance of medical data analysis. The algorithm employs deep learning models to independently recognize the two data types and introduces a cross-attention mechanism to assess the correlation between image and biosignal features. The mechanism dynamically adjusts the attention weights to effectively suppress irrelevant features and promote the fusion of image and signal features. Specifically, the biosignal features are extracted by the temporal Kolmogorov-Arnold network (TKAN), while the image data are processed by the YOLOv11 network. These two feature vectors are aligned and fused through the cross-attention mechanism, and the fused feature vectors are finally classified through a fully connected layer to obtain the diagnosis results. This paper also explores preprocessing techniques for image and signal data, including methods for denoising, signal enhancement, data normalization, and image alignment. These preprocessing steps effectively improve the quality of the raw data and provide clearer and more accurate inputs for the subsequent deep learning models. Experimental verification shows that the proposed algorithm outperforms traditional methods in a simulated environment, especially in real-time data analysis and multimodal fusion accuracy.
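The fusion stage described above can be sketched as follows: biosignal features (a stand-in for TKAN output) attend to image features (a stand-in for YOLOv11 output) through cross-attention, and the fused representation is classified by a fully connected layer. Feature dimensions, head count, and the pooling scheme are illustrative assumptions, not the paper's design.

```python
# Minimal sketch of cross-attention fusion between biosignal and image feature sequences.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim=128, num_classes=5, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, signal_feats, image_feats):
        # signal_feats: (batch, T, dim) from the biosignal branch; image_feats: (batch, N, dim)
        fused, weights = self.attn(query=signal_feats, key=image_feats, value=image_feats)
        pooled = torch.cat([fused.mean(dim=1), image_feats.mean(dim=1)], dim=-1)
        return self.classifier(pooled), weights   # weights indicate which image regions mattered

signal_feats = torch.randn(2, 50, 128)   # e.g. 50 time steps of biosignal features
image_feats = torch.randn(2, 49, 128)    # e.g. a 7x7 grid of image features
logits, attn_weights = CrossAttentionFusion()(signal_feats, image_feats)
print(logits.shape)                       # torch.Size([2, 5])
```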
ISBN (Print): 9781510673854; 9781510673847
Images and videos captured in poor illumination conditions are degraded by low brightness, reduced contrast, color distortion, and noise, rendering them barely discernible for human perception and ultimately degrading computer vision system performance. These challenges are exacerbated when processing video surveillance camera footage, where unprocessed video data is used as-is for real-time computer vision tasks across varying environmental conditions within Intelligent Transportation Systems (ITS), such as vehicle detection, tracking, and timely incident detection. The inadequate performance of these algorithms in real-world deployments incurs significant operational costs. Low-light image enhancement (LLIE) aims to improve the quality of images captured in these unideal conditions. Groundbreaking advancements in LLIE have been achieved using deep learning techniques; however, the resulting models and approaches are varied and disparate. This paper presents an exhaustive survey with a methodical taxonomy of state-of-the-art deep learning-based LLIE algorithms and their impact when used in tandem with other computer vision algorithms, particularly detection algorithms. To thoroughly evaluate these LLIE models, a subset of the BDD100K dataset, a diverse real-world driving dataset, is used with suitable image quality assessment and evaluation metrics. This study aims to provide a detailed understanding of the dynamics between low-light image enhancement and ITS performance, offering insights into both the technological advancements in LLIE and their practical implications in real-world conditions. The project GitHub repository can be accessed here.
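As an illustration of the kind of image-quality assessment such a survey relies on, the sketch below computes PSNR and SSIM between an enhanced frame and a reference frame with scikit-image; the synthetic data and grayscale simplification are assumptions, and the survey's actual metric suite and detection evaluation are not reproduced here.

```python
# Minimal sketch of image-quality assessment for LLIE outputs: PSNR and SSIM between an
# enhanced frame and a reference frame, using scikit-image on grayscale data in [0, 1].
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def assess(reference: np.ndarray, enhanced: np.ndarray) -> dict:
    return {
        "psnr": peak_signal_noise_ratio(reference, enhanced, data_range=1.0),
        "ssim": structural_similarity(reference, enhanced, data_range=1.0),
    }

# Synthetic example: a dark, noisy frame "enhanced" by simple gamma correction.
rng = np.random.default_rng(0)
reference = rng.random((256, 256))
dark = np.clip(reference * 0.2 + 0.01 * rng.standard_normal((256, 256)), 0, 1)
enhanced = np.clip(dark ** 0.4, 0, 1)          # crude stand-in for an LLIE model
print(assess(reference, enhanced))
```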