Crack detection is an important measure in the field of structural health monitoring. However, visual crack detection is labor-intensive, time-consuming, inefficient, and expensive. Although image-based detection and processing provides an efficient way to detect structural cracks, its accuracy depends on image quality. For engineering structures, especially bridges, changing light conditions and the differing surface characteristics of structural components pose a major challenge to traditional crack detection methods. In this paper, a novel crack detection method based on convolutional neural networks is proposed. The method is developed in the following stages: initial automated crack classification is carried out using MobileNetV3, the classified crack images are then accurately segmented at the semantic level by an improved DeepLabv3+ network, and finally real crack images are used for verification. To verify the proposed method, several conventional deep learning networks are trained and compared. The improved DeepLabv3+ integrates MobileNetV3 as its feature extraction backbone and incorporates the convolutional block attention module, achieving an average intersection over union of 87.79% and an average pixel accuracy of 93.87% on public and real data sets. Compared with traditional models such as VGG16, the proposed method shortens training time by more than 80% while maintaining high detection accuracy. In addition, its compact parameter configuration and moderate model size make it particularly suitable for deployment on mobile detection devices.
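The abstract does not include implementation details, but the backbone-plus-attention arrangement it describes (MobileNetV3 features refined by a convolutional block attention module before a DeepLabv3+-style decoder) can be sketched roughly as below. This is a minimal PyTorch sketch assuming torchvision's MobileNetV3-Large; the layer sizes, reduction ratio, and the point where CBAM is attached are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch (PyTorch), assuming torchvision's MobileNetV3-Large as the backbone.
# CBAM hyperparameters and placement are illustrative, not the paper's code.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_large

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over channel-wise average and max maps
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)                       # channel attention
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        x = x * torch.sigmoid(self.spatial(torch.cat([avg_map, max_map], 1)))  # spatial attention
        return x

class MobileNetV3CBAMBackbone(nn.Module):
    """MobileNetV3 features followed by CBAM, feeding a DeepLabv3+-style decoder (not shown)."""
    def __init__(self):
        super().__init__()
        self.features = mobilenet_v3_large(weights=None).features  # final stage has 960 channels
        self.cbam = CBAM(960)

    def forward(self, x):
        return self.cbam(self.features(x))

if __name__ == "__main__":
    feats = MobileNetV3CBAMBackbone()(torch.randn(1, 3, 512, 512))
    print(feats.shape)  # torch.Size([1, 960, 16, 16])
```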
This study addresses the critical need for effective mental stress monitoring, linked to severe health issues like depression and heart disease. We introduce a robust method using in-ear photoplethysmogram (PPG) signals for detecting and classifying stress levels. The objective of this study is to develop a precise stress monitoring technique using advanced signal processing and deep learning. Raw PPG data were collected from 15 subjects undergoing stress-inducing activities in a controlled setting. The data underwent preprocessing and were transformed into image-like time-frequency representations. We employed vision transformer (ViT) models for classification, which were fine-tuned and compared against other state-of-the-art deep learning models. The ViT classifier significantly outperformed existing models, achieving an average accuracy of 97.78% and an F1-score of 97.79%. While the dataset is relatively small, these results suggest a promising direction for stress monitoring by illustrating the potential of combining in-ear PPG signals with ViT models. The study indicates the efficacy of this novel approach for accurate mental stress diagnosis, which could have significant implications for mental health applications. Future work will focus on validating these findings with a larger sample size and exploring the integration of this technology into wearable devices for real-world stress monitoring.
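As a rough illustration of the pipeline described above (raw PPG window, image-like time-frequency representation, fine-tuned ViT classifier), the sketch below uses a short-time spectrogram and torchvision's ViT-B/16 as stand-ins. The sampling rate, window length, class count, and model choice are assumptions, not the study's actual settings.

```python
# Minimal sketch, assuming a fixed-length PPG window and torchvision's ViT-B/16.
# Sampling rate, window length, and class count are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from scipy.signal import spectrogram
from torchvision.models import vit_b_16

FS = 100          # assumed PPG sampling rate (Hz)
NUM_CLASSES = 3   # e.g. relaxed / moderate stress / high stress (assumed)

def ppg_to_image(ppg_window: np.ndarray, size: int = 224) -> torch.Tensor:
    """Turn a 1-D PPG segment into a 3-channel image-like time-frequency map."""
    _, _, sxx = spectrogram(ppg_window, fs=FS, nperseg=FS, noverlap=FS // 2)
    sxx = np.log1p(sxx)                                    # compress dynamic range
    sxx = (sxx - sxx.min()) / (sxx.max() - sxx.min() + 1e-8)
    img = torch.from_numpy(sxx).float()[None, None]        # (1, 1, F, T)
    img = torch.nn.functional.interpolate(img, size=(size, size), mode="bilinear")
    return img.repeat(1, 3, 1, 1)                          # replicate to 3 channels

model = vit_b_16(weights=None)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)  # new classifier head

window = np.random.randn(FS * 30)           # 30 s of synthetic PPG for illustration
logits = model(ppg_to_image(window))
print(logits.shape)                          # torch.Size([1, 3])
```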
Cloud-edge technology enables near-real-time optimization of production lines in group-distributed manufacturing systems. Offloading some tasks to the cloud and processing the remaining tasks on the edge side can improve the efficiency of production optimization. However, due to the complexity of the manufacturing environment and its various constraints, an effective offloading strategy is crucial for reducing computing delays and minimizing transmission overhead in large-scale optimization problems. This paper proposes a mixed-integer programming model and a deep reinforcement learning (DRL) framework based on a Transformer to address the cloud-edge offloading problem. The DRL framework consists of an encoder and a decoder, both designed using the Transformer architecture. Task offloading decisions are translated into two options: cloud offloading or edge retention. The encoder extracts relevant features for each option, and the decoder generates the probability of selecting each option based on the encoded information. Extensive computational experiments demonstrate the effectiveness of the proposed framework in solving the task offloading problem with time windows, achieving near-real-time optimization of production lines within competitive computational time.
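A heavily simplified sketch of such a Transformer-based offloading policy is shown below: per-task features are encoded by a Transformer encoder and a linear head (standing in for the paper's Transformer decoder) produces, for each task, the probability of cloud offloading versus edge retention. Feature dimensions, layer counts, and the policy-gradient usage hint are assumptions, not the paper's design.

```python
# Minimal sketch (PyTorch) of a Transformer-based offloading policy: per-task features are
# encoded and a head outputs the probability of "offload to cloud" vs "keep on edge".
import torch
import torch.nn as nn

class OffloadingPolicy(nn.Module):
    def __init__(self, task_feat_dim=8, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(task_feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 2)       # logits for {edge, cloud} per task

    def forward(self, tasks):                   # tasks: (batch, n_tasks, task_feat_dim)
        h = self.encoder(self.embed(tasks))
        return torch.distributions.Categorical(logits=self.head(h))

policy = OffloadingPolicy()
tasks = torch.randn(1, 10, 8)                   # 10 tasks with 8 features each (e.g. size, deadline)
dist = policy(tasks)
actions = dist.sample()                         # 0 = keep on edge, 1 = offload to cloud
log_prob = dist.log_prob(actions).sum()         # would feed a policy-gradient DRL update
print(actions)
```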
This paper explores the optimization of light field deconvolution, a key process in image processing that reconstructs a 3D object space or a 2D refocus plane from a light field. Despite the critical role of deconvolution in light field technology, existing methods are often slow, computationally intensive, and unsuitable for real-time processing. Existing algorithms such as the Richardson-Lucy approach, while groundbreaking, still suffer from performance limitations due to their iterative nature and high computational costs. Central to our approach is the strategic selection of influential pixels within the point-spread-function, reducing redundant computations by focusing only on pixels that contribute a significant portion of the point-spread-function's total intensity. In addition, we explore the potential to directly invert the image formation model, bypassing iterative computations and further accelerating the deconvolution process. Our findings reveal notable improvements in computational efficiency, with some of our methods achieving real-time performance. The reconstruction quality, measured using metrics such as the mean squared error, remained comparable to existing approaches, indicating a favorable balance between speed and reconstruction quality. (c) 2025 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.
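The pixel-selection idea can be illustrated with a small NumPy sketch: keep only the brightest point-spread-function entries whose cumulative intensity reaches a chosen fraction of the total, and zero the remainder before deconvolution. The 99% fraction and the synthetic Gaussian PSF are illustrative assumptions, not the paper's parameters.

```python
# Minimal sketch of the pixel-selection idea: keep only the PSF entries whose cumulative
# intensity reaches a chosen fraction of the total, zeroing the rest before deconvolution.
import numpy as np

def truncate_psf(psf: np.ndarray, energy_fraction: float = 0.99) -> np.ndarray:
    """Zero out low-intensity PSF pixels, keeping the smallest set covering `energy_fraction`."""
    flat = psf.ravel()
    order = np.argsort(flat)[::-1]                # brightest pixels first
    cumulative = np.cumsum(flat[order])
    keep = cumulative <= energy_fraction * flat.sum()
    keep[np.argmin(keep)] = True                  # include the pixel that crosses the threshold
    mask = np.zeros_like(flat, dtype=bool)
    mask[order[keep]] = True
    return np.where(mask.reshape(psf.shape), psf, 0.0)

# Example: a synthetic Gaussian PSF loses most of its near-zero tail pixels.
y, x = np.mgrid[-32:33, -32:33]
psf = np.exp(-(x**2 + y**2) / (2 * 5.0**2))
sparse_psf = truncate_psf(psf)
print(np.count_nonzero(psf), "->", np.count_nonzero(sparse_psf), "nonzero PSF pixels")
```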
In this paper, we discuss the classification of images captured by a machine camera while assembling components. To crop out specific points of interest, we employ image processing. Additionally, we utilize deep learning techniques, specifically convolutional neural networks, to identify the type of equipment being assembled. This approach allows us to determine and record specific parts within a device. The main challenge of this project is to achieve both high accuracy and the shortest possible prediction time.
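A minimal sketch of this kind of pipeline is shown below: a fixed point of interest is cropped from the camera frame and the crop is classified by a small convolutional network. The ROI coordinates, network architecture, and class count are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch: crop a point of interest from the camera frame, then classify the crop
# with a small CNN. ROI location and the network are illustrative assumptions.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.net(x)

def crop_roi(frame: torch.Tensor, y: int, x: int, h: int, w: int) -> torch.Tensor:
    """Crop a point of interest (y, x, h, w) from a (3, H, W) frame."""
    return frame[:, y:y + h, x:x + w]

frame = torch.rand(3, 480, 640)                     # stand-in for a machine-camera image
roi = crop_roi(frame, y=100, x=200, h=128, w=128)   # hypothetical fixture location
logits = SmallCNN()(roi.unsqueeze(0))
print(logits.argmax(dim=1))                          # predicted equipment type
```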
Speeded up robust feature (SURF) is one of the most popular feature-based algorithms for image matching. Compared to emerging deep learning-based image matching algorithms, SURF is much faster with comparable accuracy, and it remains one of the dominant algorithms adopted in the majority of real-time applications. With the increasing popularity of video-based computer vision applications, image matching between an image and different frames of a video stream is required. Traditional algorithms can fail on live video because spatiotemporal differences between frames cause significant fluctuation in the results. In this study, we propose a self-adaptive methodology to improve the stability and precision of image-video matching. The proposed methodology dynamically adjusts the threshold used in feature point extraction, controlling the number of extracted feature points based on the content of the previous frame. Minimum ratio of distance (MROD) matching is integrated to preclude false matches while keeping abundant sample sizes. Finally, multiple homography matrices (H-matrices) are estimated using progressive sample consensus (PROSAC) with various reprojection errors, and the model with the lowest mean squared error (MSE) is selected for image-to-video-frame matching. The experimental results show that the self-adaptive SURF offers more accurate and stable results while balancing single-frame processing time in image-video matching.
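A rough OpenCV sketch of the self-adaptive loop is given below: the SURF Hessian threshold is nudged each frame toward a target keypoint count, matches are filtered with a distance-ratio test, and homographies fitted at several reprojection thresholds are compared by reprojection MSE. It assumes opencv-contrib-python with SURF enabled and an OpenCV build that exposes cv2.USAC_PROSAC (4.5+; cv2.RANSAC is a fallback); the target count, gain, and thresholds are assumptions, not the paper's values.

```python
# Minimal sketch of the self-adaptive idea with OpenCV; requires opencv-contrib-python
# with SURF enabled. Target counts, gains, and reprojection errors are assumptions.
import cv2
import numpy as np

def adapt_threshold(threshold, n_keypoints, target=1000, gain=0.1):
    """Raise the Hessian threshold when too many keypoints were found, lower it when too few."""
    return max(50.0, threshold * (1.0 + gain * (n_keypoints - target) / target))

def match_frame(ref_img, frame, threshold):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=threshold)
    kp1, des1 = surf.detectAndCompute(ref_img, None)
    kp2, des2 = surf.detectAndCompute(frame, None)
    pairs = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    good = [p[0] for p in pairs if len(p) == 2 and p[0].distance < 0.7 * p[1].distance]  # ratio test
    if len(good) < 4:
        return None, adapt_threshold(threshold, len(kp2))
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    best_H, best_mse = None, np.inf
    for reproj in (1.0, 3.0, 5.0):                           # try several reprojection errors
        H, _ = cv2.findHomography(src, dst, cv2.USAC_PROSAC, reproj)
        if H is None:
            continue
        mse = float(np.mean((cv2.perspectiveTransform(src, H) - dst) ** 2))
        if mse < best_mse:
            best_H, best_mse = H, mse
    return best_H, adapt_threshold(threshold, len(kp2))
```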
Specular highlights pose a significant challenge in light field microscopy (LFM), leading to information loss and inaccurate observations, especially on reflective surfaces. Existing methods for specular highlight removal often suffer from high computational complexity, limited applicability, and extended processing times. To address these limitations, this paper introduces an adaptive hybrid illumination scheme that combines multiple polarized light sources with a deep learning-based control system to dynamically modulate illumination and eliminate specular highlights. This method effectively removes highlight reflections and provides uniform illumination without complex optical setups or mechanical components. By leveraging various polarization angles and precise electronic control through a neural network, the system dynamically adjusts lighting in real time, achieving uniform illumination and superior image quality. Experimental results show that the proposed method effectively eliminates specular highlights, significantly improves 3D reconstruction accuracy, and reduces processing time to less than 0.4 s (at least twice as fast as traditional approaches). This system offers a promising solution for applications requiring high-speed, high-precision imaging, such as biological analysis, industrial inspection, and materials research, providing an efficient and effective alternative for specular reflection removal in LFM.
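Because the paper's control system drives physical polarized light sources, any software sketch here is necessarily hypothetical. The one below only illustrates the control idea: summarize near-saturated (highlight) pixels on a coarse grid and let a small network map those statistics to per-source intensity settings. The thresholds, grid size, number of sources, and controller architecture are all assumptions.

```python
# Heavily hypothetical sketch of the control idea: highlight statistics from the current
# frame drive a small network that outputs intensities for the polarized light sources.
import torch
import torch.nn as nn

N_SOURCES = 4          # assumed number of polarized light sources
GRID = 4               # image summarized on a 4x4 grid of highlight ratios

def highlight_statistics(gray: torch.Tensor, thresh: float = 0.95) -> torch.Tensor:
    """Fraction of near-saturated pixels in each cell of a GRID x GRID partition."""
    mask = (gray > thresh).float()[None, None]                 # (1, 1, H, W)
    pooled = nn.functional.adaptive_avg_pool2d(mask, GRID)     # mean of a binary mask = ratio
    return pooled.flatten(1)                                   # (1, GRID*GRID)

controller = nn.Sequential(                                    # maps statistics -> source intensities
    nn.Linear(GRID * GRID, 32), nn.ReLU(),
    nn.Linear(32, N_SOURCES), nn.Sigmoid(),                    # intensities in [0, 1]
)

frame = torch.rand(480, 640)                                   # stand-in grayscale microscope frame
settings = controller(highlight_statistics(frame))
print(settings)                                                # values that would drive the hardware
```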
Recent years have seen increased interest in object detection-based applications for fire detection in digital images and videos from edge devices. The environment's complexity and variability often lead to interference from factors such as fire and smoke characteristics, background noise, and camera settings like angle, sharpness, and exposure, which hampers the effectiveness of fire detection applications. Limited image data for fire and smoke scenes further challenges model accuracy and robustness, resulting in high false detection and missed detection rates. To address the need for efficient detection and adaptability to various environments, this paper focuses on (1) proposing a cloud-edge collaborative architecture for real-time fire and smoke detection, incorporating an iterative transfer learning strategy based on user feedback to enhance adaptability; and (2) improving the detection capabilities of the base model YOLOv8 by enhancing the data augmentation method and introducing the coordinate attention mechanism to improve global feature extraction. The improved algorithm shows a 2-point accuracy increase. After three iterations of transfer learning in the production environment, accuracy improves from 93.3% to 96.4%, and mAP0.5:0.95 increases by nearly 5 points. This approach effectively addresses false detection issues in fire and smoke detection systems, demonstrating practical applicability.
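A minimal sketch of a coordinate attention block of the kind added to the YOLOv8 backbone is shown below; the reduction ratio, activation, and insertion point are assumptions, not the paper's exact configuration.

```python
# Minimal sketch of coordinate attention: channel attention encoded separately along the
# height and width directions, then applied multiplicatively to the feature map.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        pooled_h = x.mean(dim=3, keepdim=True)                       # (b, c, h, 1): pool along width
        pooled_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (b, c, w, 1): pool along height
        y = self.act(self.bn(self.conv1(torch.cat([pooled_h, pooled_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                        # (b, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))    # (b, c, 1, w)
        return x * a_h * a_w

feat = torch.randn(1, 256, 40, 40)                                   # a YOLO-style feature map
print(CoordinateAttention(256)(feat).shape)                           # torch.Size([1, 256, 40, 40])
```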
ISBN (Print): 9798350377040; 9798350377033
This study proposes a multimodal biomedical signal recognition algorithm that fuses image data and biosignal data to improve the accuracy and real-time performance of medical data analysis. The algorithm employs deep learning models to independently recognize the two data types and introduces a cross-attention mechanism to assess the correlation between image and biosignal features. The mechanism dynamically adjusts the attention weights to effectively suppress irrelevant features and promote the fusion of image and signal features. Specifically, the biosignal features are extracted by the temporal Kolmogorov-Arnold network (TKAN), while the image data are processed by the YOLOv11 network. These two feature vectors are aligned and fused through the cross-attention mechanism, and the fused feature vectors are finally classified through a fully connected layer to obtain the diagnosis results. This paper also explores preprocessing techniques for image and signal data, including methods for denoising, signal enhancement, data normalization, and image alignment. These preprocessing steps effectively improve the quality of the raw data and provide clearer and more accurate inputs for the subsequent deep learning models. Experimental verification shows that the proposed algorithm outperforms traditional methods in a simulated environment, especially in real-time data analysis and multimodal fusion accuracy.
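The fusion stage described above can be sketched as follows: biosignal features (a stand-in for TKAN output) attend to image features (a stand-in for YOLOv11 output) through cross-attention, and the fused representation is classified by a fully connected layer. Feature dimensions, head count, and the pooling scheme are illustrative assumptions, not the paper's design.

```python
# Minimal sketch of cross-attention fusion between biosignal and image feature sequences.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim=128, num_classes=5, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, signal_feats, image_feats):
        # signal_feats: (batch, T, dim) from the biosignal branch; image_feats: (batch, N, dim)
        fused, weights = self.attn(query=signal_feats, key=image_feats, value=image_feats)
        pooled = torch.cat([fused.mean(dim=1), image_feats.mean(dim=1)], dim=-1)
        return self.classifier(pooled), weights   # weights indicate which image regions mattered

signal_feats = torch.randn(2, 50, 128)   # e.g. 50 time steps of biosignal features
image_feats = torch.randn(2, 49, 128)    # e.g. a 7x7 grid of image features
logits, attn_weights = CrossAttentionFusion()(signal_feats, image_feats)
print(logits.shape)                       # torch.Size([2, 5])
```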
ISBN (Print): 9781510673854; 9781510673847
Images and videos captured in poor illumination conditions are degraded by low brightness, reduced contrast, color distortion, and noise, rendering them barely discernible for human perception and ultimately degrading computer vision system performance. These challenges are exacerbated when processing video surveillance camera footage, where unprocessed video data is used as-is for real-time computer vision tasks across varying environmental conditions within Intelligent Transportation Systems (ITS), such as vehicle detection, tracking, and timely incident detection. The inadequate performance of these algorithms in real-world deployments incurs significant operational costs. Low-light image enhancement (LLIE) aims to improve the quality of images captured in these unideal conditions. Groundbreaking advancements in LLIE have been achieved using deep learning techniques; however, the resulting models and approaches are varied and disparate. This paper presents an exhaustive survey with a methodical taxonomy of state-of-the-art deep learning-based LLIE algorithms and their impact when used in tandem with other computer vision algorithms, particularly detection algorithms. To thoroughly evaluate these LLIE models, a subset of the BDD100K dataset, a diverse real-world driving dataset, is used with suitable image quality assessment and evaluation metrics. This study aims to provide a detailed understanding of the dynamics between low-light image enhancement and ITS performance, offering insights into both the technological advancements in LLIE and their practical implications in real-world conditions. The project GitHub repository can be accessed here.
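As an illustration of the kind of image-quality assessment such a survey relies on, the sketch below computes PSNR and SSIM between an enhanced frame and a reference frame with scikit-image; the synthetic data and grayscale simplification are assumptions, and the survey's actual metric suite and detection evaluation are not reproduced here.

```python
# Minimal sketch of image-quality assessment for LLIE outputs: PSNR and SSIM between an
# enhanced frame and a reference frame, using scikit-image on grayscale data in [0, 1].
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def assess(reference: np.ndarray, enhanced: np.ndarray) -> dict:
    return {
        "psnr": peak_signal_noise_ratio(reference, enhanced, data_range=1.0),
        "ssim": structural_similarity(reference, enhanced, data_range=1.0),
    }

# Synthetic example: a dark, noisy frame "enhanced" by simple gamma correction.
rng = np.random.default_rng(0)
reference = rng.random((256, 256))
dark = np.clip(reference * 0.2 + 0.01 * rng.standard_normal((256, 256)), 0, 1)
enhanced = np.clip(dark ** 0.4, 0, 1)          # crude stand-in for an LLIE model
print(assess(reference, enhanced))
```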