Object detection in complex degraded scenarios poses a significant challenge in real-world applications, primarily because image degradation destroys high-frequency information, which hinders spatial localization and texture feature capture. To address these issues, we propose an effective method, dynamic high-/low-frequency knowledge distillation for object detection (DDOD). Specifically, a cross-task distillation strategy is presented to facilitate the restoration of high-frequency information from the blind super-resolution model, enabling the model to more effectively capture subtle textures and improve performance. Furthermore, to optimize feature utilization, a novel dynamic modulated distillation mechanism is designed to adaptively integrate multi-frequency features. Moreover, a low-frequency distillation is introduced to reuse the low-frequency features from both the detection and super-resolution models, promoting accurate target object localization. Experimental evaluations on the VOC and COCO datasets demonstrate that DDOD offers enhanced versatility, achieving superior performance in extreme scenarios and outperforming current state-of-the-art methods. Notably, DDOD incurs no additional computational overhead during inference, preserving parameter efficiency while seamlessly integrating with various detection frameworks.
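As a rough illustration of the high-/low-frequency split described above, the sketch below separates a 1-D signal into a smoothed (low-frequency) part and a residual (high-frequency) part, then compares a student and a teacher signal band by band. The moving-average filter, the function names, and the mean-squared-error comparison are assumptions chosen for illustration; the abstract does not state how DDOD decomposes or matches features.

```python
def split_frequencies(signal, window=3):
    """Split a 1-D signal into low- and high-frequency parts.

    Low frequency: centered moving average (the window shrinks at the edges).
    High frequency: the residual, signal minus low.
    """
    half = window // 2
    low = []
    for i in range(len(signal)):
        start = max(0, i - half)
        end = min(len(signal), i + half + 1)
        neighborhood = signal[start:end]
        low.append(sum(neighborhood) / len(neighborhood))
    high = [s - l for s, l in zip(signal, low)]
    return low, high


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def band_distillation_losses(student, teacher, window=3):
    """Return (low-band loss, high-band loss) between student and teacher."""
    s_low, s_high = split_frequencies(student, window)
    t_low, t_high = split_frequencies(teacher, window)
    return mse(s_low, t_low), mse(s_high, t_high)
```

Keeping the two bands separate lets each loss be weighted on its own, which is the general idea behind distilling texture detail and coarse localization cues separately.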
Object detection algorithms struggle with several challenges in low-light conditions, including blurry and dim targets, unclear imaging, and significant loss of detail. These challenges often result in false positives, missed detections, and inaccurate localization. This paper introduces a novel method, DBS-YOLOv8, for enhancing object detection in such environments. First, lightweight deformable convolutions are introduced into the backbone feature extraction network to predict sampling offsets and modulation scales, enhancing the network's feature extraction capabilities under complex low-light backgrounds. Then, the spatial pyramid pooling structure of the backbone network is improved to retain feature information while increasing the receptive field, thus improving the model's computational efficiency and accuracy. Furthermore, the BiFormerBlock module is added to the neck feature fusion network, enhancing the detail processing capability for low-light images. This adjustment allows the network to better focus on regions of interest, reducing the probability of missed or misjudged targets due to insufficient light. Finally, the bounding box regression loss function is optimized to improve the average accuracy of detecting multiple overlapping objects. Experimental results show that the proposed algorithm improves mAP50 by 3.7%, mAP50-95 by 6.7%, precision by 2.4%, and recall by 4.1% compared to the YOLOv8s algorithm.
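The mAP50 and mAP50-95 figures quoted here rest on intersection over union (IoU) between predicted and ground-truth boxes: mAP50 counts a prediction as correct at an IoU of at least 0.5, while mAP50-95 averages over thresholds from 0.5 to 0.95 in steps of 0.05. A minimal sketch of the underlying overlap test follows; the function names and the corner-coordinate box format are assumptions, and this is not the modified regression loss used in DBS-YOLOv8.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred_box, gt_box) >= threshold
```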
Currently, mural object detection depends heavily on traditional manual inspection, which is inefficient and risks damaging the frescoes. Therefore, we propose an enhanced mural image detection algorithm, Brg-YOLO, based on YOLOv8, to achieve efficient, non-contact automatic detection. First, we enhance detection across scales and complex scenes by incorporating a bidirectional feature pyramid network (BiFPN) in the neck, enabling efficient multi-scale feature reuse and improved feature fusion. In addition, we embed the residual squeezing-and-excitation (RSE) attention module in the backbone to mitigate the feature aliasing effect. Finally, with the Ghost+RSE Bottleneck design in the neck, we obtain a lightweight model that maintains detection quality while reducing the number of parameters. The experimental results show that the model achieves 84.6% mAP@0.5 and 47.8% mAP@0.5:0.95 in the mural object detection task, which far exceeds similar methods. This study provides new perspectives and tools for mural conservation and research, achieves efficient and accurate mural detection through non-contact automatic methods, and creates a new paradigm for mural heritage conservation.
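For readers unfamiliar with BiFPN, its characteristic step is a weighted fusion of feature maps in which each input carries a learnable, non-negative weight. The sketch below shows that "fast normalized fusion" rule on flat lists of numbers. Whether Brg-YOLO uses this rule unchanged is an assumption, and the function name and toy inputs are illustrative only.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equal-length feature vectors with normalized non-negative weights.

    features: list of equal-length lists of floats (already resized to match).
    weights:  one scalar per feature; negative values are clipped to zero.
    """
    clipped = [max(0.0, w) for w in weights]  # ReLU keeps weights non-negative
    total = sum(clipped) + eps                # eps avoids division by zero
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]
```

Because the weights are normalized by their sum rather than by a softmax, the fusion stays cheap while still letting the network learn how much each scale contributes.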
Object detection plays a crucial role in various applications, including surveillance, autonomous driving, and industrial automation, where accurate and timely identification of objects is essential. This research proposes a novel framework that combines the YOLOv8 backbone network with an attention mechanism and a Transformer-based detection head, significantly enhancing object detection performance in real-time images and video. The incorporation of attention mechanisms refines feature extraction from complex scenes, enabling the model to focus on relevant regions within images. By integrating a Transformer architecture, the model leverages long-range dependencies and global context, leading to more accurate bounding box predictions. The proposed system effectively processes real-time data, demonstrating superior classification performance with precision reaching 96.78% and recall reaching 96.89%. The mean average precision (mAP) is 89.67%, showcasing the framework's robustness across various practical scenarios. The framework is developed to address challenges in object detection, such as detecting multiple objects in crowded environments and under varying lighting conditions. The proposed model is implemented in Python. The results section assesses the Attention Transformer-YOLOv8 model against established algorithms like Faster R-CNN, YOLOv3, YOLOv5n, and SSD, utilizing metrics.
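Precision and recall, as reported above, are ratios over counts of true positives, false positives, and false negatives. A minimal sketch of the standard definitions is given below; the function name and the example counts in the usage note are hypothetical and are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from detection counts.

    tp: predictions matched to a ground-truth object
    fp: predictions with no matching ground-truth object
    fn: ground-truth objects that no prediction matched
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For instance, 90 matched detections with 10 spurious ones and 30 missed objects give a precision of 0.9 and a recall of 0.75.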
This study identifies and analyzes issues within the management system of the free pickup service for waste home appliances and seeks to enhance the system by using an object detection model. To overcome the limitations of manually inspecting approximately 5,000 collections per day, the YOLOv8 model was implemented. Photos for proof of collection, which were difficult to verify visually, were excluded from the image data. Labeling was performed on items defined by waste throughput, resulting in a dataset of 19,101 images. The initial training model achieved performance metrics of 0.950 mAP50 and 0.888 mAP50-95. The detection process for 11,003 images, including saving a summary file, took 7 min and 32.8 s. This method allows the system to automatically identify discrepancies between collection managers' registered data and the actual items collected. Additionally, active learning methods are proposed as a future enhancement strategy for the model. To improve future model performance, some data samples could be selected based on uncertainty by assigning weights to address class imbalance. The model generates bounding boxes for its predictions, and human annotators can verify these results, thereby reducing the cost of manual labeling. This approach can contribute to improving learning efficiency for any new data added later. Experimental results demonstrate that this method effectively resolves class imbalance issues and improves model performance through uncertainty sampling. It is expected that this approach will improve the existing manual inspection systems and maximize the efficiency of the future management process.
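The uncertainty sampling with class weights proposed above can be sketched as a ranking step: score each prediction by how unsure the model is, scale that score up for under-represented classes, and send the top-ranked images to human annotators. The least-confidence score, the function name, and the class names and weights in the usage note are assumptions; the abstract does not specify the exact scoring rule.

```python
def select_for_labeling(predictions, class_weights, budget):
    """Pick the images most worth sending to human annotators.

    predictions:   list of (image_id, predicted_class, confidence) tuples
    class_weights: dict mapping class name to a weight (default 1.0);
                   rare classes get larger weights
    budget:        number of images to return
    """
    scored = []
    for image_id, cls, conf in predictions:
        uncertainty = 1.0 - conf  # least-confidence score
        scored.append((uncertainty * class_weights.get(cls, 1.0), image_id))
    # Highest weighted uncertainty first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [image_id for _, image_id in scored[:budget]]
```

With hypothetical predictions `("a", "fridge", 0.9)`, `("b", "tv", 0.6)`, and `("c", "fan", 0.8)`, a weight of 3.0 on the rare class `"fan"` moves image `"c"` ahead of `"b"` even though its raw confidence is higher.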
In this paper, the Efficient Channel Attention (ECA) mechanism is incorporated at the terminal layer of the YOLOv10 backbone network to enhance the feature expression capability. In addition, a Transformer is introduced into the C3 module in the feature extraction process to construct the C3TR module, which replaces the original C2F module as the deeper feature extraction module. In this study, both the ECA mechanism and the self-attention mechanism of the Transformer are thoroughly analyzed and integrated into YOLOv10. The C3TR module serves as a key component for deepening feature extraction in the backbone network. The self-attention mechanism is used to model long-distance dependencies, capture global contextual information, compensate for the limited local receptive field, and enhance the feature expression capability. The ECA module is added to the end of the backbone to globally model the channels of the feature map, distribute channel weights more equitably, and enhance feature expression capability. Extensive experiments on the electrical equipment dataset demonstrate the high accuracy of the method, with an mAP of 89.4%, an improvement of 3.2% over the original model. Additionally, the mAP@[0.5, 0.95] reaches 61.8%, which is 5.2% higher than that of the original model.
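The ECA step described above can be sketched in three stages: average each channel to a single number, run a small 1-D convolution across neighboring channels, and use a sigmoid of the result to rescale each channel. The sketch below uses nested lists in place of tensors; the function name and the caller-supplied kernel are assumptions (in a trained network the kernel weights are learned), and it says nothing about how the paper wires ECA into YOLOv10.

```python
import math


def eca(feature_map, kernel):
    """Apply Efficient Channel Attention to a small feature map.

    feature_map: list of C channels, each a list of rows (H x W floats)
    kernel:      odd-length list of 1-D convolution weights shared by all channels
    Returns (rescaled feature map, per-channel gates).
    """
    # 1. Global average pooling: one scalar per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # 2. 1-D convolution across the channel axis with zero padding.
    half = len(kernel) // 2
    padded = [0.0] * half + pooled + [0.0] * half
    gates = []
    for c in range(len(pooled)):
        z = sum(k * padded[c + j] for j, k in enumerate(kernel))
        gates.append(1.0 / (1.0 + math.exp(-z)))  # 3. sigmoid gate in (0, 1)
    # 4. Rescale every value in a channel by that channel's gate.
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates
```

Since the convolution only looks at a few neighboring channels, the attention adds very few parameters compared with a fully connected squeeze-and-excitation block.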
Road surface damage detection is crucial to highway maintenance and traffic safety. However, existing detection methods generally suffer from insufficient generalization capability, poor detection of tiny damage, and difficulty balancing detection accuracy and computational cost. This study proposes a novel road surface damage object detection model (RSDD) to address these challenges. First, a backbone tailored to road surface damage feature extraction is designed to solve the problems of feature loss and insufficient extraction of tiny damage. Second, to achieve efficient feature fusion, multiple attention mechanisms are introduced to optimize features at different stages. Then, a bi-directional feature fusion path is proposed to exchange information between features of different stages, and an enhanced feature pyramid is constructed. Finally, a multi-scale decoupled detection head is adopted to accurately detect damage of different sizes. Additionally, this study built a road dataset containing rich samples of tiny damage. Extensive comparative experiments are conducted on the collected dataset and a public dataset to validate the generalization performance of RSDD. The experimental results show that RSDD has significant advantages in tiny damage detection while offering an excellent trade-off among accuracy, scale, and speed. Specifically, the model achieves 70.8% and 61.2% mAP50 on the two datasets with an inference latency of only 4.5 ms and 16.5 M parameters. Compared with YOLOv8s, which has a similar number of parameters, RSDD improves detection accuracy by 5.5% and 3.3%, respectively, and speeds up inference by 0.6 ms.
In the realm of visual simultaneous localization and mapping (SLAM), the conventional assumption of a static environment presents notable challenges for achieving precision in dynamic settings. Moreover, the reliance ...
Turnouts and switch machines play a crucial role in facilitating train line operations and establishing routes, making them vital for ensuring the safety and efficiency of railway transportation. Through the gap detection system of switch machines, the real-time working status of turnouts and switch machines on railway sites can be quickly known. However, due to the challenging working environment and demanding conversion tasks of switch machines, the current gap detection system often produces faulty detections. To address this, this study proposes an automatic gap detection method for railway switch machines based on object detection and combination clustering. First, a lightweight object detection network, specifically the MobileNetV3-YOLOv5s model, is used to accurately locate and extract the focal area. Subsequently, the extracted image undergoes preprocessing and is then fed into a combination clustering algorithm to achieve precise segmentation of the gap area and background; the algorithm consists of simple linear iterative clustering, Canopy, and kernel fuzzy c-means clustering. Finally, the Fisher optimal segmentation criterion is utilized to divide the data sequence of pixel values, determine the classification nodes, and calculate the gap size. The experimental results obtained from switch machine gap images captured in various scenes demonstrate that the proposed method is capable of accurately locating focal areas, efficiently completing gap image segmentation with a segmentation accuracy of 93.55%, and swiftly calculating the gap size with a correct rate of 98.57%. Notably, the method achieves precise detection of gap sizes even after slight deflection of the acquisition camera, aligning it more closely with the actual conditions encountered on railway sites.
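The Fisher optimal segmentation criterion mentioned above partitions an ordered sequence so that the values inside each segment are as similar as possible, measured by the within-segment sum of squared deviations. The sketch below handles the simplest case, a single cut into two segments, by exhaustive search. The two-segment assumption, the function names, and the toy pixel values in the usage note are illustrative; the abstract does not state how many classification nodes the method uses.

```python
def within_ss(values):
    """Sum of squared deviations of a segment from its own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_two_way_split(sequence):
    """Find the cut index k that minimizes total within-segment scatter.

    The sequence is kept in its given order and split into
    sequence[:k] and sequence[k:]. Returns (k, cost).
    """
    best_k, best_cost = None, float("inf")
    for k in range(1, len(sequence)):
        cost = within_ss(sequence[:k]) + within_ss(sequence[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost
```

On a hypothetical row of pixel values such as `[10, 12, 11, 200, 205, 198]`, the cut lands at index 3, the boundary between the dark and bright runs. Splitting into more than two segments is usually done with dynamic programming over the same cost.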
In the perception of unmanned systems, small object detection faces numerous challenges, including small size, low resolution, dense distribution, and occlusion, leading to suboptimal perception performance. To address these issues, we propose a specialized algorithm named Unmanned-system Small-object detection-You Only Look Once (USD-YOLO). First, we designed an innovative module called the Anchor-Free Precision Enhancer to achieve more accurate bounding box overlap measurements and provide a smarter processing mechanism, thereby improving the localization accuracy of candidate boxes for small and densely distributed objects. Second, we introduced the Spatial and Channel Reconstruction Convolution module to reduce redundancy in spatial and channel features while extracting key features of small objects. Additionally, we designed a novel C2f-Global Attention Mechanism module to expand the receptive field and capture more contextual information, optimizing the detection head's ability to handle small and low-resolution objects. We conducted extensive experimental comparisons with state-of-the-art models on three mainstream unmanned system datasets and a real unmanned ground vehicle. The experimental results demonstrate that USD-YOLO achieves higher detection precision and faster speed. On the Citypersons dataset, compared with the baseline, USD-YOLO improves mAP50-95, mAP50, and Recall by 8.5%, 5.9%, and 2.3%, respectively. Additionally, on the Flow-Img and DOTA-v1.0 datasets, USD-YOLO improves mAP50-95 by 2.5% and 2.5%, respectively.