ISBN: (print) 9783031234729; 9783031234736
In this paper, we propose an enhanced prototype based on a regional many-to-many attention mechanism for few-shot object detection of forbidden objects such as knives and sticks. First, we use the original prototype to capture the invariant features of images. Then, we use the enhanced prototype to weight the support features for different query images of knives and sticks to avoid over-fitting. Finally, we use a joint regional consistency loss function to balance and maximize the consistency between the enhanced prototype and the original prototype, which facilitates online learning of invariant object features and improves the efficiency of object detection. Experimental results show that, compared with state-of-the-art methods, the enhanced prototype detects knives and sticks effectively. Our method achieves significant improvements in both visual and quantitative evaluation metrics.
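The abstract does not define the prototypes or the loss. As a rough illustration only, the sketch below assumes that the original prototype is the mean of the support feature vectors, that the enhanced prototype is a softmax-weighted mean driven by similarity to a query feature, and that consistency is measured as one minus cosine similarity. All three choices, and all function names, are assumptions made for illustration and are not the paper's definitions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_prototype(features):
    """Assumed 'original' prototype: element-wise mean of support features."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


def enhanced_prototype(features, query):
    """Assumed 'enhanced' prototype: support features weighted by their
    softmax-normalized dot-product similarity to the query feature."""
    weights = softmax([dot(f, query) for f in features])
    dim = len(query)
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


def consistency_loss(p, q):
    """Assumed consistency term: 1 - cosine similarity (0 when aligned)."""
    cos = dot(p, q) / (math.sqrt(dot(p, p)) * math.sqrt(dot(q, q)))
    return 1.0 - cos


support = [[1.0, 0.0], [0.0, 1.0]]   # two hypothetical support features
query = [1.0, 0.0]                   # one hypothetical query feature
original = mean_prototype(support)
enhanced = enhanced_prototype(support, query)
loss = consistency_loss(original, enhanced)
```

Minimizing such a term pulls the query-conditioned prototype toward the query-independent one, which is one plausible reading of "maximize the consistency" in the abstract.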
ISBN: (print) 9798331540845; 9789887581598
Deep learning-based intelligent waste detection is appealing and promising for resource conservation and environmental preservation. However, collecting and labeling numerous samples to train waste detection models is time-consuming and labor-intensive. This paper proposes a few-shot waste detection model based on a dual attention mechanism and Dynamic Hard Sample (DHS) triplet loss, named DHS-FSOD, which can effectively recognize and locate objects of a new waste category after fine-tuning on only a few annotated samples. First, the DHS-FSOD model uses deformable convolution in the feature extraction network to improve detection performance for objects of the same category with different morphology. Then, a dual attention module is designed to help the RPN improve the quality of proposals and reduce the misclassification rate by filtering out feature information in the query sample that is irrelevant to the object category. Furthermore, the DHS triplet loss is proposed to improve the model's ability to distinguish wastes with similar appearances but different classes. The effectiveness of DHS-FSOD was verified by ablation and comparison experiments on the MS COCO and Huawei waste datasets.
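The abstract does not spell out how the DHS triplet loss selects its samples. As a stand-in, the sketch below shows the standard batch-hard triplet loss, in which each anchor is paired with its farthest same-class sample and its nearest different-class sample. This is a swapped-in, well-known variant rather than the paper's DHS formulation; the function names and the margin value are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.5):
    """Batch-hard triplet loss (not the paper's DHS loss).

    For every anchor, take the hardest positive (farthest same-class
    sample) and the hardest negative (closest other-class sample), then
    apply the hinge max(0, d_pos - d_neg + margin). Anchors without a
    positive or a negative in the batch are skipped.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [euclidean(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == label and j != i]
        neg = [euclidean(anchor, e)
               for e, l in zip(embeddings, labels) if l != label]
        if not pos or not neg:
            continue
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical embeddings: two well-separated waste classes give zero loss.
easy = batch_hard_triplet_loss(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [0.0, 4.0]],
    ["bottle", "bottle", "can", "can"],
)

# A "can" embedding sitting between two "bottle" embeddings is penalized.
hard = batch_hard_triplet_loss(
    [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0]],
    ["bottle", "bottle", "can"],
)
```

The second case mirrors the situation the abstract targets: samples that look alike but belong to different classes produce a positive loss until their embeddings are pushed apart.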
ISBN: (print) 9798350379860; 9798350379877
The dearth of labeled Tangut ancient book pages severely hampers the development of accurate text detection models. To mitigate this issue, we introduce a lightweight few-shot object detection model, Tangut-YOLOv8. Drawing inspiration from the Two-stage Fine-tuning Approach (TFA), we employ a similar strategy. The first stage uses a large corpus of Chinese ancient book pages to give the model general text detection capabilities. In the subsequent fine-tuning stage, we fine-tune the model with a metric component to adapt it to the specific characteristics of the Tangut script. Experimental results show that our model outperforms several few-shot object detection models on the Tangut ancient book page dataset. This approach offers an innovative solution for text detection in ancient book pages, facilitating the preservation and scholarly analysis of this invaluable cultural heritage.
Deep learning methods have attained promising performance on road defect detection from on-board cameras. However, they often rely heavily on well-annotated datasets with sufficient samples, which limits their practical application when only a few labeled samples are available. To fill this gap, this paper proposes a framework based on Faster Region-Convolutional Neural Network (Faster R-CNN) for road defect detection with scarce and cross-domain data. First, a defect weighting branch is developed to enable Faster R-CNN to quickly learn to detect road defects from few annotated data. Second, a data augmentation method is proposed to enlarge the amount of annotated data and alleviate the cross-domain issue. Experimental results demonstrate that the proposed framework outperforms a state-of-the-art few-shot detector, improving mean average precision by 1.83% when only limited samples (i.e., 30 images per category) are provided for training. In the future, the proposed framework could also be extended to other detection tasks with limited data (e.g., construction vehicle detection), reducing the effort and time required for arduous data collection and annotation.
For ore particle size detection, obtaining a sizable amount of high-quality labeled ore data is time-consuming and expensive. General object detection methods often suffer from severe over-fitting with scarce labeled data. Despite their ability to mitigate over-fitting, existing few-shot object detectors encounter drawbacks such as slow detection speed and high memory requirements, making them difficult to use in real-world deployment. To this end, we propose a lightweight and effective few-shot detector that achieves performance competitive with general object detection using only a few samples of ore images. First, the proposed support feature mining block characterizes the importance of location information in support features. Next, the relationship guidance block makes full use of support features to guide the generation of accurate candidate proposals. Finally, the dual-scale semantic aggregation module retrieves detailed features at different resolutions to contribute to the prediction process. Experimental results show that our method consistently outperforms existing few-shot detectors by a considerable margin on all metrics. Moreover, our method achieves the smallest model size, 19 MB, while remaining competitive with general object detectors at a detection speed of 50 FPS. © 2023 Elsevier Ltd. All rights reserved.
As a challenging problem in industrial scenarios, few-shot steel plate surface defect detection aims to detect novel classes given only a few defect samples. Most existing few-shot object detection (FSOD) methods cannot accurately detect complex and diverse steel plate surface defects, especially when the defects share a similar appearance. To solve this problem, we propose a novel meta-learning based few-shot detection method with multi-relation aggregation and an adaptive support learning strategy. Our method follows the training paradigm of dual-branch meta-learning and exploits the implicit relationships between query and support images. More specifically, we design a Multi-Relational Aggregation (MRA) module to aggregate query and support features from three different perspectives: the attention relation, the depth-wise convolution relation, and the contrastive relation. The MRA module guides the subsequent classification and regression by mining the commonalities and differences between the query and support images in a category-independent manner. In addition, we propose an Adaptive Support Learning (ASL) module to dynamically adjust the weights of support representations during learning. We evaluate our method on three datasets: steel plate surface defect (F-SSD), NEU-DET, and TianChi Aluminium profile surface defect (F-TCAL). Thorough experiments demonstrate that our model outperforms existing state-of-the-art methods by a large margin in multiple settings. Our work provides a promising direction for the field of few-shot defect detection and can be generalized to other industrial scenes.
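Of the three relations named above, the depth-wise convolution relation is the easiest to illustrate. The sketch below shows a generic depth-wise cross-correlation, in which each channel of a support feature acts as a kernel slid over the matching channel of the query feature map. How the MRA module actually builds or combines this relation is not stated in the abstract, so the function name, the nested-list layout, and the "valid" sliding window are assumptions.

```python
def depthwise_xcorr(query, support):
    """Depth-wise cross-correlation between a query map and a support kernel.

    query:   C x H x W nested lists (hypothetical query feature map)
    support: C x h x w nested lists (hypothetical support feature used as
             a per-channel kernel, with h <= H and w <= W)
    returns: C x (H-h+1) x (W-w+1) response map; channels are never mixed.
    """
    out = []
    for q_ch, s_ch in zip(query, support):
        H, W = len(q_ch), len(q_ch[0])
        h, w = len(s_ch), len(s_ch[0])
        ch_out = []
        for i in range(H - h + 1):
            row = []
            for j in range(W - w + 1):
                # Sum of element-wise products over the h x w window.
                row.append(sum(q_ch[i + di][j + dj] * s_ch[di][dj]
                               for di in range(h) for dj in range(w)))
            ch_out.append(row)
        out.append(ch_out)
    return out


# Two-channel toy example with 3x3 query maps and 2x2 support kernels.
query_map = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
]
support_kernel = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]
response = depthwise_xcorr(query_map, support_kernel)
```

Because each channel is correlated independently, the response keeps one map per channel, which a later layer could aggregate with the other relations.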
Automatic waste detection in natural environments has great potential to improve the efficiency and reduce the labor cost of waste management. Recent deep learning-based waste detectors rely heavily on substantial annotated samples for training, but annotating sufficient samples for various categories of waste is labor-intensive and time-consuming. To address this issue, this paper simulates the human visual system and develops a few-shot waste detection framework. To make the proposed framework more suitable for waste detection, a waste proposal module based on comprehensive feature fusion is designed to allow the features of support images to fully interact with those of query images, guiding the framework to generate more potential region proposals containing waste. In addition, a waste classification module using a soft attention mechanism and a foreground mask is designed to alleviate spatial misalignment and achieve fine-grained classification of waste-related proposals. The proposed framework is a general detection framework that can flexibly detect various categories of waste with few labeled samples (i.e., fewer than 30 instances per category). Experimental results show that the proposed framework achieves a mean average precision of 31.16% over 12 waste categories when only a few samples (i.e., 30 instances per category) are provided, surpassing a state-of-the-art few-shot detector named AFDNet by 1.68%. This insensitivity to data scale reduces the effort and time required for laborious waste image collection and annotation, significantly increasing the flexibility of automatic waste detection and boosting the efficiency of waste management.
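Several of the abstracts above report results as mean average precision. The sketch below shows one common way to compute it: for each category, detections are ranked by confidence, precision is made monotonically non-increasing, and the area under the precision-recall curve is accumulated; the mean over categories gives mAP. It assumes that detections have already been matched to ground truth at some IoU threshold. The exact evaluation protocol used in each paper is not stated in these abstracts, and the function names are hypothetical.

```python
def average_precision(scored_hits, num_gt):
    """Area under the precision-recall curve for one category.

    scored_hits: list of (confidence, is_true_positive) pairs, one per
                 detection, already matched against ground truth.
    num_gt:      number of ground-truth boxes in this category.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Precision envelope: each point takes the best precision to its right.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP: unweighted mean of the per-category AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Hypothetical category with two ground-truth boxes and three detections.
ap_example = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_gt=2)
map_example = mean_average_precision([1.0, 0.5])
```

In this toy case the false positive ranked second lowers precision at full recall to 2/3, so the AP comes out at 5/6 rather than 1.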