ISBN: (Print) 9781510674110; 9781510674103
Computer vision systems, such as object detectors, traditionally rely on supervised learning over predetermined categories, an approach that faces limitations when applied to infrared images due to dataset constraints. Emerging contrastive vision-language models, such as CLIP (Contrastive Language-Image Pre-Training), offer a transformative alternative: pre-training on extensive image-text pairs yields diverse visual representations integrated with language semantics. Our work proposes a novel zero-shot object detection approach for infrared images by extending the benefits of CLIP into this domain. We develop a two-stage system for detecting humans in infrared images: the first stage generates region proposals with a YOLO (You Only Look Once) object detector, and the second stage classifies them with CLIP. Compared with a YOLO model fine-tuned on infrared images, the proposed system demonstrates comparable performance, illustrating its efficacy as a zero-shot object detection approach. This method opens new avenues for infrared image processing that leverage the capabilities of foundation models.
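As a concrete illustration of the two-stage pipeline this abstract describes, the sketch below chains a pretrained YOLO proposer with CLIP classification of the proposed crops. The `ultralytics` and `clip` packages, the `yolov8n.pt` checkpoint, the prompt wording, and the score threshold are our assumptions for illustration; the paper does not publish this code.

```python
# Hypothetical sketch of the two-stage zero-shot pipeline: YOLO proposes
# regions, CLIP scores each crop against natural-language prompts.
import torch
import clip                      # openai/CLIP
from PIL import Image
from ultralytics import YOLO

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stage 1: region proposals from a pretrained YOLO model (assumed checkpoint).
proposer = YOLO("yolov8n.pt")

# Stage 2: CLIP compares each crop with class prompts (illustrative wording).
clip_model, preprocess = clip.load("ViT-B/32", device=device)
prompts = ["a thermal infrared image of a person",
           "a thermal infrared image of the background"]
with torch.no_grad():
    text_feat = clip_model.encode_text(clip.tokenize(prompts).to(device))
    text_feat /= text_feat.norm(dim=-1, keepdim=True)

def detect_humans(path, score_thresh=0.5):
    """Return (box, score) pairs for crops CLIP assigns to the person prompt."""
    image = Image.open(path).convert("RGB")
    detections = []
    for box in proposer(image)[0].boxes.xyxy.tolist():
        crop = preprocess(image.crop(tuple(box))).unsqueeze(0).to(device)
        with torch.no_grad():
            img_feat = clip_model.encode_image(crop)
            img_feat /= img_feat.norm(dim=-1, keepdim=True)
            probs = (100.0 * img_feat @ text_feat.T).softmax(dim=-1)
        if probs[0, 0].item() > score_thresh:   # "person" prompt wins
            detections.append((box, probs[0, 0].item()))
    return detections
```

Because CLIP only re-scores proposals, the detector itself never needs infrared labels; only the prompts encode the target class.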
The development of deep learning models for intelligent vehicles relies on large quantities of reliable data, among which large-scale and accurately labeled traffic-scene image data is conducive to promoting the research ...
Existing cross-domain classification and detection methods usually apply a consistency constraint between a target sample and its self-augmentation for unsupervised learning, without considering the essential source knowledge. In this paper, we propose a Source-guided Target Feature Reconstruction (STFR) module for cross-domain visual tasks, which applies source visual words to reconstruct the target features. Since the reconstructed target features contain source knowledge, they can serve as a bridge connecting the source and target domains; using them for consistency learning therefore enhances the target representation and reduces domain bias. Technically, source visual words are selected and updated according to the source feature distribution and applied to reconstruct a given target feature via a weighted combination strategy. Consistency constraints are then built between the reconstructed and original target features for domain alignment. Furthermore, STFR is theoretically connected to the optimal transport algorithm, which explains the rationality of the proposed module. Extensive experiments on nine benchmarks and two cross-domain visual tasks prove the effectiveness of the proposed STFR module, e.g., 1) cross-domain image classification: average accuracy of 91.0%, 73.9%, and 87.4% on Office-31, Office-Home, and VisDA-2017, respectively; 2) cross-domain object detection: mAP of 44.50% on Cityscapes -> Foggy Cityscapes, car AP of 78.10% on Cityscapes -> KITTI, and MR⁻² of 8.63%, 12.27%, 22.10%, and 40.58% on COCOPersons -> Caltech, CityPersons -> Caltech, COCOPersons -> CityPersons, and Caltech -> CityPersons, respectively.
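The weighted-combination reconstruction in this abstract can be sketched as an attention-like lookup over a bank of source visual words. Everything below (the EMA word update, the softmax weighting, the MSE consistency loss, and names such as `update_words`) is a hypothetical reading of the abstract, not the authors' implementation.

```python
# Minimal sketch of source-guided target feature reconstruction,
# assuming source visual words are kept as an EMA-updated codebook.
import torch
import torch.nn.functional as F

class STFR(torch.nn.Module):
    def __init__(self, num_words=64, dim=256, momentum=0.99):
        super().__init__()
        # Source visual words, updated from source feature statistics.
        self.register_buffer("words", torch.randn(num_words, dim))
        self.momentum = momentum

    @torch.no_grad()
    def update_words(self, source_feats):
        # EMA update: pull each word toward its assigned source features.
        assign = torch.cdist(source_feats, self.words).argmin(dim=1)
        for k in range(self.words.size(0)):
            mask = assign == k
            if mask.any():
                self.words[k] = (self.momentum * self.words[k]
                                 + (1 - self.momentum) * source_feats[mask].mean(0))

    def forward(self, target_feats):
        # Weighted combination: softmax similarity to the source words.
        weights = F.softmax(target_feats @ self.words.T, dim=-1)
        reconstructed = weights @ self.words
        # Consistency between reconstructed and original target features.
        loss = F.mse_loss(reconstructed, target_feats.detach())
        return reconstructed, loss
```

Because each reconstructed feature is a convex-like mixture of source words, the consistency loss pulls target features toward regions the source model already understands, which is the "bridge" role the abstract describes.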
Object detection is a critical component of autonomous vehicle perception systems. However, domain shifts between training environments and real-world scenarios often degrade detector performance. Cross-domain object detection aims to adapt detectors to unlabeled target domains using only labeled source data. Recent popular cross-domain object detection methods employ the mean teacher framework, which uses pseudo-labels generated by the teacher model to guide training on unlabeled real-world data. Despite its effectiveness, continuous training with noisy pseudo-labels leads to abnormal performance degradation in the later stages of training. To address this issue, we propose a novel Hybrid Matching Teacher (HMT) framework for cross-domain visual detection transformers, which enhances cross-domain knowledge transfer across the pseudo-label generation, filtering, and training processes. Specifically, we design a Feature Sparse Alignment (FSA) module to adapt DETR tokens and queries, generate domain-adaptive weights to initialize the teacher-student models, and mitigate the teacher model's inherent initial source bias. Next, a Localization-aware Pseudo-label Filtering (LPF) module ensures high-quality pseudo-labels by considering the consistency between the localization and classification tasks. Furthermore, to improve the efficiency of pseudo-label training, a Cross-view Hybrid Matching (CHM) module introduces an auxiliary matching branch that increases the number of positive queries matched with pseudo-labels. Extensive experiments demonstrate that our approach achieves state-of-the-art performance, outperforming previous methods by 3.1%, 8.5%, and 4.4% on adverse-weather, diverse-scene, and synthetic-to-real benchmarks, respectively.
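A minimal sketch of what localization-aware pseudo-label filtering could look like: keep only teacher boxes whose classification confidence and predicted localization quality are both high and mutually consistent. The joint geometric-mean score, the `consistency_gap` rule, and all threshold values below are our assumptions; the abstract describes the LPF module only at the level above.

```python
# Hypothetical localization-aware filter for teacher pseudo-labels,
# assuming the detector exposes per-box classification scores and a
# predicted localization quality (e.g., from an IoU-prediction head).
import torch

def filter_pseudo_labels(boxes, cls_scores, loc_scores,
                         tau=0.4, consistency_gap=0.3):
    """boxes: (N, 4); cls_scores, loc_scores: (N,) in [0, 1].
    Returns the kept boxes and their joint confidence scores."""
    joint = (cls_scores * loc_scores).sqrt()             # geometric mean
    consistent = (cls_scores - loc_scores).abs() < consistency_gap
    keep = (joint > tau) & consistent                    # both high and aligned
    return boxes[keep], joint[keep]
```

The consistency check discards boxes where a confident class score masks a poorly localized box (or vice versa), which is the failure mode that makes naive confidence thresholding accumulate noisy pseudo-labels over training.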