The emergence of vision-language models, particularly Contrastive Language-Image Pre-Training (CLIP), has significantly improved performance on numerous visual tasks, demonstrating notable zero-shot transfer abilities. CLIP's remarkable generalization offers substantial innovation potential for smart manufacturing and public-safety surveillance, potentially accelerating the advancement of Industry 5.0. However, most current research focuses on public datasets, with limited investigation into complex industrial scenarios. The semantic structures and image quality of these scenarios differ significantly from the datasets used to train CLIP, limiting its effectiveness in industrial applications. This paper presents Context-Aware Masked CLIP (CAM-CLIP), a framework for high-performance pixel-level semantic parsing of complex industrial scenarios under few-shot conditions. The framework autonomously detects and identifies objects in industrial scenes from textual descriptions, enhancing safety monitoring and anomaly detection. We constructed a dedicated dataset using offshore drilling platforms as a case study and conducted empirical validation. Results demonstrate that CAM-CLIP achieves an mIoU of 80.7 in pixel-level semantic parsing of offshore drilling platforms with a limited sample size, outperforming state-of-the-art methods by 8.47 mIoU. This study extends CLIP's applicability to industrial settings and offers a model for future implementations, advancing semantic parsing in industrial scenarios and promoting the development of intelligent, interpretable systems.
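To make the text-driven parsing idea concrete, the sketch below shows a generic way to classify candidate region masks with CLIP: each masked crop is embedded with the image encoder and matched against text-prompt embeddings by cosine similarity. This is a minimal illustration assuming OpenAI's `clip` package and externally supplied binary masks; the `classify_masks` helper and the label prompts are hypothetical, and the abstract does not describe CAM-CLIP's actual architecture.

```python
# Illustrative sketch only: scoring candidate masks against text prompts with CLIP.
# Assumes the OpenAI `clip` package (pip install git+https://github.com/openai/CLIP)
# and binary masks produced elsewhere; this is not CAM-CLIP's published method.
import numpy as np
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical textual descriptions of industrial objects of interest.
labels = [
    "a derrick on an offshore drilling platform",
    "a crane",
    "a worker wearing a safety helmet",
    "background structure",
]
text_tokens = clip.tokenize(labels).to(device)

@torch.no_grad()
def classify_masks(image: Image.Image, masks):
    """Assign each binary mask a label via CLIP similarity of the masked crop.

    `masks` is an iterable of (H, W) boolean numpy arrays over the image.
    """
    text_feat = model.encode_text(text_tokens)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    results = []
    for mask in masks:
        # Zero out pixels outside the mask, then crop to its bounding box.
        arr = np.array(image).copy()
        arr[~mask] = 0
        ys, xs = np.where(mask)
        crop = Image.fromarray(arr[ys.min():ys.max() + 1, xs.min():xs.max() + 1])
        # Embed the crop and pick the most similar text prompt.
        img_feat = model.encode_image(preprocess(crop).unsqueeze(0).to(device))
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        sims = (img_feat @ text_feat.T).squeeze(0)
        results.append(labels[sims.argmax().item()])
    return results
```

In this simplified form the mask proposals and the prompt wording carry most of the burden; a context-aware method such as the one the abstract describes would presumably refine both, but those details are not given here.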