Contrastive Language-Image Pre-training (CLIP) models exhibit impressive zero-shot performance across various downstream cross-modal tasks by simply computing the dot product between image and text features. CLIP is pre-trained on large-scale image-text pairs using the InfoNCE loss, which maximizes the cosine similarity of positive image-text pairs while minimizing the similarity of negative pairs. However, an objective mismatch exists between the downstream usage and the pre-training phase, as the inference phase fails to exploit information from negative samples. Intuitively, since the CLIP model has been optimized with the InfoNCE loss, its downstream usage should be aligned with that objective. In this paper, we start by analyzing the InfoNCE loss and derive its upper bound. Our derivation reveals that the dot-product operation serves as a zero-order approximation of this upper bound, while a centralization operation represents a first-order approximation. To address the objective mismatch problem, we propose a novel method, Inference Calibration (IC), which leverages the first-order and second-order moments of the data distribution to calibrate features for zero-shot and few-shot scenarios. Experiments on various cross-modal tasks demonstrate the effectiveness of IC in both zero-shot and few-shot scenarios over the dot-product operation and other competing methods.
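To make the contrast between standard dot-product inference and moment-based calibration concrete, here is a minimal illustrative sketch. It is not the paper's exact Inference Calibration formulation (the abstract does not specify it); instead, it assumes a simple scheme in which features are centered with the first-order moment (mean) and rescaled with a second-order moment (per-dimension standard deviation) estimated from a reference set. Function names such as `calibrate_features` are hypothetical and used only for illustration.

```python
# Sketch: dot-product zero-shot scoring vs. a hypothetical moment-based calibration.
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Normalize rows to unit length, as in CLIP's cosine-similarity setup."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def zero_shot_scores(image_feats, text_feats):
    """Standard CLIP inference: cosine similarity via a plain dot product."""
    return l2_normalize(image_feats) @ l2_normalize(text_feats).T


def calibrate_features(feats, ref_feats):
    """Hypothetical calibration: subtract the mean (first-order moment) and
    rescale by the standard deviation (second-order moment) estimated from
    reference features, then re-normalize."""
    mu = ref_feats.mean(axis=0, keepdims=True)
    sigma = ref_feats.std(axis=0, keepdims=True) + 1e-8
    return l2_normalize((feats - mu) / sigma)


def calibrated_scores(image_feats, text_feats, ref_image_feats, ref_text_feats):
    """Similarity after calibrating both modalities with statistics drawn from
    reference features (e.g. unlabeled data in zero-shot or a few labeled
    examples in few-shot settings)."""
    img = calibrate_features(image_feats, ref_image_feats)
    txt = calibrate_features(text_feats, ref_text_feats)
    return img @ txt.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.normal(size=(4, 512))       # 4 query image features
    txt = rng.normal(size=(10, 512))      # 10 class-prompt text features
    ref_img = rng.normal(size=(64, 512))  # reference image features
    ref_txt = rng.normal(size=(10, 512))  # reference text features
    print(zero_shot_scores(img, txt).shape)                     # (4, 10)
    print(calibrated_scores(img, txt, ref_img, ref_txt).shape)  # (4, 10)
```

In this reading, plain dot-product scoring corresponds to the zero-order approximation described above, while subtracting the reference mean corresponds to the first-order (centralization) correction; the variance rescaling stands in for a second-order term.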