Affiliations: Chinese Acad Sci, Hefei Inst Phys Sci, Hefei 230031, Peoples R China; Univ Sci & Technol China, Hefei 230026, Peoples R China
Publication: IEEE SIGNAL PROCESSING LETTERS (IEEE Signal Process Lett)
Year/Volume: 2025, Vol. 32
Pages: 671-675
Funding: National Key R&D Program of China [2022YFC2302700]
Keywords: Training; Optimization; Adaptation models; Visualization; Overfitting; Vectors; Vocabulary; Signal processing algorithms; Stochastic processes; Standards; Prompt learning; vision-language model
Abstract: In recent years, the emergence of large vision-language models (VLMs) has catalyzed the development of prompt learning, where networks are trained to enhance VLM performance by learning continuous prompts. However, traditional continuous prompt learning often struggles with overfitting to Base classes and lacks interpretability due to the nature of prompt parameterization. To overcome these limitations, we introduce Discrete Prompt Optimization (DPO), a method that optimizes text prompts in discrete word-space. During training, scores are assigned to token embeddings, which are then used to select the most effective token sequence for the downstream task. DPO was tested across 11 diverse datasets, outperforming baseline methods such as CLIP and CoOp on Novel classes in most cases. This discrete approach not only reduces overfitting but also enhances transparency and model interpretability, enabling the learning of dataset-specific text prompts that are easily understandable.
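The score-then-select mechanism described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name, the soft score-weighted mixture used during training, and the CLIP-like vocabulary and embedding sizes (49408 tokens, 512 dimensions) are all illustrative assumptions.

```python
# Minimal sketch (assumed design, not the paper's code): learn a score per
# (prompt position, vocabulary token), train through a soft mixture of token
# embeddings, and read off the top-scoring tokens as a discrete prompt.
import torch
import torch.nn as nn

class DiscretePromptSelector(nn.Module):
    def __init__(self, vocab_embeddings: torch.Tensor, prompt_len: int):
        super().__init__()
        # Frozen token embedding table, shape (vocab_size, embed_dim).
        self.register_buffer("vocab", vocab_embeddings)
        # One learnable score per (prompt position, vocabulary token).
        self.scores = nn.Parameter(torch.zeros(prompt_len, vocab_embeddings.size(0)))

    def forward(self) -> torch.Tensor:
        # Soft selection during training: a score-weighted mixture of token
        # embeddings keeps the selection differentiable end to end.
        weights = self.scores.softmax(dim=-1)   # (prompt_len, vocab_size)
        return weights @ self.vocab             # (prompt_len, embed_dim)

    @torch.no_grad()
    def discrete_prompt(self) -> torch.Tensor:
        # Hard selection at inference: the highest-scoring token per position
        # yields an interpretable word sequence.
        return self.scores.argmax(dim=-1)       # (prompt_len,) token ids

# Toy usage with a random table standing in for CLIP's token embeddings.
vocab = torch.randn(49408, 512)
selector = DiscretePromptSelector(vocab, prompt_len=4)
soft_prompt = selector()                  # fed to the frozen text encoder
token_ids = selector.discrete_prompt()    # decodable into readable words
print(soft_prompt.shape, token_ids)
```

The hard argmax at inference is what makes the learned prompt a sequence of real vocabulary words rather than free-floating continuous vectors, which is the interpretability property the abstract emphasizes.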