DPO: Discrete Prompt Optimization for Vision-Language Models

Authors: Liang, Nanhao; Liu, Yong

Affiliations: Chinese Acad Sci, Hefei Inst Phys Sci, Hefei 230031, Peoples R China; Univ Sci & Technol China, Hefei 230026, Peoples R China

Publication: IEEE Signal Processing Letters (IEEE Signal Process Lett)

Year/Volume: 2025, Vol. 32

Pages: 671-675

Subject classification: 0808 [Engineering: Electrical Engineering]; 08 [Engineering]

Funding: National Key R&D Program of China [2022YFC2302700]

Keywords: Training; Optimization; Adaptation models; Visualization; Overfitting; Vectors; Vocabulary; Signal processing algorithms; Stochastic processes; Standards; Prompt learning; Vision-language model

Abstract: In recent years, the emergence of large vision-language models (VLMs) has catalyzed the development of prompt learning, where networks are trained to enhance VLM performance by learning continuous prompts. However, traditional continuous prompt learning often struggles with challenges such as overfitting to Base classes and a lack of interpretability due to the nature of prompt parameterization. To overcome these limitations, we introduce Discrete Prompt Optimization (DPO), a method that optimizes text prompts in discrete word space. During training, scores are assigned to token embeddings, which are then used to select the most effective token sequence for the downstream task. DPO was tested across 11 diverse datasets, consistently outperforming baseline methods like CLIP and CoOp on Novel classes in most cases. This discrete approach not only reduces overfitting but also enhances transparency and model interpretability, enabling the learning of dataset-specific text prompts that are easily understandable.
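
Illustration (not from the paper): the abstract only outlines the mechanism of scoring token embeddings and selecting a discrete token sequence, so the sketch below shows one generic way such score-based discrete prompt selection can be set up in PyTorch. All class names, shapes, and initialization choices here are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class DiscretePromptSelector(nn.Module):
    """Hypothetical sketch: learn per-position scores over a candidate vocabulary,
    train with a soft (score-weighted) prompt, and read out the highest-scoring
    token per position as the final, human-readable discrete prompt."""

    def __init__(self, vocab_size: int, prompt_len: int, embed_dim: int):
        super().__init__()
        # One learnable score per (prompt position, candidate token).
        self.scores = nn.Parameter(0.01 * torch.randn(prompt_len, vocab_size))
        # Frozen candidate token embeddings (e.g., taken from the VLM text encoder).
        self.register_buffer("token_embeddings", torch.randn(vocab_size, embed_dim))

    def forward(self) -> torch.Tensor:
        # Soft selection during training: score-weighted mixture of token embeddings.
        weights = self.scores.softmax(dim=-1)          # (prompt_len, vocab_size)
        return weights @ self.token_embeddings         # (prompt_len, embed_dim)

    @torch.no_grad()
    def discrete_prompt(self) -> torch.Tensor:
        # Hard selection after training: keep the highest-scoring token per position.
        return self.scores.argmax(dim=-1)              # (prompt_len,) token ids


if __name__ == "__main__":
    selector = DiscretePromptSelector(vocab_size=100, prompt_len=4, embed_dim=16)
    print(selector().shape)            # torch.Size([4, 16]) soft prompt for training
    print(selector.discrete_prompt())  # e.g. tensor([ 7, 42,  3, 91]) discrete prompt

In such a setup, the selected token ids can be mapped back to vocabulary words, which is what gives a discrete approach its interpretability advantage over continuous prompt vectors.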
