Refine search results

Document type

  • 81 journal articles
  • 74 conference papers

Collection scope

  • 155 electronic documents
  • 0 print holdings

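Subject classification

  • 141 papers: Engineering
    • 116 papers: Computer Science and Technology...
    • 48 papers: Electrical Engineering
    • 14 papers: Software Engineering
    • 12 papers: Information and Communication Engineering
    • 8 papers: Control Science and Engineering
    • 6 papers: Biomedical Engineering (may confer...
    • 4 papers: Instrument Science and Technology
    • 3 papers: Transportation Engineering
    • 2 papers: Mechanical Engineering
    • 2 papers: Electronic Science and Technology (may...
    • 2 papers: Civil Engineering
    • 2 papers: Surveying and Mapping Science and Technology
    • 1 paper: Materials Science and Engineering (may...
    • 1 paper: Hydraulic Engineering
    • 1 paper: Agricultural Engineering
    • 1 paper: Environmental Science and Engineering (may...
  • 20 papers: Medicine
    • 10 papers: Clinical Medicine
    • 8 papers: Special Medicine
    • 5 papers: Basic Medicine (may confer medical...
  • 16 papers: Science
    • 7 papers: Physics
    • 4 papers: Chemistry
    • 3 papers: Mathematics
    • 3 papers: Biology
    • 2 papers: Geography
    • 1 paper: Astronomy
    • 1 paper: Atmospheric Science
    • 1 paper: Geophysics
    • 1 paper: Geology
  • 10 papers: Management
    • 8 papers: Management Science and Engineering (may...
    • 2 papers: Public Administration
  • 1 paper: Agronomy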

Topics

  • 155 papers: vision-language ...
  • 13 papers: visualization
  • 11 papers: large language m...
  • 11 papers: prompt learning
  • 10 papers: clip
  • 10 papers: training
  • 10 papers: prompt tuning
  • 9 papers: object detection
  • 8 papers: adaptation model...
  • 7 papers: deep learning
  • 7 papers: semantics
  • 7 papers: few-shot learnin...
  • 6 papers: knowledge distil...
  • 5 papers: task analysis
  • 5 papers: zero-shot learni...
  • 5 papers: feature extracti...
  • 4 papers: multimodal learn...
  • 4 papers: tuning
  • 4 papers: continual learni...
  • 4 papers: contrastive lear...

Institutions

  • 4 papers: shanghai ai lab ...
  • 4 papers: peng cheng lab p...
  • 3 papers: univ sci & techn...
  • 3 papers: chinese univ hon...
  • 3 papers: sensetime res pe...
  • 2 papers: univ michigan an...
  • 2 papers: hong kong polyte...
  • 2 papers: sun yat sen univ...
  • 2 papers: shanghai univ pe...
  • 2 papers: beijing univ tec...
  • 2 papers: univ chinese aca...
  • 2 papers: wuhan univ sch c...
  • 2 papers: harbin inst tech...
  • 2 papers: tongji univ coll...
  • 2 papers: northeastern uni...
  • 2 papers: chinese acad sci...
  • 2 papers: xidian univ sch ...
  • 2 papers: tianjin univ col...
  • 2 papers: chongqing univ p...
  • 2 papers: tsinghua univ sh...

Authors

  • 5 papers: qiao yu
  • 3 papers: gao peng
  • 3 papers: obinata yoshiki
  • 3 papers: inaba masayuki
  • 3 papers: kawaharazuka ken...
  • 3 papers: wang ruixuan
  • 3 papers: dai jifeng
  • 3 papers: okada kei
  • 3 papers: kanazawa naoaki
  • 2 papers: zhou jie
  • 2 papers: wang lei
  • 2 papers: li xin
  • 2 papers: chen zhe
  • 2 papers: guo tao
  • 2 papers: luo ping
  • 2 papers: zhang tong
  • 2 papers: yang xi
  • 2 papers: liu liangchen
  • 2 papers: fang zhen
  • 2 papers: guo song

Language

  • 154 papers: English
  • 1 paper: German
  • 1 paper: French
  • 1 paper: Other
Search condition: "Subject term = Vision-language Model"
155 records; showing 31-40
DGPrompt: Dual-guidance prompts generation for vision-language models
NEURAL NETWORKS, 2025, vol. 188
Authors: Zheng, Tai; Chen, Zhen-Duo; Zhang, Zi-Chao; Ma, Zhen-Xiang; Zhao, Li-Jun; Zhang, Chong-Yu; Luo, Xin; Xu, Xin-Shun
Affiliation: Shandong Univ Sch Software 1500 Shunhua Rd Jinan 250101 Peoples R China
Introducing learnable prompts into CLIP and fine-tuning them have demonstrated excellent performance across many downstream tasks. However, existing methods have insufficient interaction between modalities and neglect...
DPO: Discrete Prompt Optimization for Vision-Language Models
IEEE SIGNAL PROCESSING LETTERS, 2025, vol. 32, pp. 671-675
Authors: Liang, Nanhao; Liu, Yong
Affiliations: Chinese Acad Sci Hefei Inst Phys Sci Hefei 230031 Peoples R China; Univ Sci & Technol China Hefei 230026 Peoples R China
In recent years, the emergence of large vision-language models (VLMs) has catalyzed the development of prompt learning, where networks are trained to enhance VLM performance by learning continuous prompts. However, tr...
Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, vol. 26, pp. 2056-2068
Authors: Xing, Yinghui; Wu, Qirui; Cheng, De; Zhang, Shizhou; Liang, Guoqiang; Wang, Peng; Zhang, Yanning
Affiliations: Northwestern Polytech Univ Sch Comp Sci Xian 710072 Peoples R China; Northwestern Polytech Univ Shenzhen Res Dev Inst Shenzhen 518057 Peoples R China; Xidian Univ Sch Telecommun Engn Xian 710071 Peoples R China
With the emergence of large pretrained vision-language models such as CLIP, transferable representations can be adapted to a wide range of downstream tasks via prompt tuning. Prompt tuning probes for beneficial informa...
Vision-Language Models for Design Concept Generation: An Actor–Critic Framework
Journal of Mechanical Design, 2025, vol. 147, no. 9, p. 091402
Authors: Ghasemi, Parisa; Moghaddam, Mohsen
Affiliation: George W. Woodruff School of Mechanical Engineering Georgia Institute of Technology Atlanta GA 30332
We introduce a novel actor-critic framework that utilizes vision-language models (VLMs) and large language models (LLMs) for design concept generation, particularly for producing a diverse array of innovative solution...
Vision-language models in medical image analysis: From simple fusion to general large models
INFORMATION FUSION, 2025, vol. 118
Authors: Li, Xiang; Li, Like; Jiang, Yuchen; Wang, Hao; Qiao, Xinyu; Feng, Ting; Luo, Hao; Zhao, Yong
Affiliations: Northeastern Univ Coll Informat Sci & Engn Shenyang 110819 Peoples R China; Hebei Key Lab Micronano Precis Opt Sensing & Measu Qinhuangdao 066004 Peoples R China; Harbin Inst Technol Dept Control Sci & Engn Harbin 150001 Peoples R China; State Key Lab Synthet Automat Proc Ind Shenyang 110819 Peoples R China
Vision-language model (VLM) is a kind of multi-modality deep learning model that aims to fuse visual information with language information to enhance the understanding and analysis of visual content. VLM was originall...
Improving Anomaly Scene Recognition with Large Vision-Language Models
18th International Conference on Wireless Artificial Intelligent Computing Systems and Applications, WASA 2024
Authors: Liu, Cheng; Long, Xianlei; Li, Yan; Chen, Chao; Gu, Fuqiang; Yuan, Songyu; Zhang, Chunlong
Affiliations: Chongqing University Chongqing 401331 China; Macquarie University Sydney NSW 2109 Australia; Zhejiang Lab Hangzhou 311121 China
Vision-based anomaly scene recognition is important for plenty of applications such as surveillance and security. An efficient way to achieve anomaly scene recognition is to use image-based methods. However, the accur...
Federated Prototype Guided Adaption for Vision-Language Models
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Liu, Youchao; Huang, Dingjiang
Affiliation: School of Data Science and Engineering East China Normal University Shanghai China
Federated Learning (FL) is a new pivotal paradigm for decentralized training on heterogeneous data. Recently fine-tuning of vision-language models (VLMs) has been extended to the federated setting to improve overall p...
LLaVA-SG: Leveraging Scene Graphs as Visual Semantic Expression in Vision-Language Models
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Wang, Jingyi; Ju, Jianzhong; Luan, Jian; Deng, Zhidong
Affiliations: Department of Computer Science and Technology BNRist THUAI Tsinghua University Beijing China; Xiaomi AI Lab Beijing China
Recent advances in large vision-language models (LVLMs) typically employ vision encoders based on the Vision Transformer (ViT) architecture. The division of the images into patches by ViT results in a fragmented perce...
Vision-language pre-training for graph-based handwritten mathematical expression recognition
PATTERN RECOGNITION, 2025, vol. 162
Authors: Guo, Hong-Yu; Wang, Chuang; Yin, Fei; Li, Xiao-Hui; Liu, Cheng-Li
Affiliations: Univ Chinese Acad Sci Sch Artificial Intelligence Beijing 100049 Peoples R China; Chinese Acad Sci Inst Automat State Key Lab Multimodal Artificial Intelligence S Beijing 100190 Peoples R China
Vision-language pre-training models have shown promise in improving various downstream tasks. However, handwritten mathematical expression recognition (HMER), as a typical structured learning problem, can hardly benefi...
Generalizable Prompt Learning via Gradient Constrained Sharpness-Aware Minimization
IEEE TRANSACTIONS ON MULTIMEDIA, 2025, vol. 27, pp. 1100-1113
Authors: Liu, Liangchen; Wang, Nannan; Zhou, Dawei; Liu, Decheng; Yang, Xi; Gao, Xinbo; Liu, Tongliang
Affiliations: Xidian Univ Sch Telecommun Engn State Key Lab Integrated Serv Networks Xian 710071 Peoples R China; Xidian Univ Sch Cyber Engn State Key Lab Integrated Serv Networks Xian 710071 Peoples R China; Chongqing Univ Posts & Telecommun Chongqing Key Lab Image Cognit Chongqing 400065 Peoples R China; Univ Sydney Fac Engn Sydney AI Ctr Sch Comp Sci Darlington NSW 2008 Australia
This paper targets a novel trade-off problem in generalizable prompt learning for vision-language models (VLM), i.e., improving the performance on unseen classes while maintaining the performance on seen classes. Comp...