Refine search results

Document type

  • Journal articles (83)
  • Conference papers (74)

Collection scope

  • Electronic documents (157)
  • Print holdings (0)

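Date distribution

Subject classification

  • Engineering (148)
    • Computer Science and Technology... (123)
    • Electrical Engineering (55)
    • Software Engineering (21)
    • Information and Communication Engineering (19)
    • Control Science and Engineering (15)
    • Electronic Science and Technology (may... (9)
    • Biomedical Engineering (may award... (6)
    • Instrument Science and Technology (4)
    • Transportation Engineering (3)
    • Mechanical Engineering (2)
    • Civil Engineering (2)
    • Surveying and Mapping Science and Technology (2)
    • Materials Science and Engineering (may... (1)
    • Hydraulic Engineering (1)
    • Agricultural Engineering (1)
    • Environmental Science and Engineering (may... (1)
  • Medicine (21)
    • Clinical Medicine (11)
    • Special Medicine (8)
    • Basic Medicine (may award medical... (5)
  • Science (16)
    • Physics (7)
    • Chemistry (4)
    • Mathematics (3)
    • Biology (3)
    • Geography (2)
    • Astronomy (1)
    • Atmospheric Science (1)
    • Geophysics (1)
    • Geology (1)
  • Management (10)
    • Management Science and Engineering (may... (8)
    • Public Administration (2)
  • Agronomy (1)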

Topic

  • vision-language ... (157)
  • visualization (13)
  • large language m... (11)
  • prompt learning (11)
  • clip (10)
  • training (10)
  • prompt tuning (10)
  • object detection (9)
  • adaptation model... (8)
  • deep learning (7)
  • semantics (7)
  • few-shot learnin... (7)
  • knowledge distil... (6)
  • multimodal learn... (5)
  • task analysis (5)
  • zero-shot learni... (5)
  • feature extracti... (5)
  • tuning (4)
  • continual learni... (4)
  • domain adaptatio... (4)

Institution

  • shanghai ai lab ... (4)
  • peng cheng lab p... (4)
  • univ sci & techn... (3)
  • chinese univ hon... (3)
  • sensetime res pe... (3)
  • univ michigan an... (2)
  • hong kong polyte... (2)
  • sun yat sen univ... (2)
  • shanghai univ pe... (2)
  • beijing univ tec... (2)
  • univ chinese aca... (2)
  • wuhan univ sch c... (2)
  • harbin inst tech... (2)
  • tongji univ coll... (2)
  • northeastern uni... (2)
  • chinese acad sci... (2)
  • xidian univ sch ... (2)
  • tianjin univ col... (2)
  • chongqing univ p... (2)
  • tsinghua univ sh... (2)

Author

  • qiao yu (5)
  • gao peng (3)
  • obinata yoshiki (3)
  • inaba masayuki (3)
  • kawaharazuka ken... (3)
  • wang ruixuan (3)
  • dai jifeng (3)
  • okada kei (3)
  • kanazawa naoaki (3)
  • zhou jie (2)
  • wang lei (2)
  • li xin (2)
  • chen zhe (2)
  • guo tao (2)
  • luo ping (2)
  • zhang tong (2)
  • yang xi (2)
  • liu liangchen (2)
  • fang zhen (2)
  • guo song (2)

Language

  • English (156)
  • German (1)
  • French (1)
  • Other (1)

Search criteria: "Subject term = Vision-Language Model"
157 records; showing 31-40

DGPrompt: Dual-guidance prompts generation for vision-language models
NEURAL NETWORKS, 2025, vol. 188
Authors: Zheng, Tai; Chen, Zhen-Duo; Zhang, Zi-Chao; Ma, Zhen-Xiang; Zhao, Li-Jun; Zhang, Chong-Yu; Luo, Xin; Xu, Xin-Shun. Affiliations: Shandong Univ Sch Software 1500 Shunhua Rd Jinan 250101 Peoples R China
Introducing learnable prompts into CLIP and fine-tuning them have demonstrated excellent performance across many downstream tasks. However, existing methods have insufficient interaction between modalities and neglect...

DPO: Discrete Prompt Optimization for Vision-Language Models
IEEE SIGNAL PROCESSING LETTERS, 2025, vol. 32, pp. 671-675
Authors: Liang, Nanhao; Liu, Yong. Affiliations: Chinese Acad Sci Hefei Inst Phys Sci Hefei 230031 Peoples R China; Univ Sci & Technol China Hefei 230026 Peoples R China
In recent years, the emergence of large vision-language models (VLMs) has catalyzed the development of prompt learning, where networks are trained to enhance VLM performance by learning continuous prompts. However, tr...

VaVLM: Toward Efficient Edge-Cloud Video Analytics With Vision-Language Models
IEEE TRANSACTIONS ON BROADCASTING, 2025
Authors: Zhang, Yang; Wang, Hanling; Bai, Qing; Liang, Haifeng; Zhu, Peican; Muntean, Gabriel-Miro; Li, Qing. Affiliations: Xian Technol Univ Sch Optoelect Engn Xian 710021 Shaanxi Peoples R China; Pengcheng Lab Dept Adv Interdisciplinary Res Shenzhen 518055 Guangdong Peoples R China; Tsinghua Univ Shenzhen Int Grad Sch Shenzhen 518055 Guangdong Peoples R China; Northern Optoelect Co Ltd NORTHEO Dept Qual Safety Xian 710043 Shaanxi Peoples R China; Northwestern Polytech Univ Sch Artificial Intelligence Opt & Elect Xian 710072 Shaanxi Peoples R China; Dublin City Univ Sch Elect Engn Dublin 9 Ireland
The advancement of large language models (LLMs) with vision capabilities in recent years has elevated video analytics applications to new heights. To address the limited computing and bandwidth resources on edge devic...

Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, vol. 26, pp. 2056-2068
Authors: Xing, Yinghui; Wu, Qirui; Cheng, De; Zhang, Shizhou; Liang, Guoqiang; Wang, Peng; Zhang, Yanning. Affiliations: Northwestern Polytech Univ Sch Comp Sci Xian 710072 Peoples R China; Northwestern Polytech Univ Shenzhen Res Dev Inst Shenzhen 518057 Peoples R China; Xidian Univ Sch Telecommun Engn Xian 710071 Peoples R China
With the emergence of large pretrained vision-language models such as CLIP, transferable representations can be adapted to a wide range of downstream tasks via prompt tuning. Prompt tuning probes for beneficial informa...

Vision-language models in medical image analysis: From simple fusion to general large models
INFORMATION FUSION, 2025, vol. 118
Authors: Li, Xiang; Li, Like; Jiang, Yuchen; Wang, Hao; Qiao, Xinyu; Feng, Ting; Luo, Hao; Zhao, Yong. Affiliations: Northeastern Univ Coll Informat Sci & Engn Shenyang 110819 Peoples R China; Hebei Key Lab Micronano Precis Opt Sensing & Measu Qinhuangdao 066004 Peoples R China; Harbin Inst Technol Dept Control Sci & Engn Harbin 150001 Peoples R China; State Key Lab Synthet Automat Proc Ind Shenyang 110819 Peoples R China
Vision-language model (VLM) is a kind of multi-modality deep learning model that aims to fuse visual information with language information to enhance the understanding and analysis of visual content. VLM was originall...

Vision-Language Models for Design Concept Generation: An Actor–Critic Framework
Journal of Mechanical Design, 2025, vol. 147, no. 9, p. 091402
Authors: Ghasemi, Parisa; Moghaddam, Mohsen. Affiliations: George W. Woodruff School of Mechanical Engineering Georgia Institute of Technology Atlanta GA 30332
We introduce a novel actor-critic framework that utilizes vision-language models (VLMs) and large language models (LLMs) for design concept generation, particularly for producing a diverse array of innovative solution...

Federated Prototype Guided Adaption for Vision-Language Models
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Liu, Youchao; Huang, Dingjiang. Affiliations: School of Data Science and Engineering East China Normal University Shanghai China
Federated Learning (FL) is a new pivotal paradigm for decentralized training on heterogeneous data. Recently fine-tuning of vision-language models (VLMs) has been extended to the federated setting to improve overall p...

LLaVA-SG: Leveraging Scene Graphs as Visual Semantic Expression in Vision-Language Models
2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Authors: Wang, Jingyi; Ju, Jianzhong; Luan, Jian; Deng, Zhidong. Affiliations: Department of Computer Science and Technology BNRist THUAI Tsinghua University Beijing China; Xiaomi AI Lab Beijing China
Recent advances in large vision-language models (LVLMs) typically employ vision encoders based on the Vision Transformer (ViT) architecture. The division of the images into patches by ViT results in a fragmented perce...

Improving Anomaly Scene Recognition with Large Vision-Language Models
18th International Conference on Wireless Artificial Intelligent Computing Systems and Applications, WASA 2024
Authors: Liu, Cheng; Long, Xianlei; Li, Yan; Chen, Chao; Gu, Fuqiang; Yuan, Songyu; Zhang, Chunlong. Affiliations: Chongqing University Chongqing 401331 China; Macquarie University Sydney NSW 2109 Australia; Zhejiang Lab Hangzhou 311121 China
Vision-based anomaly scene recognition is important for plenty of applications such as surveillance and security. An efficient way to achieve anomaly scene recognition is to use image-based methods. However, the accur...

Vision-language pre-training for graph-based handwritten mathematical expression recognition
PATTERN RECOGNITION, 2025, vol. 162
Authors: Guo, Hong-Yu; Wang, Chuang; Yin, Fei; Li, Xiao-Hui; Liu, Cheng-Li. Affiliations: Univ Chinese Acad Sci Sch Artificial Intelligence Beijing 100049 Peoples R China; Chinese Acad Sci Inst Automat State Key Lab Multimodal Artificial Intelligence S Beijing 100190 Peoples R China
Vision-language pre-training models have shown promise in improving various downstream tasks. However, handwritten mathematical expression recognition (HMER), as a typical structured learning problem, can hardly benefi...