Search query: subject term = "Vision-language Model"
160 records; showing records 31-40

Vision-language models in medical image analysis: From simple fusion to general large models
INFORMATION FUSION, 2025, vol. 118
Authors: Li, Xiang; Li, Like; Jiang, Yuchen; Wang, Hao; Qiao, Xinyu; Feng, Ting; Luo, Hao; Zhao, Yong
Affiliations: Northeastern Univ Coll Informat Sci & Engn Shenyang 110819 Peoples R China; Hebei Key Lab Micronano Precis Opt Sensing & Measu Qinhuangdao 066004 Peoples R China; Harbin Inst Technol Dept Control Sci & Engn Harbin 150001 Peoples R China; State Key Lab Synthet Automat Proc Ind Shenyang 110819 Peoples R China
Vision-language model (VLM) is a kind of multi-modality deep learning model that aims to fuse visual information with language information to enhance the understanding and analysis of visual content. VLM was originall...

Vision-Language Models Can Identify Distracted Driver Behavior From Naturalistic Videos
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, vol. 25, no. 9, pp. 11602-11616
Authors: Hasan, Md. Zahid; Chen, Jiajing; Wang, Jiyang; Rahman, Mohammed Shaiqur; Joshi, Ameya; Velipasalar, Senem; Hegde, Chinmay; Sharma, Anuj; Sarkar, Soumik
Affiliations: Iowa State Univ Dept Elect & Comp Engn Ames IA 50010 USA; Syracuse Univ Dept Elect Engn & Comp Sci Syracuse NY 13244 USA; Iowa State Univ Dept Comp Sci Ames IA 50010 USA; NYU Dept Elect & Comp Engn Brooklyn NY 11201 USA; Iowa State Univ Dept Civil Construct & Environm Engn Ames IA 50010 USA; Iowa State Univ Dept Mech Engn Ames IA 50010 USA
Recognizing the activities causing distraction in real-world driving scenarios is critical for ensuring the safety and reliability of both drivers and pedestrians on the roadways. Conventional computer vision techniqu...

UMPA: Unified multi-modal prompt with adapter for vision-language models
MULTIMEDIA SYSTEMS, 2025, vol. 31, no. 2, pp. 1-11
Authors: Jin, Zhengwei; Wei, Yun
Affiliations: Univ Shanghai Sci & Technol Sch Opt Elect & Comp Engn Shanghai 200093 Peoples R China
Large-scale multi-modal pretraining model, such as CLIP, has shown remarkable generalization in vision-language tasks. However, the transfer of large models to downstream tasks requires large-scale computing resources...

Towards label-free defect detection in additive manufacturing via dual-classifier semi-supervised learning for vision-language models
JOURNAL OF INTELLIGENT MANUFACTURING, 2025, pp. 1-16
Authors: Wang, Kang; Liu, Lanqing; Xu, Cheng; Zou, Jing; Lin, Haoneng; Fang, Naiyu; Jiang, Jingchao
Affiliations: Nanyang Technol Univ Singapore Singapore; Hong Kong Polytech Univ Hung Hom Kowloon Hong Kong Peoples R China; Univ Exeter Exeter England
Complex components can now be fabricated in innovative ways thanks to additive manufacturing (AM) technology, but it also presents a severe challenge in the detection of defects, primarily due to extensive labeling ef...

DGPrompt: Dual-guidance prompts generation for vision-language models
NEURAL NETWORKS, 2025, vol. 188
Authors: Zheng, Tai; Chen, Zhen-Duo; Zhang, Zi-Chao; Ma, Zhen-Xiang; Zhao, Li-Jun; Zhang, Chong-Yu; Luo, Xin; Xu, Xin-Shun
Affiliations: Shandong Univ Sch Software 1500 Shunhua Rd Jinan 250101 Peoples R China
Introducing learnable prompts into CLIP and fine-tuning them have demonstrated excellent performance across many downstream tasks. However, existing methods have insufficient interaction between modalities and neglect...

Federated fine-grained prompts for vision-language models based on open-vocabulary object detection
APPLIED INTELLIGENCE, 2025, vol. 55, no. 7, pp. 1-15
Authors: Li, Yu
Affiliations: China Univ Petr East China Sch Comp Sci & Technol Qingdao 266580 Peoples R China
Vision-language models can be used for open-vocabulary object detection. The existing methods suffer from low matching accuracy between prompt and image regions, as well as limited generalization capability as they ad...

LVLM-EHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, vol. 47, no. 3, pp. 1877-1893
Authors: Xu, Peng; Shao, Wenqi; Zhang, Kaipeng; Gao, Peng; Liu, Shuo; Lei, Meng; Meng, Fanqing; Huang, Siyuan; Qiao, Yu; Luo, Ping
Affiliations: Shanghai AI Lab OpenGVLab Shanghai 200232 Peoples R China; Univ Hong Kong Dept Comp Sci Hong Kong Peoples R China; Shanghai AI Lab OpenGVLab Shanghai 200232 Peoples R China; Peking Univ Beijing 100871 Peoples R China
Large vision-language models (LVLMs) have recently played a dominant role in multimodal vision-language learning. Despite the great success, it lacks a holistic evaluation of their efficacy. This paper presents a comp...

CLIP-Adapter: Better Vision-Language Models with Feature Adapters
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, vol. 132, no. 2, pp. 581-595
Authors: Gao, Peng; Geng, Shijie; Zhang, Renrui; Ma, Teli; Fang, Rongyao; Zhang, Yongfeng; Li, Hongsheng; Qiao, Yu
Affiliations: Shanghai AI Lab Shanghai Peoples R China; Rutgers State Univ New Brunswick NJ USA; Chinese Univ Hong Kong Hong Kong Peoples R China
Large-scale contrastive vision-language pretraining has shown significant progress in visual representation learning. Unlike traditional visual systems trained by a fixed set of discrete labels, a new paradigm was int...

Multi-Modal Attribute Prompting for Vision-Language Models
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, vol. 34, no. 11, pp. 11579-11591
Authors: Liu, Xin; Wu, Jiamin; Yang, Wenfei; Zhou, Xu; Zhang, Tianzhu
Affiliations: Univ Sci & Technol China Sch Informat Sci & Technol Hefei 230027 Peoples R China; Sangfor Technol Inc Shenzhen 518000 Peoples R China
Pre-trained vision-language models (VLMs), like CLIP, exhibit strong generalization ability to downstream tasks but struggle in few-shot scenarios. Existing prompting techniques primarily focus on global text and imag...

VaVLM: Toward Efficient Edge-Cloud Video Analytics With Vision-Language Models
IEEE TRANSACTIONS ON BROADCASTING, 2025
Authors: Zhang, Yang; Wang, Hanling; Bai, Qing; Liang, Haifeng; Zhu, Peican; Muntean, Gabriel-Miro; Li, Qing
Affiliations: Xian Technol Univ Sch Optoelect Engn Xian 710021 Shaanxi Peoples R China; Pengcheng Lab Dept Adv Interdisciplinary Res Shenzhen 518055 Guangdong Peoples R China; Tsinghua Univ Shenzhen Int Grad Sch Shenzhen 518055 Guangdong Peoples R China; Northern Optoelect Co Ltd NORTHEO Dept Qual Safety Xian 710043 Shaanxi Peoples R China; Northwestern Polytech Univ Sch Artificial Intelligence Opt & Elect Xian 710072 Shaanxi Peoples R China; Dublin City Univ Sch Elect Engn Dublin 9 Ireland
The advancement of Large language models (LLMs) with vision capabilities in recent years has elevated video analytics applications to new heights. To address the limited computing and bandwidth resources on edge devic...