
Refine Search Results

Document Type

  • 91 conference papers
  • 62 journal articles
  • 1 dissertation

Collection Scope

  • 154 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 146 papers: Engineering
    • 120 papers: Computer Science and Technology...
    • 30 papers: Electrical Engineering
    • 15 papers: Software Engineering
    • 13 papers: Control Science and Engineering
    • 11 papers: Information and Communication Engineering
    • 8 papers: Electronic Science and Technology (...
    • 7 papers: Biomedical Engineering (...
    • 6 papers: Surveying and Mapping Science and Technology
    • 4 papers: Mechanical Engineering
    • 4 papers: Instrument Science and Technology
    • 4 papers: Materials Science and Engineering (...
    • 3 papers: Transportation Engineering
    • 1 paper: Aeronautical and Astronautical Science and Tech...
    • 1 paper: Environmental Science and Engineering (...
    • 1 paper: Bioengineering
    • 1 paper: Safety Science and Engineering
  • 28 papers: Medicine
    • 19 papers: Clinical Medicine
    • 8 papers: Special Medicine
    • 4 papers: Basic Medicine (...
  • 21 papers: Science
    • 8 papers: Physics
    • 7 papers: Geophysics
    • 6 papers: Chemistry
    • 5 papers: Biology
    • 2 papers: Geography
    • 1 paper: Mathematics
    • 1 paper: Astronomy
    • 1 paper: Geology
    • 1 paper: Statistics (...
  • 6 papers: Management
    • 5 papers: Management Science and Engineering (...
  • 1 paper: Philosophy
    • 1 paper: Philosophy
  • 1 paper: Agronomy

Topics

  • 154 papers: vision-language ...
  • 15 papers: large language m...
  • 12 papers: prompt learning
  • 10 papers: clip
  • 10 papers: few-shot learnin...
  • 6 papers: contrastive lear...
  • 6 papers: foundation model...
  • 6 papers: visualization
  • 5 papers: deep learning
  • 4 papers: multimodal learn...
  • 4 papers: object detection
  • 4 papers: long-tailed reco...
  • 4 papers: remote sensing
  • 4 papers: image classifica...
  • 4 papers: artificial intel...
  • 4 papers: computer vision
  • 4 papers: domain generaliz...
  • 4 papers: prompt tuning
  • 3 papers: representation l...
  • 3 papers: image captioning

Institutions

  • 4 papers: carnegie mellon ...
  • 4 papers: univ chinese aca...
  • 3 papers: inesc tec porto
  • 3 papers: sichuan univ col...
  • 3 papers: univ chinese aca...
  • 3 papers: chinese univ hon...
  • 3 papers: chinese acad sci...
  • 2 papers: shanghai ai lab ...
  • 2 papers: ecole polytech f...
  • 2 papers: tsinghua univ de...
  • 2 papers: harbin inst tech...
  • 2 papers: zhejiang univ pe...
  • 2 papers: univ porto fac e...
  • 2 papers: beijing univ pos...
  • 2 papers: city univ hong k...
  • 2 papers: sichuan univ col...
  • 2 papers: tech univ munich...
  • 2 papers: westlake univ sc...
  • 2 papers: univ elect sci &...
  • 2 papers: johns hopkins un...

Authors

  • 4 papers: banerjee biplab
  • 4 papers: zhang yi
  • 4 papers: jha ankit
  • 3 papers: wang donglin
  • 3 papers: singha mainak
  • 3 papers: zhang ce
  • 3 papers: tuia devis
  • 2 papers: men aidong
  • 2 papers: zhang min
  • 2 papers: liu xuyang
  • 2 papers: chen honggang
  • 2 papers: guo miaotian
  • 2 papers: yang yang
  • 2 papers: ricci elisa
  • 2 papers: ye mao
  • 2 papers: tian liang
  • 2 papers: patricio cristia...
  • 2 papers: wang haiying
  • 2 papers: teixeira luis f.
  • 2 papers: mukhopadhyay sou...

Language

  • 152 papers: English
  • 2 papers: Other

Search query: Subject = "Vision-language Models"
154 records; showing results 21-30

One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Lin, L.; Guan, Haoyan; Qiu, Jianing; Spratling, Michael. Kings Coll London, London, England; Imperial Coll London, London, England
Large pre-trained vision-language models (VLMs) like CLIP, despite having remarkable generalization ability, are highly vulnerable to adversarial examples. This work studies the adversarial robustness of VLMs from the...
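
The vulnerability described in the abstract above is easy to illustrate: a small gradient-based perturbation of the input image is typically enough to flip CLIP's zero-shot prediction. The sketch below is a minimal FGSM-style example using the Hugging Face transformers CLIP implementation; it is not the prompt-based defense proposed in the paper, and the model checkpoint, image path, label set, and epsilon value are illustrative assumptions.

```python
# Minimal FGSM-style sketch: perturb an image so that CLIP's zero-shot
# prediction flips. Illustrative only; not the defense from the paper above.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["a photo of a cat", "a photo of a dog"]   # candidate classes (placeholder)
image = Image.open("cat.jpg")                        # placeholder image path
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)

pixel_values = inputs["pixel_values"].clone().requires_grad_(True)
logits = model(input_ids=inputs["input_ids"],
               attention_mask=inputs["attention_mask"],
               pixel_values=pixel_values).logits_per_image   # shape (1, num_labels)

loss = torch.nn.functional.cross_entropy(logits, torch.tensor([0]))  # 0 = true class "cat"
loss.backward()

# One FGSM step in the normalized pixel space (a simplification; real attacks
# project back to the valid image range and usually iterate, e.g. PGD).
epsilon = 0.03
adv_pixels = pixel_values + epsilon * pixel_values.grad.sign()

with torch.no_grad():
    adv_logits = model(input_ids=inputs["input_ids"],
                       attention_mask=inputs["attention_mask"],
                       pixel_values=adv_pixels).logits_per_image

print("clean prediction:      ", labels[logits.argmax(-1).item()])
print("adversarial prediction:", labels[adv_logits.argmax(-1).item()])
```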

Language Models as Black-Box Optimizers for Vision-Language Models
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Liu, Shihong; Yu, Samuel; Lin, Zhiqiu; Pathak, Deepak; Ramanan, Deva. Carnegie Mellon Univ, Pittsburgh, PA 15213, USA
Vision-language models (VLMs) pre-trained on web-scale datasets have demonstrated remarkable capabilities on downstream tasks when fine-tuned with minimal data. However, many VLMs rely on proprietary data and are not ...

LifeGraph 4-Lifelog Retrieval using Multimodal Knowledge Graphs and Vision-Language Models
7th Annual ACM Workshop on the Lifelog Search Challenge (LSC)
Authors: Rossetto, Luca; Kyriakou, Athina; Lange, Svenja; Ruosch, Florian; Wang, Ruijie; Wardatzky, Kathrin; Bernstein, Abraham. Univ Zurich, Dept Informat, Zurich, Switzerland
In the scope of the 7th Lifelog Search Challenge (LSC'24), we present the 4th iteration of LifeGraph, a multimodal knowledge-graph approach with data augmentations using vision-language models (VLM). We extend the...

SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Howard, Phillip; Madasu, Avinash; Le, Tiep; Moreno, Gustavo Lujan; Bhiwandiwalla, Anahita; Lal, Vasudev. Intel Labs, Santa Clara, CA 95052, USA
While vision-language models (VLMs) have achieved remarkable performance improvements recently, there is growing evidence that these models also possess harmful biases with respect to social attributes such as gender a...

InsightSee: Advancing Multi-agent Vision-Language Models for Enhanced Visual Understanding
21st IEEE International Conference on Mechatronics and Automation (IEEE ICMA)
Authors: Zhang, Huaxiang; Mu, Yaojia; Zhu, Guo-Niu; Gan, Zhongxue. Fudan Univ, Acad Engn & Technol, Shanghai 200433, Peoples R China
Accurate visual understanding is imperative for advancing autonomous systems and intelligent robots. Despite the powerful capabilities of vision-language models (VLMs) in processing complex visual scenes, precisely re...

Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Zhang, Yabin; Zhu, Wenjie; Tang, Hui; Ma, Zhiyuan; Zhou, Kaiyang; Zhang, Lei. HKPolyU, Hong Kong, Peoples R China; OPPO, Hong Kong, Peoples R China; HKUST, Hong Kong, Peoples R China; HKBU, Hong Kong, Peoples R China
With the emergence of pre-trained vision-language models like CLIP, how to adapt them to various downstream classification tasks has garnered significant attention in recent research. The adaptation strategies can be ...
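
As context for the adaptation strategies surveyed in the abstract above, the snippet below shows the plain zero-shot CLIP baseline that such methods build on: class names are wrapped in a prompt template, and the image is assigned to the class with the highest image-text similarity. This is only the training-free starting point, not the dual-memory approach of the paper; the checkpoint name, image path, and label set are illustrative assumptions.

```python
# Zero-shot CLIP classification: the training-free baseline that adaptation
# methods (prompt tuning, adapters, cache/memory modules, ...) start from.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

class_names = ["cat", "dog", "car"]                    # placeholder label set
prompts = [f"a photo of a {c}" for c in class_names]   # hand-crafted prompt template
image = Image.open("example.jpg")                      # placeholder image path

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image          # (1, num_classes) similarity scores
probs = logits.softmax(dim=-1)[0]
print({name: round(p, 3) for name, p in zip(class_names, probs.tolist())})
```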

Neural Collapse Anchored Prompt Tuning for Generalizable Vision-Language Models
30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Authors: Zhu, Didi; Li, Zexi; Zhang, Min; Yuan, Junkun; Liu, Jiashuo; Kuang, Kun; Wu, Chao. Zhejiang Univ, Hangzhou, Peoples R China; Tsinghua Univ, Beijing, Peoples R China
Large-scale vision-language (V-L) models have demonstrated remarkable generalization capabilities for downstream tasks through prompt tuning. However, the mechanisms behind the learned text representations are unknown...

EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Cheng, Sijie; Guo, Zhicheng; Wu, Jingwen; Fang, Kechen; Li, Peng; Liu, Huaping; Liu, Yang. Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China; Tsinghua Univ, Inst AI Ind Res (AIR), Beijing, Peoples R China; Univ Toronto, Dept Elect & Comp Engn, Toronto, ON, Canada; Tsinghua Univ, Zhili Coll, Beijing, Peoples R China; 01 Ai, Beijing, Peoples R China
Vision-language models (VLMs) have recently shown promising results in traditional downstream tasks. Evaluation studies have emerged to assess their abilities, with the majority focusing on the third-person perspectiv...

TAGGAR: General-Purpose Task Guidance from Natural Language in Augmented Reality using Vision-Language Models
12th Symposium on Spatial User Interaction (SUI)
Authors: Stover, Daniel; Bowman, Doug A. Virginia Tech, Dept Comp Sci, Ctr Human Comp Interact, Blacksburg, VA 24061, USA
Augmented reality (AR) task guidance systems provide assistance for procedural tasks by rendering virtual guidance visuals within the real-world environment. Current AR task guidance systems are limited in that they r...

Experiential Views: Towards Human Experience Evaluation of Designed Spaces using Vision-Language Models
CHI Conference on Human Factors in Computing Systems (CHI)
Authors: Aseniero, Bon Adriel; Lee, Michael; Wang, Yi; Zhou, Qian; Shahmansouri, Nastaran; Goldstein, Rhys. Autodesk Res, Toronto, ON, Canada
Experiential Views is a proof-of-concept in which we explore a method of helping architects and designers predict how building occupants might experience their designed spaces using AI technology based on Vision-Langu...