Search query: "Subject term = Spatial-temporal-global Modeling"
1 record found
Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Wang, Kai; Tian, Yapeng; Hatzinakos, Dimitrios
Affiliations: Univ Toronto, Toronto, ON, Canada; Univ Texas Dallas, Richardson, TX 75083, USA
In this paper, we explore the cross-modal adaptation of pre-trained Vision Transformers (ViTs) for the audio-visual domain by incorporating a limited set of trainable parameters. To this end, we propose a spatial-Temp...