Search condition: subject term = "Audio-Visual Learning"
54 records; showing 21-30
Multisensory Congruency Enhances Explicit Awareness in a Sequence Learning Task
MULTISENSORY RESEARCH, 2017, vol. 30, no. 7-8, pp. 681-689
Authors: Silva, Andrew E.; Barakat, Brandon K.; Jimenez, Luis O.; Shams, Ladan
Affiliations: Univ Calif Los Angeles Los Angeles CA 90095 USA
We examined the effect of audiovisual training on learning a repeated sequence of motor responses. Participants were trained with either congruent or incongruent audiovisual cues to produce motor responses. Learning w...

Learning to Localize Sound Sources in Visual Scenes: Analysis and Applications
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, vol. 43, no. 5, pp. 1605-1619
Authors: Senocak, Arda; Oh, Tae-Hyun; Kim, Junsik; Yang, Ming-Hsuan; Kweon, In So
Affiliations: Korea Adv Inst Sci & Technol Sch Elect Engn Daejeon 34141 South Korea; POSTECH Dept Elect Engn Pohang 37673 South Korea; Univ Calif Dept Elect Engn & Comp Sci Merced CA 95343 USA
Visual events are usually accompanied by sounds in our daily lives. However, can the machines learn to correlate the visual scene and sound, as well as localize the sound source only by observing them like humans? To ...

Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Kim, Dongjin; Um, Sung Jin; Lee, Sangmin; Kim, Jung Uk
Affiliations: Kyung Hee Univ Seoul South Korea; Univ Illinois Urbana IL 61801 USA
The goal of the multi-sound source localization task is to localize sound sources from the mixture individually. While recent multi-sound source localization methods have shown improved performance, they face challeng...

Learning Sound Localization Better from Semantically Similar Samples
47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Authors: Senocak, Arda; Ryu, Hyeonggon; Kim, Junsik; Kweon, In So
Affiliations: Korea Adv Inst Sci & Technol Daejeon South Korea; Harvard Univ Cambridge MA 02138 USA
The objective of this work is to localize the sound sources in visual scenes. Existing audio-visual works employ contrastive learning by assigning corresponding audio-visual pairs from the same source as positives whi...

Learning to See and Hear without Human Supervision
Author: Maravilha Morgado, Pedro Miguel
Affiliation: University of California San Diego
Degree: Ph.D., Doctor of Philosophy
Imagine the sound of waves. This sound may evoke the memories of days at the beach. A single sound serves as a bridge to connect multiple instances of a visual scene. It can group scenes that 'go together' and...

Leveraging the Video-Level Semantic Consistency of Event for Audio-Visual Event Localization
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, vol. 26, pp. 4617-4627
Authors: Jiang, Yuanyuan; Yin, Jianqin; Dang, Yonghao
Affiliations: Beijing Univ Posts & Telecommun Sch Artificial Intelligence Beijing 100876 Peoples R China
Audio-visual event (AVE) localization has attracted much attention in recent years. Most existing methods are often limited to independently encoding and classifying each video segment separated from the full video (w...

Semantic and Relation Modulation for Audio-Visual Event Localization
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, vol. 45, no. 6, pp. 7711-7725
Authors: Wang, Hao; Zha, Zheng-Jun; Li, Liang; Chen, Xuejin; Luo, Jiebo
Affiliations: Univ Sci & Technol China Sch Informat Sci & Technol Hefei 230052 Anhui Peoples R China; Chinese Acad Sci Inst Comp Technol Beijing 100045 Peoples R China; Univ Rochester Dept Comp Sci Rochester NY 14627 USA
We study the problem of localizing audio-visual events that are both audible and visible in a video. Existing works focus on encoding and aligning audio and visual features at the segment level while neglecting inform...

An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, vol. 46, no. 10, pp. 6637-6651
Authors: Li, Kai; Xie, Fenghua; Chen, Hang; Yuan, Kexin; Hu, Xiaolin
Affiliations: Tsinghua Univ Inst Artificial Intelligence IDG McGovern Inst Brain Res Tsinghua Lab Brain & Intelligence THBIDept Comp S Beijing 100084 Peoples R China; Tsinghua Univ IDG McGovern Inst Brain Res Sch Med Tsinghua Lab Brain & Intelligence THBIDept Biomed Beijing 100084 Peoples R China; Chinese Inst Brain Res CIBR Beijing 100010 Peoples R China
Audio-visual approaches involving visual inputs have laid the foundation for recent progress in speech separation. However, the optimization of the concurrent usage of auditory and visual inputs is still an active res...

Contrastive Positive Sample Propagation Along the Audio-Visual Event Line
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, vol. 45, no. 6, pp. 7239-7257
Authors: Zhou, Jinxing; Guo, Dan; Wang, Meng
Affiliations: Hefei Univ Technol HFUT Sch Comp Sci & Informat Engn Sch Artificial Intelligence Key Lab Knowledge Engn Big Data HFUTMinist Educ Hefei 230601 Peoples R China; Hefei Univ Technol HFUT Intelligent Interconnected Syst Lab Anhui Prov Hefei 230601 Peoples R China; Hefei Comprehens Natl Sci Ctr Inst Artificial Intelligence Hefei 230601 Peoples R China
Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs). Given a video, we aim to localize video segments containing an AVE and identify its category. It is pivotal to learn ...

Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-Wise Pseudo Labeling
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, vol. 132, no. 11, pp. 5308-5329
Authors: Zhou, Jinxing; Guo, Dan; Zhong, Yiran; Wang, Meng
Affiliations: Hefei Univ Technol Hefei Peoples R China; Shanghai AI Lab Shanghai Peoples R China; Hefei Comprehens Natl Sci Ctr Hefei Peoples R China; Anhui Zhonghuitong Technol Co Ltd Hefei Peoples R China
The Audio-Visual Video Parsing task aims to identify and temporally localize the events that occur in either or both the audio and visual streams of audible videos. It often performs in a weakly-supervised manner, whe...