Transformer tracking always takes paired template and search images as encoder input and conducts feature extraction and target-search feature correlation by self- and/or cross-attention operations; thus the model complexity grows quadratically with the number of input tokens. To alleviate the burden of this tracking paradigm and facilitate practical deployment of Transformer-based trackers, we propose a dual pooling transformer tracking framework, dubbed DPT, which consists of three components: a simple yet efficient spatiotemporal attention model (SAM), a mutual correlation pooling Transformer (MCPT) and a multiscale aggregation pooling Transformer (MAPT). SAM is designed to gracefully aggregate the temporal dynamics and spatial appearance information of multi-frame templates along the space-time dimensions. MCPT aims to capture multi-scale pooled and correlated contextual features, and is followed by MAPT, which aggregates the multi-scale features into a unified feature representation for tracking prediction. Our tracker achieves an AUC score of 69.5 on LaSOT and a precision score of 82.8 on TrackingNet while maintaining a shorter sequence of attention tokens and fewer parameters and FLOPs than existing state-of-the-art (SOTA) Transformer tracking methods. Extensive experiments demonstrate that the DPT tracker yields a strong real-time tracking baseline with a good trade-off between tracking performance and inference efficiency.
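To make the complexity argument concrete, the sketch below shows a generic pooling-style attention in which keys and values are spatially downsampled before the score computation, so the cost of attending N tokens drops from O(N^2) to O(N * N / s^2) for a pool stride s. This is a minimal illustration of the pooled-attention idea, not the authors' MCPT/MAPT code; the names PooledAttention and pool_stride are our assumptions.

```python
# Minimal sketch of pooled attention (illustrative; not DPT's released code).
import torch
import torch.nn as nn

class PooledAttention(nn.Module):
    def __init__(self, dim, num_heads=8, pool_stride=2):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        self.proj = nn.Linear(dim, dim)
        self.num_heads = num_heads
        # Pooling shrinks the key/value token grid by pool_stride**2, so the
        # attention score matrix is (N x N / pool_stride**2) instead of (N x N).
        self.pool = nn.AvgPool2d(pool_stride, pool_stride)

    def forward(self, x, h, w):
        # x: (B, N, C) flattened template+search tokens on an h x w grid (N = h*w)
        B, N, C = x.shape
        head_dim = C // self.num_heads
        q = self.q(x).view(B, N, self.num_heads, head_dim).transpose(1, 2)
        # Downsample tokens spatially before computing keys and values.
        x2d = x.transpose(1, 2).view(B, C, h, w)
        pooled = self.pool(x2d).flatten(2).transpose(1, 2)      # (B, N', C), N' < N
        k, v = self.kv(pooled).chunk(2, dim=-1)
        k = k.view(B, -1, self.num_heads, head_dim).transpose(1, 2)
        v = v.view(B, -1, self.num_heads, head_dim).transpose(1, 2)
        attn = (q @ k.transpose(-2, -1)) * head_dim ** -0.5     # (B, H, N, N')
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)
```

With stride 2, for example, 1024 input tokens attend to only 256 pooled keys, a 4x smaller score matrix per layer, which is the kind of saving that lets a pooling Transformer keep a shorter attention sequence and fewer FLOPs.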
Audio-visual wake word spotting is a challenging multi-modal task that exploits visual information from lip motion patterns to supplement acoustic speech and improve overall detection performance. However, most audio-visual wake word spotting models are only suitable for simple single-speaker scenarios and require high computational cost; further development is hindered by complex multi-person scenarios and the computational limitations of mobile devices. In this paper, a novel audio-visual model is proposed for on-device multi-person wake word spotting. Firstly, an attention-based audio-visual voice activity detection module is presented, which generates an attention score matrix of audio and visual representations to derive the active speaker representation. Secondly, knowledge distillation is introduced to transfer knowledge from the large model to the on-device model in order to control the model size. Moreover, a new audio-visual dataset, PKU-KWS, is collected for sentence-level multi-person wake word spotting. Experimental results on the PKU-KWS dataset show that this approach outperforms previous state-of-the-art methods.
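The two steps named in the abstract can be sketched generically as follows. This is a plausible reading under stated assumptions, not the paper's actual modules: the tensor shapes, the softmax-over-speakers selection, and the use of Hinton-style logit distillation are all our choices for illustration.

```python
# Generic sketches of (1) attention-based active-speaker selection and
# (2) logit distillation; illustrative only, not the paper's implementation.
import torch
import torch.nn.functional as F

def active_speaker_attention(audio, visual):
    """audio: (B, T, D) acoustic frames; visual: (B, S, T, D) lip features for
    S candidate speakers. Returns a (B, T, D) audio-aligned visual stream."""
    B, S, T, D = visual.shape
    # Attention score matrix between audio frames and each speaker's lips.
    scores = torch.einsum('btd,bstd->bst', audio, visual) / D ** 0.5  # (B, S, T)
    weights = scores.softmax(dim=1)           # speakers compete per frame
    # Weighted sum over speakers emphasizes the active speaker's features.
    return torch.einsum('bst,bstd->btd', weights, visual)

def distill_loss(student_logits, teacher_logits, temperature=4.0):
    # Standard temperature-softened KL between teacher and student logits,
    # a common way to transfer a large model's knowledge to a small one.
    return F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction='batchmean') * temperature ** 2
```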