Author Affiliations: Harbin Engn Univ, Sch Comp Sci & Technol, Harbin 150001, Peoples R China; Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100019, Peoples R China
Publication: MULTIMEDIA SYSTEMS (Multimedia Syst)
Year/Volume/Issue: 2025, Vol. 31, Issue 1
Pages: 1-16
Core Indexing:
Subject Classification: 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (degrees conferrable in Engineering or Science)]
Funding: National Natural Science Foundation of China, Grant No. 61771155
Keywords: Multi-object tracking; Appearance changes; Graph convolutional neural network; Graph attention network
Abstract: Multi-Object Tracking (MOT), an essential task in computer vision, underperforms when occlusions or motion blur occur, because they change an object's appearance. We develop three modules based on Graph Neural Networks (GNNs) to handle such appearance changes. The appearance enhancement module boosts appearance features by applying self-attention and a Graph Convolutional Neural Network (GCNN) to local features. The temporal feature updating module automatically updates a tracklet's appearance template using GCNNs with different Laplacian operations. The spatial feature updating module encodes interactive spatial features by combining a graph attention network and a GCNN. After processing the input video frames with these three modules, our tracker stores all extracted features in a memory bank and then forwards them to a matching algorithm to complete tracking. On the popular benchmark datasets MOT16, MOT17, and MOT20, we show that introducing GNNs to MOT benefits tracking, and the proposed tracker surpasses state-of-the-art trackers, including StrongSORT, ByteTrack, and BoT-SORT. Specifically, we achieve 81.1% (77.9%) MOTA, 80.3% (77.3%) IDF1, and 65.1% (63.2%) HOTA on the challenging MOT17 (and the newest MOT20) datasets.
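Illustrative note: the appearance enhancement module described in the abstract (self-attention plus a graph convolution applied to local appearance features) could be sketched roughly as below. This is a minimal sketch under assumed conventions, not the authors' implementation: PyTorch is assumed, and the class and parameter names (AppearanceEnhancement, SimpleGCNLayer, dim, heads) are hypothetical.

    # Minimal sketch (not the paper's code): refine local appearance features
    # of a detection with self-attention, then a single graph convolution.
    import torch
    import torch.nn as nn


    class SimpleGCNLayer(nn.Module):
        """One graph convolution: H' = D^{-1/2} (A + I) D^{-1/2} H W."""

        def __init__(self, dim: int):
            super().__init__()
            self.linear = nn.Linear(dim, dim)

        def forward(self, feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
            # feats: (N, dim) node features; adj: (N, N) 0/1 adjacency matrix
            a_hat = adj + torch.eye(adj.size(0), device=adj.device)
            deg_inv_sqrt = a_hat.sum(dim=-1).clamp(min=1e-6).rsqrt()
            norm_adj = deg_inv_sqrt.unsqueeze(1) * a_hat * deg_inv_sqrt.unsqueeze(0)
            return torch.relu(norm_adj @ self.linear(feats))


    class AppearanceEnhancement(nn.Module):
        """Boost local appearance features with self-attention + GCN (sketch)."""

        def __init__(self, dim: int = 256, heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.gcn = SimpleGCNLayer(dim)

        def forward(self, local_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
            # local_feats: (N, dim) local patch features of one detection
            attended, _ = self.attn(local_feats.unsqueeze(0),
                                    local_feats.unsqueeze(0),
                                    local_feats.unsqueeze(0))
            return self.gcn(attended.squeeze(0), adj)


    if __name__ == "__main__":
        n, dim = 6, 256
        feats = torch.randn(n, dim)
        adj = (torch.rand(n, n) > 0.5).float()
        adj = ((adj + adj.t()) > 0).float()      # symmetric toy graph
        enhanced = AppearanceEnhancement(dim)(feats, adj)
        print(enhanced.shape)                    # torch.Size([6, 256])

The enhanced per-detection features would then feed the temporal/spatial updating steps and the memory bank described in the abstract; those parts are omitted here.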