Author affiliations: Zhejiang Shuren Univ, Hangzhou 310015, Peoples R China; Int Sci & Technol Cooperat Base Zhejiang Prov Remote Sensing Image Proc & Applicat, Hangzhou 310000, Peoples R China; Polotsk State Univ, Novopolotsk 211440, Belarus; Belarusian State Univ, Minsk 220030, Belarus; EarthView Image Inc, Huzhou 313200, Peoples R China; Natl Acad Sci Belarus, United Inst Informat Problems, Minsk 220012, Belarus
Published in: PATTERN RECOGNITION AND IMAGE ANALYSIS
Year/Volume/Issue: 2022, Vol. 32, No. 2
Pages: 254-265
Subject classification: 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (Engineering or Science degree)]
Funding: Public Welfare Technology Applied Research Program of Zhejiang Province [LGF19F020016]; National High-End Foreign Experts Program [G2021016028L, G2021016002L, G2021016001L]; Zhejiang Shuren University Basic Scientific Research Special Funds
Keywords: video surveillance; face mask; tracking-by-detection; motion features; loitering
Abstract: The automatic detection and tracking of appearance and behavior anomalies in video surveillance systems is one of the promising areas for the development and implementation of artificial intelligence. In this paper, we present a formalization of these problems. Based on the proposed generalization, a detection and tracking algorithm that uses the tracking-by-detection paradigm and convolutional neural networks (CNNs) is developed. At the first stage, people are detected using the YOLOv5 CNN and are marked with bounding boxes. Then, their faces in the selected regions are detected and the presence or absence of face masks is determined. Our approach to face-mask detection also uses YOLOv5 as a detector and classifier. For this problem, we generate a training dataset by combining the Kaggle dataset and a modified Wider Face dataset, in which face masks were superimposed on half of the images. To ensure high accuracy of tracking and trajectory construction, the CNN features of the images are included in a composite descriptor, which also contains geometric and color features, to describe each person detected in the current frame and compare this person with all people detected in the next frame. The results of the experiments are presented, including examples of frames from processed video sequences with visualized trajectories for loitering and falls.
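The composite-descriptor matching described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the dictionary keys (`cnn`, `box`, `hist`), the similarity weights, and the greedy assignment strategy are assumptions made here for clarity; only the general scheme (a weighted combination of CNN, geometric, and color similarities used to match detections across consecutive frames) comes from the abstract.

```python
# Illustrative sketch (not the paper's implementation): frame-to-frame
# matching with a composite descriptor, tracking-by-detection style.
# Each detection carries a precomputed CNN appearance embedding ("cnn"),
# a bounding box ("box"), and a color histogram ("hist").
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def iou(b1, b2):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(b1[2], b2[2]) - max(b1[0], b2[0]))
    iy = max(0, min(b1[3], b2[3]) - max(b1[1], b2[1]))
    inter = ix * iy
    union = ((b1[2] - b1[0]) * (b1[3] - b1[1])
             + (b2[2] - b2[0]) * (b2[3] - b2[1]) - inter)
    return inter / union if union else 0.0


def composite_similarity(d1, d2, w=(0.5, 0.3, 0.2)):
    """Weighted sum of CNN-feature, geometric (IoU), and color-histogram
    similarities. The weights are hypothetical; the paper does not give
    them in the abstract."""
    return (w[0] * cosine(d1["cnn"], d2["cnn"])
            + w[1] * iou(d1["box"], d2["box"])
            + w[2] * cosine(d1["hist"], d2["hist"]))


def match(prev, curr, threshold=0.5):
    """Greedy one-to-one assignment of previous-frame detections to
    current-frame detections, highest similarity first."""
    scored = sorted(
        ((composite_similarity(p, c), i, j)
         for i, p in enumerate(prev) for j, c in enumerate(curr)),
        reverse=True)
    used_p, used_c, pairs = set(), set(), []
    for s, i, j in scored:
        if s >= threshold and i not in used_p and j not in used_c:
            used_p.add(i)
            used_c.add(j)
            pairs.append((i, j))
    return pairs
```

A production tracker would typically replace the greedy loop with a globally optimal assignment (e.g. the Hungarian algorithm via `scipy.optimize.linear_sum_assignment`); greedy matching keeps this sketch dependency-free.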