Object detection technology is widely used in real-world applications. With the development of deep learning, the accuracy and speed of object detection methods have improved significantly, showing great promise for increasing the efficiency of security-related business activities. Nevertheless, existing object detection methods still lack robustness on security video datasets, which can substantially reduce performance in complex application scenarios such as varying target size, target occlusion, and bad weather. Image-based object detection cannot fully solve this problem because the information in a single image is limited. A video, on the other hand, consists of a series of still images with rich temporal and spatial information, which can supplement the detection method. Based on this idea, this thesis proposes a method to address these problems, consisting of local information aggregation and global information aggregation based on prior attribute information. The method aggregates features selectively, including more of the correlated feature information and less of the uncorrelated feature information, so that the network can extract and learn more useful target features and discard interfering ones. Compared with MEGA, one of the most advanced video-based object detection methods, the proposed local and global information aggregation modules improve accuracy by 0.9% and 1.1%, respectively. With both modules combined, the proposed method reaches an mAP of 84.6% on the public ImageNet VID dataset, which is 1.7% higher than the mAP of MEGA. The proposed method also shows potential for detecting occluded targets with high confidence.
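The idea of aggregating features selectively can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a generic similarity-weighted scheme in which features from other frames are weighted by their cosine similarity to the key-frame feature, so correlated features contribute more and uncorrelated ones less. The names `cosine`, `aggregate`, and `temperature` are hypothetical, and the prior attribute information used by the proposed modules is not modeled here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def aggregate(key_feat, support_feats, temperature=0.1):
    """Enhance a key-frame feature with similarity-weighted support features.

    Hypothetical illustration only: weights are a softmax over cosine
    similarities, so correlated support features dominate the sum.
    """
    if not support_feats:
        return list(key_feat), []

    sims = [cosine(key_feat, s) for s in support_feats]

    # Numerically stable softmax over similarities.
    max_sim = max(sims)
    exps = [math.exp((s - max_sim) / temperature) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of support features, added to the key feature.
    context = [
        sum(w * s[i] for w, s in zip(weights, support_feats))
        for i in range(len(key_feat))
    ]
    enhanced = [k + c for k, c in zip(key_feat, context)]
    return enhanced, weights


# One correlated and one uncorrelated support feature.
enhanced, weights = aggregate([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0]])
print(weights)
```

In this toy case the first support feature points in nearly the same direction as the key feature and receives almost all of the weight, while the orthogonal one is nearly ignored. A lower `temperature` makes the selection sharper.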