With the rapid development of intelligent video surveillance technology, pedestrian re-identification has become increasingly important in multi-camera surveillance *** technology plays a critical role in enhancing public ***, traditional methods typically process images and text separately, applying upstream models directly to downstream *** approach significantly increases the complexity of model training and computational ***, the common class imbalance in existing training datasets limits model performance *** address these challenges, we propose an innovative framework named Person Re-ID Network Based on Visual Prompt Technology and multi-instance negative pooling (VPM-Net). First, we incorporate the Contrastive Language-Image Pre-training (CLIP) pre-trained model to accurately map visual and textual features into a unified embedding space, effectively mitigating inconsistencies in data distribution and the training *** enhance model adaptability and generalization, we introduce an efficient and task-specific Visual Prompt Tuning (VPT) technique, which improves the model’s relevance to specific ***, we design two key modules: the Knowledge-Aware Network (KAN) and the multi-instance negative pooling (MINP) *** KAN module significantly enhances the model’s understanding of complex scenarios through deep contextual semantic *** module handles samples, effectively improving the model’s ability to distinguish fine-grained *** experimental outcomes across diverse datasets underscore the remarkable performance of *** results vividly demonstrate the unique advantages and robust reliability of VPM-Net in fine-grained retrieval tasks.
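As a rough illustration of two ideas named in the abstract, the sketch below shows (a) retrieval in a shared embedding space by cosine similarity, which is how CLIP-style image and text features are commonly compared, and (b) the basic mechanics of visual prompt tuning, where a small set of learnable prompt tokens is prepended to the patch-token sequence of a frozen backbone. This is not the VPM-Net implementation: the function names, array shapes, and toy values are assumptions made for illustration, and the KAN and MINP modules are not modeled because the abstract does not specify them.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each vector (last axis) to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def rank_gallery(query_emb, gallery_emb):
    """Rank gallery embeddings by cosine similarity to one query embedding.

    query_emb:   shape (d,), e.g. a text or image feature (hypothetical input).
    gallery_emb: shape (n, d), features already mapped into the same space.
    Returns (indices sorted from most to least similar, similarity scores).
    """
    q = l2_normalize(query_emb)
    g = l2_normalize(gallery_emb)
    sims = g @ q
    return np.argsort(-sims), sims


def prepend_prompts(patch_tokens, prompt_tokens):
    """Prepend learnable prompt tokens to a patch-token sequence.

    patch_tokens:  shape (num_patches, d), produced by a frozen backbone.
    prompt_tokens: shape (num_prompts, d), the only parameters tuned in this sketch.
    """
    if patch_tokens.shape[1] != prompt_tokens.shape[1]:
        raise ValueError("prompt and patch tokens must share the embedding dimension")
    return np.concatenate([prompt_tokens, patch_tokens], axis=0)


if __name__ == "__main__":
    # Toy 2-D embeddings, chosen only to make the ranking easy to follow.
    query = np.array([1.0, 0.0])
    gallery = np.array([[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]])
    order, sims = rank_gallery(query, gallery)
    print(order, sims)

    # Two hypothetical prompt tokens in front of four patch tokens.
    tokens = prepend_prompts(np.zeros((4, 8)), np.ones((2, 8)))
    print(tokens.shape)
```

In a real system the embeddings would come from the pre-trained image and text encoders, and only the prompt tokens (plus any task-specific heads) would receive gradient updates while the backbone stays frozen.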