Unsupervised Person Re-Identification Based on Local Feature Matching and Hybrid Contrastive Learning

    Abstract: Unsupervised person re-identification (UPR) is widely used in security engineering and smart-city applications. However, many existing UPR algorithms neglect local feature matching and spatial location information during feature extraction, and may discard a large number of unclustered samples during pseudo-label clustering. To overcome these drawbacks, this paper proposes LHFC, an unsupervised person re-identification method based on local feature matching and hybrid contrastive learning. First, because the network cannot extract feature information from different spatial locations, a self-similarity non-local attention mechanism (Non-local) is introduced into the ResNet50 backbone used for feature extraction. Second, to address local feature mismatch, a local feature matching module (Aligned) is designed, which accounts for the matching of human body structures while learning image similarity. Finally, because discarding unclustered samples during training leads to insufficient feature extraction, a hybrid cluster-level and instance-level memory (HCL) is proposed to store cluster-level identity features and outlier instance features. To verify the model's performance, LHFC is compared with 12 existing unsupervised methods on two public datasets (Market-1501 and DukeMTMC-ReID), and ablation experiments examine the effects of Non-local, Aligned, and HCL on the model. The comparative results show that LHFC achieves mAP scores of 84.4% and 71.5% on Market-1501 and DukeMTMC-ReID, respectively, improvements of 3.5% and 1.9% over CACL, the best-performing of the 12 compared methods.
The ablation results show that Non-local, Aligned, and HCL each improve accuracy: introducing Non-local into ResNet50 helps extract more useful person feature information and thereby better mark the spatial location relationships between local features; the Aligned module effectively fuses the corresponding human body structure information; and HCL reduces the errors caused by pseudo-labels in the later stages of training.
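
The hybrid cluster-level and instance-level memory can be illustrated with a minimal sketch. The abstract does not give the loss or its hyperparameters, so everything below is an assumption made for illustration: outliers are marked with pseudo-label -1 (as DBSCAN-style clustering does), cluster entries are mean features, the loss is an InfoNCE-style softmax over all memory entries, and the temperature is 0.05. The function names `build_hybrid_memory` and `hybrid_contrastive_loss` are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def build_hybrid_memory(features, pseudo_labels):
    """Build a memory with one entry per cluster and one per outlier.

    Assumption: pseudo-label -1 marks an unclustered (outlier) sample.
    Clustered samples are summarised by their normalized mean feature;
    outliers are kept as individual instance entries instead of being discarded.
    """
    memory = {}
    groups = {}
    for idx, (feat, label) in enumerate(zip(features, pseudo_labels)):
        if label == -1:
            memory[("instance", idx)] = l2_normalize(feat)
        else:
            groups.setdefault(label, []).append(feat)
    for label, feats in groups.items():
        dim = len(feats[0])
        mean = [sum(f[d] for f in feats) / len(feats) for d in range(dim)]
        memory[("cluster", label)] = l2_normalize(mean)
    return memory


def hybrid_contrastive_loss(query, positive_key, memory, temperature=0.05):
    """InfoNCE-style loss: pull the query toward its own memory entry
    and push it away from every other cluster and outlier entry."""
    q = l2_normalize(query)
    logits = {
        key: sum(a * b for a, b in zip(q, entry)) / temperature
        for key, entry in memory.items()
    }
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits.values())
    log_denom = max_logit + math.log(
        sum(math.exp(value - max_logit) for value in logits.values())
    )
    return log_denom - logits[positive_key]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
    labels = [0, 0, 1, -1]  # the last sample is an outlier
    mem = build_hybrid_memory(feats, labels)
    print(sorted(mem.keys()))
    print(hybrid_contrastive_loss([1.0, 0.0], ("cluster", 0), mem))
```

In this sketch an outlier still contributes a positive and a negative entry to the loss, which is the property the abstract attributes to HCL; how the actual method updates the memory during training is not stated in the abstract and is not modeled here.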

     
