Abstract:
To address the problem that pairwise- and triplet-based sampling in cross-modal retrieval constructs redundant, uninformative sample pairs, a cross-modal retrieval method based on batch loss (BLCMR) is proposed. First, a batch loss is introduced that accounts for the similarity of embedded samples, effectively preserving the invariance of cross-modal samples. Second, an iterative method is introduced to correct the predicted category labels, effectively distinguishing the semantic category information of the samples. Experimental results on three public datasets (Wikipedia, Pascal Sentence, and NUS-WIDE-10k) show that BLCMR effectively improves the accuracy of cross-modal retrieval.
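The abstract does not give the exact formulation of the batch loss, so the following is only an illustrative sketch of the general idea it describes: scoring every image embedding against every text embedding in a batch, so that all cross-modal pairs contribute to the loss and no explicit pair or triplet mining is needed. The function name, temperature parameter, and the softmax cross-entropy form are assumptions, not the paper's definition.

```python
import numpy as np

def batch_loss(img_emb, txt_emb, temperature=0.1):
    """Illustrative batch-wise loss (an assumption, not BLCMR's exact form):
    each image is scored against every text in the batch, so informative
    negatives come for free instead of being mined as pairs/triplets.
    Matched image-text pairs are assumed to share the same row index."""
    # L2-normalize embeddings so the dot product is cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature      # (B, B) cross-modal similarity matrix
    labels = np.arange(len(img))            # diagonal entries are the true pairs

    def xent(l):
        # Numerically stable softmax cross-entropy over each row.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(l)), labels].mean()

    # Average over both retrieval directions: image->text and text->image.
    return 0.5 * (xent(logits) + xent(logits.T))
```

When the two modalities' embeddings of matched samples are close, the diagonal dominates each row of the similarity matrix and the loss is small; mismatched batches yield a large loss, which is the invariance property the batch loss is meant to enforce.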