徐清振, 肖彬. 公共空间共享参数的跨模态检索研究[J]. 华南师范大学学报(自然科学版), 2023, 55(1): 88-93. DOI: 10.6054/j.jscnun.2023008
XU Qingzhen, XIAO Bin. A Study of Shared Parameters Cross-modal Retrieval in Common Spaces[J]. Journal of South China Normal University (Natural Science Edition), 2023, 55(1): 88-93. DOI: 10.6054/j.jscnun.2023008

公共空间共享参数的跨模态检索研究

A Study of Shared Parameters Cross-modal Retrieval in Common Spaces

  • Abstract: Data from different modalities differ substantially in structure and characteristics, which complicates cross-modal retrieval. To address this, a Shared Parameters Cross-modal Retrieval (SPCMR) method based on the common space approach is proposed. First, convolutional neural networks extract high-level semantic features from images and text. Then, fully connected layers map these features into a common space, with the 2 feature subnets sharing part of their hidden-layer weights. Finally, a linear classifier is attached and the model is trained discriminatively with label information. Experiments on public datasets use mean average precision (mAP) as the evaluation metric. The results show that SPCMR makes full use of the semantic information across modalities and effectively improves the accuracy of image-text retrieval.
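
The pipeline described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the layer sizes, the random initialization, the choice of exactly one shared hidden layer, the use of one classifier for both modalities, and the names `embed`, `class_logits` and `mean_average_precision` are all assumptions made for illustration. The sketch takes pre-extracted feature vectors as input (standing in for the CNN outputs), shows only the forward pass, and omits the discriminative training step entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    """Create a (weights, bias) pair for one fully connected layer."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical sizes (not from the paper): 4096-d image features,
# 300-d text features, 512-d hidden layer, 128-d common space, 10 classes.
IMG_DIM, TXT_DIM, HID_DIM, COMMON_DIM, N_CLASSES = 4096, 300, 512, 128, 10

img_in = dense(IMG_DIM, HID_DIM)           # image-specific layer
txt_in = dense(TXT_DIM, HID_DIM)           # text-specific layer
shared = dense(HID_DIM, COMMON_DIM)        # hidden layer shared by both subnets
classifier = dense(COMMON_DIM, N_CLASSES)  # linear classifier on the common space

def embed(features, modality_layer):
    """Map pre-extracted features of one modality into the common space."""
    w, b = modality_layer
    hidden = relu(features @ w + b)
    w_s, b_s = shared                      # same weights for images and text
    return hidden @ w_s + b_s

def class_logits(common):
    """Linear classifier scores; training would compare these with labels."""
    w, b = classifier
    return common @ w + b

def mean_average_precision(queries, q_labels, gallery, g_labels):
    """mAP for retrieval ranked by cosine similarity in the common space."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = q @ g.T
    aps = []
    for i in range(len(q)):
        order = np.argsort(-scores[i])     # most similar gallery items first
        relevant = (g_labels[order] == q_labels[i]).astype(float)
        if relevant.sum() == 0:
            continue                       # query with no relevant item is skipped
        precision_at_k = np.cumsum(relevant) / np.arange(1, len(relevant) + 1)
        aps.append((precision_at_k * relevant).sum() / relevant.sum())
    return float(np.mean(aps)) if aps else 0.0

# Usage with random stand-in data: image queries retrieve text items.
images = rng.normal(size=(8, IMG_DIM))
texts = rng.normal(size=(8, TXT_DIM))
labels = rng.integers(0, N_CLASSES, size=8)
img_common = embed(images, img_in)
txt_common = embed(texts, txt_in)
print(img_common.shape, class_logits(txt_common).shape)
print(mean_average_precision(img_common, labels, txt_common, labels))
```

Because the weights here are random and untrained, the printed mAP carries no meaning; the point is only that both modalities pass through the same `shared` layer and can therefore be compared directly in one space.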

     
