A Neural News Recommendation Based on Multi-aspect Article Representation

  • Abstract: Neural news recommendation methods can effectively personalize recommendations for users, but existing neural approaches do not fully exploit the features of news articles. To extract highly abstract feature representations from news, this paper proposes MUSA, a news recommendation model based on multi-view representation. The model has two core components: a news encoder and a user interest encoder. The news encoder combines a Transformer with a word-level attention network to learn news representations from five views: title, abstract, entities, category, and subcategory. Five modules each extract the information of one view, and the resulting representations are fused into the final news representation. The user interest encoder applies multi-head self-attention and a news-level attention network to capture a user's interest preferences from the user's browsing history. MUSA was compared with NPA, LSTUR, NRMS, and other models on three real-world datasets. Ablation experiments examined the contribution of each module in the news encoder, and a further set of experiments analyzed the effect of training set size on model performance. The comparison results show that MUSA outperforms the baseline models on AUC, MRR, nDCG@5, and nDCG@10. The ablation results show that the multi-view news encoding approach performs best. The training set size analysis shows that MUSA is more robust than the baseline models.
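The word-level and news-level attention networks mentioned in the abstract can be illustrated with a minimal sketch of additive attention pooling. The abstract does not give the exact formulation, so the scoring function below (a tanh projection followed by a dot product with a query vector, a form commonly used in neural news recommenders), the function name `additive_attention_pool`, and the toy parameters are all assumptions made for illustration, not MUSA's actual implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention_pool(vectors, W, b, q):
    """Pool a sequence of vectors into a single vector (hypothetical helper).

    Assumed scoring form: score_i = q . tanh(W v_i + b).
    The weights are softmax(scores); the output is the weighted sum of v_i.
    W is a list of rows (hidden_dim x input_dim); b and q have hidden_dim entries.
    """
    scores = []
    for v in vectors:
        hidden = [
            math.tanh(sum(w * x for w, x in zip(row, v)) + b_j)
            for row, b_j in zip(W, b)
        ]
        scores.append(sum(q_j * h_j for q_j, h_j in zip(q, hidden)))
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(a * v[k] for a, v in zip(weights, vectors)) for k in range(dim)]
    return pooled, weights


# Toy input: three 2-dimensional "word" vectors with hand-picked parameters.
# In a trained model, W, b, and q would be learned.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, chosen for readability
b = [0.0, 0.0]
q = [1.0, 1.0]

pooled, weights = additive_attention_pool(words, W, b, q)
print("attention weights:", weights)
print("pooled vector:", pooled)
```

The same pooling routine could in principle be reused at several levels: over word vectors to form a view representation, and over the representations of clicked news to form a user representation. Whether MUSA fuses its five views this way or by another method, such as concatenation, is not stated in the abstract.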

     
