Community Detection Algorithm Based on Density Peak Clustering and Label Propagation

Abstract: The goal of community detection is to discover the structure, behavior, and organization of complex networks. The label propagation algorithm (LPA) is a fast and effective community detection algorithm. However, the original label propagation algorithm does not make full use of the structural and feature information of nodes, and its propagation process is unstable. To address these problems, this paper proposes DPC-LPA, a community detection algorithm for directed weighted complex networks that combines an improved density peak clustering algorithm with label propagation. The algorithm first weights each node according to its structure and features, making full use of both kinds of information. It then uses the improved density peak clustering algorithm to find the community centers of the network and builds the initial communities from them, which improves the quality of the community division. Next, it determines the update order of label propagation from node similarity and node weight, and completes the propagation by measuring the strength of label propagation between nodes, which addresses the instability of the label propagation algorithm. Finally, DPC-LPA is compared with the DCN, WCF-LPA, and CLPE algorithms on the real-world datasets CiteSeer, Cora, WebKB, and SCHOLAT. The experimental results demonstrate the feasibility and effectiveness of DPC-LPA: in terms of modularity, the communities it produces have a more pronounced community structure; in terms of the Adjusted Rand Index, the quality of its community division is more stable; and in terms of running time, it is efficient.
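The two-stage idea described in the abstract (pick community centers by density peaks, then spread their labels in a fixed order) can be illustrated with a minimal sketch. This is not the DPC-LPA algorithm itself: the abstract does not give its formulas, so the sketch assumes an undirected, unweighted, connected graph, uses node degree as the density, hop distance as the distance measure, and a neighbor's degree as its propagation strength. The function names `bfs_distances`, `pick_centers`, and `propagate`, and the toy graph, are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pick_centers(adj, k):
    """Density-peak style seeding. Assumption: density rho = degree."""
    rho = {u: len(adj[u]) for u in adj}
    order = sorted(adj, key=lambda u: (-rho[u], u))  # densest first
    gamma = {}
    for i, u in enumerate(order):
        dist = bfs_distances(adj, u)
        if i == 0:
            delta = max(dist.values())  # densest node: farthest distance
        else:
            # distance to the nearest node ranked denser than u
            delta = min(dist[v] for v in order[:i])
        gamma[u] = rho[u] * delta
    centers = sorted(adj, key=lambda u: (-gamma[u], u))[:k]
    return centers, rho


def propagate(adj, centers, rho, max_iter=20):
    """Spread center labels; denser nodes update first (assumed order)."""
    labels = {c: c for c in centers}  # center labels stay fixed
    order = [u for u in sorted(adj, key=lambda u: (-rho[u], u))
             if u not in labels]
    for _ in range(max_iter):
        changed = False
        for u in order:
            scores = {}
            for v in adj[u]:
                if v in labels:
                    # assumed strength: a neighbor votes with its degree
                    scores[labels[v]] = scores.get(labels[v], 0) + rho[v]
            if not scores:
                continue  # no labeled neighbor yet; retry next pass
            best = max(sorted(scores), key=scores.get)  # ties: smallest label
            if labels.get(u) != best:
                labels[u] = best
                changed = True
        if not changed:
            break
    return labels


# Toy graph: two hub-centered groups joined by the single edge (4, 6).
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4),
         (5, 6), (5, 7), (5, 8), (5, 9), (6, 7), (8, 9), (4, 6)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

centers, rho = pick_centers(adj, k=2)
labels = propagate(adj, centers, rho)
print(centers, labels)
```

Fixing the update order and breaking ties deterministically is what makes repeated runs of this sketch return the same partition, which is the kind of stability the abstract contrasts with the random update order of plain label propagation.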

     
