Judicial Case Recommendation Model Based on Adaptive Legal Dynamic Heterogeneous Graph Representation Learning

    Abstract: Measuring the similarity between legal documents is central to judicial case recommendation and plays a vital role in ensuring consistency in judicial practice. Existing methods compute this similarity from textual content and citations, but they often fail to capture the semantics of legal documents accurately, mainly because they lack a deep understanding of legal domain knowledge and of the citation relationships between documents. To address this issue, the article constructs a heterogeneous legal judgment graph whose nodes are cases, sentence segments, keywords, and statutory provisions, and proposes an Adaptive Legal Dynamic Heterogeneous Graph Representation Learning (ALDHGRL) model. The model first applies type-based neighbor walk sampling to generate multiple random walk sequences for each node in the graph, and uses these sequences to construct each node's neighborhood graph. A BiLSTM network then aggregates the initial node features, and a cross-attention mechanism aggregates the features of same-type neighbors. Next, an attention mechanism fuses the features of different node types to obtain a low-dimensional representation of each node. Finally, a legal provision distinction network is integrated and an optimization loss function is constructed on top of it to obtain globally optimal node representations. To evaluate recommendation performance, ALDHGRL is compared with four baseline models (TF-IDF-Glove, GAT, Node2Vec, and Metapath2Vec) on the CAIL-small and CAIL-big datasets. On CAIL-small, ALDHGRL achieves a hit@7 of 0.948 and an ndcg@7 of 0.895 on the case matching and law article recommendation tasks, improvements of 4.7% and 5.4%, respectively, over the best baseline model (GAT). On CAIL-big, it achieves a hit@7 of 0.975 and an ndcg@7 of 0.912, improvements of 5.3% and 4.5%, respectively, over the best baseline model (GAT). These results support the model's effectiveness and interpretability in legal recommendation scenarios. The ALDHGRL model also mitigates the confusion caused by similar legal provisions, while its dynamic neighborhood graph update mechanism enables rapid adaptation to new cases and accurate recommendations for them. Ablation experiments further show that both the dynamic neighborhood graph update mechanism and the legal provision distinction network are critical to the performance gains, and that combining the two yields better case recommendation performance.
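The abstract reports hit@7 and ndcg@7 but does not define them. As a point of reference, the sketch below computes both metrics for a single query under the common binary-relevance definitions; this is an assumption, not a description of the authors' evaluation code. The function names `hit_at_k` and `ndcg_at_k` and the sample item identifiers are hypothetical.

```python
import math
from typing import Sequence, Set


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 7) -> float:
    """Return 1.0 if at least one relevant item is in the top-k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 7) -> float:
    """Binary-relevance NDCG@k (assumed definition, gain 1 per relevant item)."""
    # Discounted cumulative gain: a hit at 0-based position r adds 1/log2(r + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (at most k) placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical ranking of law articles for one query case.
    ranking = ["article_264", "article_133", "article_234"]
    ground_truth = {"article_133"}
    print(hit_at_k(ranking, ground_truth, k=7))
    print(round(ndcg_at_k(ranking, ground_truth, k=7), 4))
```

Dataset-level scores such as those quoted in the abstract would then typically be the mean of these per-query values over all test queries.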

     
