Abstract:
Measuring the similarity between legal documents plays a vital role in ensuring consistency in judicial practice. Although existing methods compute such similarity from textual content and citation relationships, they often fail to capture the semantics of legal documents accurately because they lack sufficient understanding of legal domain knowledge and of inter-document citation structures. To address this issue, a heterogeneous legal judgment graph that incorporates cases, sentence segments, keywords, and statutory provisions is constructed. Based on this graph, an Adaptive Legal Dynamic Heterogeneous Graph Representation Learning (ALDHGRL) model is proposed. The model first applies a type-specific neighbor walk sampling strategy to generate multiple random walk sequences for each node in the graph. These sequences are then used to construct the node's neighborhood graph. A bidirectional LSTM network is employed to aggregate the initial node features, followed by a cross-attention mechanism that captures information from same-type neighbors. An attention-based fusion mechanism is further introduced to integrate features across different node types, resulting in a low-dimensional representation for each node. Finally, a legal provision distinction network is incorporated, and an optimization loss function is constructed to achieve globally optimal node embeddings. To evaluate recommendation performance, ALDHGRL is compared with four baseline models (TF-IDF-GloVe, GAT, Node2Vec, and Metapath2Vec) on the CALL-small and CALL-big datasets. Experimental results show that on the CALL-big dataset, the ALDHGRL model achieves a hit@7 of 0.975 and an ndcg@7 of 0.912 in case matching and law article recommendation tasks, representing improvements of 5.3% and 4.5%, respectively, over the best baseline model (GAT).
On the CALL-small dataset, the ALDHGRL model achieves a hit@7 of 0.948 and an ndcg@7 of 0.895, improvements of 4.7% and 5.4%, respectively, over the best baseline model (GAT). These results demonstrate the model's effectiveness and interpretability in legal recommendation scenarios. Moreover, the ALDHGRL model effectively mitigates confusion arising from semantically similar legal provisions, while its dynamic neighborhood graph update mechanism enables rapid adaptation to new cases and ensures accurate recommendations. Ablation studies further confirm that both the dynamic neighborhood graph update mechanism and the legal provision distinction network are critical to the performance improvements, and that their integration significantly enhances the model's capability in legal case recommendation tasks.
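The abstract reports hit@7 and ndcg@7 without defining them. As a reading aid, the sketch below shows how these two ranking metrics are commonly computed for a single query, assuming binary relevance (a recommended case or law article is either relevant or not). The function names `hit_at_k` and `ndcg_at_k` and the toy identifiers are hypothetical and are not taken from the paper; the exact evaluation protocol used for ALDHGRL may differ.

```python
import math
from typing import Sequence, Set


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return 1.0 if any relevant item appears in the top-k results, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Normalized discounted cumulative gain at k, assuming binary relevance."""
    # DCG: each relevant item at 0-based rank i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (at most k of them) placed at the top.
    ideal_hits = min(len(relevant), k)
    if ideal_hits == 0:
        return 0.0
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg


# Toy example with made-up identifiers: one relevant article, ranked second.
ranked_articles = ["art_12", "art_264", "art_133"]
relevant_articles = {"art_264"}
print(hit_at_k(ranked_articles, relevant_articles, k=7))
print(ndcg_at_k(ranked_articles, relevant_articles, k=7))
```

Dataset-level scores such as those quoted above would then typically be the mean of these per-query values over all test queries, which is again an assumption about the protocol rather than something the abstract states.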