Abstract:
Expert finding aims to accurately match user questions with potential experts who can provide high-quality answers, serving as a core task supporting applications such as question answering (Q&A) communities, enterprise search, and social networks. However, current methods are limited to a single core-knowledge perspective, making it difficult to comprehensively capture the latent characteristics of questions and experts from multiple perspectives, which restricts the accuracy of expert finding. To address this issue, a Large Language Model-based Multi-knowledge Perspective expert finding model (LLMef) is proposed. LLMef constructs a multi-knowledge perspective modeling mechanism that effectively integrates information from three perspectives: core knowledge, prerequisite knowledge, and advanced knowledge, achieving deep, fine-grained representations of questions and experts. Specifically, LLMef designs a Question Multi-Knowledge perspective (QMK) encoder, built on the open-source large language model LLaMA-2-7B, to represent questions from multiple knowledge perspectives. It further designs an Expert Multi-Knowledge perspective (EMK) aggregator that uses an attention mechanism to aggregate multi-knowledge perspective information from an expert's historically answered questions, producing a multi-knowledge perspective aggregated representation of the expert. The cooperation of QMK and EMK significantly improves the precision of question and expert modeling. To validate LLMef, comparative experiments, ablation experiments, and parameter sensitivity experiments are conducted, supplemented by case studies.
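The attention-based aggregation described above can be illustrated with a minimal sketch. This is not the paper's exact formulation: the function names, the dot-product scoring against a query-side vector, and the use of plain softmax pooling are all illustrative assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def aggregate_expert(history_embs, query_emb):
    """Attention-pool an expert's historical question embeddings.

    history_embs: (n, d) array, one row per historically answered question
                  (e.g. multi-knowledge perspective encodings).
    query_emb:    (d,) embedding of the target question.
    Returns a single (d,) expert representation: a convex combination of
    the history rows, weighted by relevance to the target question.
    """
    scores = history_embs @ query_emb        # (n,) relevance scores
    weights = softmax(scores)                # (n,) attention weights, sum to 1
    return weights @ history_embs            # (d,) aggregated representation

# Usage: 5 historical questions, 8-dim embeddings (toy sizes).
rng = np.random.default_rng(0)
history = rng.standard_normal((5, 8))
query = rng.standard_normal(8)
expert_repr = aggregate_expert(history, query)
```

Questions the expert answered that are more similar to the target question receive higher attention weights, so the pooled representation emphasizes the most relevant parts of the answering history.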
The comparative experiments demonstrate that LLMef achieves average improvements of 9.83%, 13.58%, and 6.25% in Mean Reciprocal Rank (MRR), Precision@K (P@K), and Normalized Discounted Cumulative Gain (NDCG), respectively, across six public datasets, compared with the best-performing baseline, TCQR. In the ablation experiments, relative to the full LLMef model, the MRR of the variant without the QMK encoder decreases by an average of 7.21%, while that of the variant without the EMK aggregator decreases by an average of 10.43%. In addition, the optimal parameter configuration of LLMef is determined through parameter sensitivity experiments, and case studies further reveal the internal logic by which LLMef improves expert finding performance. The results show that LLMef more precisely captures the latent knowledge associations between questions and experts, thereby delivering more reliable expert finding outcomes for practical scenarios such as Q&A communities and enterprise search systems.