Abstract:
Because existing models cannot fully exploit the semantic information in entity description texts and relation paths, a new knowledge graph embedding model (named GETR) is proposed. First, LDA is used to enrich the semantics of entity description texts, and TWE is used to obtain word embeddings and topic embeddings. To enhance entity representations, a modified Bi-LSTM model encodes the word and topic embeddings. Furthermore, multi-step paths between entity pairs are collected through random walks guided by a strategy that combines PageRank and cosine similarity. Additionally, to filter noise and improve the efficiency of the model, a self-attention mechanism captures the important semantics of each multi-step path, which is then used in joint training with the translation model. Finally, the proposed GETR model, together with the baseline models TransE, DKRL, and TKGE, is evaluated on the tasks of knowledge graph completion and entity classification over three datasets: FB15K, FB20K, and WN18. Experimental results demonstrate that GETR outperforms the baselines, indicating that the new model is more effective for knowledge representation.
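The path-collection step above — random walks whose next-hop choice mixes PageRank importance with cosine similarity — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the mixing weight `alpha`, the power-iteration PageRank, and the score-normalization details are all assumptions introduced here for clarity.

```python
import numpy as np

def pagerank(adj, d=0.85, iters=50):
    # Standard power-iteration PageRank over an adjacency matrix.
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    out[out == 0] = 1.0          # avoid division by zero for dangling nodes
    trans = adj / out            # row-stochastic transition matrix
    pr = np.full(n, 1.0 / n)
    for _ in range(iters):
        pr = (1 - d) / n + d * (trans.T @ pr)
    return pr

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def sample_path(adj, emb, start, length, alpha=0.5, rng=None):
    # Random walk in which the next hop is drawn with probability proportional
    # to alpha * PageRank(neighbor) + (1 - alpha) * cosine(current, neighbor).
    # `alpha` is a hypothetical mixing weight, not taken from the paper.
    rng = rng or np.random.default_rng(0)
    pr = pagerank(adj)
    path, cur = [start], start
    for _ in range(length):
        nbrs = np.flatnonzero(adj[cur])
        if nbrs.size == 0:
            break
        scores = np.array([alpha * pr[n] + (1 - alpha) * cosine(emb[cur], emb[n])
                           for n in nbrs])
        scores = np.clip(scores, 1e-9, None)   # keep probabilities positive
        cur = int(rng.choice(nbrs, p=scores / scores.sum()))
        path.append(cur)
    return path
```

Under this sketch, each sampled path is a sequence of connected entities biased toward globally important (PageRank) and semantically related (cosine) neighbors, which is the kind of multi-step evidence the self-attention layer would then weigh.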