ZHOU Chun, JIANG Yuncheng. Text Labeling Algorithm Based on Conceptual-Semantic Relatedness and LDA[J]. Journal of South China Normal University (Natural Science Edition), 2018, 50(4): 121-128. DOI: 10.6054/j.jscnun.2018088

Text Labeling Algorithm Based on Conceptual-Semantic Relatedness and LDA

  • To improve the efficiency of text labeling and classification, this paper proposes TML, an automatic text labeling algorithm based on conceptual-semantic relatedness and LDA (Latent Dirichlet Allocation). TML is intended to replace manual labeling of text classification tags. The algorithm computes the semantic relatedness between concepts, uses LDA to extract a topic representation of each text, and then labels the text automatically by computing the expectation that the topic of the text belongs to a given category. To verify the effectiveness of TML, supervised text categorization experiments were run on standard text categorization datasets: three classifiers (Rocchio, KNN, SVM) were evaluated on three datasets (WebKB, Reuters-21578, and 20-NewsGroup). The experimental results show that TML can effectively improve the efficiency of text classification and text labeling.
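
The labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not give TML's formulas, so everything below is an assumption made for illustration and not the authors' implementation: the document-topic distribution `doc_topics` and the topic-word distributions `topic_words` are assumed to come from an already trained LDA model, `relatedness` is a hypothetical stand-in for the conceptual-semantic relatedness measure, and the score is one plausible reading of "the expectation that the topic of the text belongs to a certain category".

```python
from typing import Callable, Dict, List


def category_scores(
    doc_topics: List[float],
    topic_words: List[Dict[str, float]],
    categories: List[str],
    relatedness: Callable[[str, str], float],
) -> Dict[str, float]:
    """Expected word-to-category relatedness for one document.

    Assumed scoring (not taken from the paper):
        score(c) = sum_k P(topic k | doc) * sum_w P(w | topic k) * rel(w, c)
    """
    scores: Dict[str, float] = {}
    for category in categories:
        expected = 0.0
        for p_topic, words in zip(doc_topics, topic_words):
            # Relatedness of this topic to the category, averaged over
            # the topic's word distribution.
            topic_score = sum(
                p_word * relatedness(word, category)
                for word, p_word in words.items()
            )
            expected += p_topic * topic_score
        scores[category] = expected
    return scores


def label_document(
    doc_topics: List[float],
    topic_words: List[Dict[str, float]],
    categories: List[str],
    relatedness: Callable[[str, str], float],
) -> str:
    """Pick the category with the highest expected relatedness."""
    scores = category_scores(doc_topics, topic_words, categories, relatedness)
    return max(scores, key=scores.get)


# Toy data (hypothetical): two LDA topics and a hand-written relatedness table.
TOY_RELATEDNESS = {
    ("goal", "sports"): 0.9,
    ("team", "sports"): 0.8,
    ("team", "finance"): 0.1,
    ("stock", "finance"): 0.9,
    ("market", "finance"): 0.8,
}


def toy_relatedness(word: str, category: str) -> float:
    # Unknown (word, category) pairs are treated as unrelated.
    return TOY_RELATEDNESS.get((word, category), 0.0)


toy_doc_topics = [0.7, 0.3]
toy_topic_words = [
    {"goal": 0.6, "team": 0.4},
    {"stock": 0.5, "market": 0.5},
]
toy_categories = ["sports", "finance"]

toy_label = label_document(
    toy_doc_topics, toy_topic_words, toy_categories, toy_relatedness
)
print(toy_label)
```

In practice the toy table would be replaced by a real relatedness measure over a concept resource, and the two distributions by the output of a trained LDA model; labels produced this way could then serve as training labels for classifiers such as Rocchio, KNN, or SVM, as in the experiments the abstract describes.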
