SUN Yuqing, HUANG Tian, LI Chengtao, ZHENG Wei, TANG Yong. Survey on Text Semantic Hashing Technology[J]. Journal of South China Normal University (Natural Science Edition), 2024, 56(3): 93-105. DOI: 10.6054/j.jscnun.2024041

Survey on Text Semantic Hashing Technology

Text semantic hashing refers to neural techniques that encode texts into low-dimensional binary codes under semantic similarity constraints. Because hash codes support Hamming-distance-based retrieval, text similarity can be computed efficiently on massive data. Text semantic hashing faces several challenges, such as how to embed category information into low-dimensional binary codes, how to enrich semantic information to improve model robustness, and how to optimize a model over a discrete coding space. This survey first reviews the major progress in text semantic hashing and discusses the technical details of the methods, including unsupervised models based on text reconstruction and supervised models that integrate category information. It then analyzes key techniques such as semantic enhancement based on neighbor information and latent topic information, and model optimization. The datasets and evaluation metrics used for the text semantic hashing task are also summarized, and the performance of different methods is compared on that basis. Finally, future research directions are discussed.
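As a rough illustration of why binary codes make similarity search cheap, the sketch below ranks a few stored codes by Hamming distance to a query code. It is a minimal example under stated assumptions, not a method from the survey: the 8-bit codes are made up by hand rather than produced by a trained hashing model, and the helper names `hamming_distances` and `retrieve` are hypothetical.

```python
import numpy as np


def hamming_distances(query, codes):
    """Count differing bits between `query` and every row of `codes`."""
    return np.count_nonzero(codes != query, axis=1)


def retrieve(query, codes, k):
    """Return indices of the k stored codes closest to `query`.

    A stable sort keeps the original order among codes at equal distance.
    """
    distances = hamming_distances(query, codes)
    return np.argsort(distances, kind="stable")[:k]


# Hand-written 8-bit codes standing in for the output of a hashing model
# (an assumption made only for this illustration).
codes = np.array(
    [
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 0],
        [1, 1, 1, 1, 0, 0, 0, 0],
        [0, 1, 0, 0, 1, 1, 1, 1],
    ],
    dtype=np.uint8,
)
query = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=np.uint8)

top = retrieve(query, codes, k=2)
print(hamming_distances(query, codes), top)
```

In practice the bits are usually packed into machine words so that the distance reduces to an XOR followed by a population count; the unpacked array form here is chosen only for readability.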