The Classification of Lexical Category Sequences of Natural Human Language

  • Abstract: Natural language data were collected from 142 topic-sentence writing assignments, and a Chinese natural language processing platform was used to construct the lexical category (part-of-speech) sequence of each sample. After coarse-graining the language structure, classification models were built on sequences of four content-word categories: nouns, verbs, adjectives, and pronouns. The model based on lexical category content achieved an accuracy of 90%, and the model based on word-order position achieved an accuracy of 95%. These results suggest that lexical category sequence classification has good application value in the study of language cognition: it can reveal and corroborate the correlation between language and psychological information, and it allows implicit psychological information to be assessed scientifically through objective linguistic symbols.
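The two feature views named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag-to-category mapping (first letter `n`, `v`, `a`, `r`, as in ICTCLAS-style Chinese tagsets), the reading of "content" as relative frequency, the reading of "position" as mean relative position, and all function names are assumptions made here for illustration only.

```python
from collections import Counter

# Assumed mapping from the first letter of a fine-grained tag to the four
# coarse categories in the abstract: noun, verb, adjective, pronoun.
COARSE = {"n": "N", "v": "V", "a": "A", "r": "P"}
CATEGORIES = ["N", "V", "A", "P"]


def coarse_grain(tagged):
    """Reduce (word, tag) pairs to a sequence over N/V/A/P, dropping other tags."""
    return [COARSE[tag[0]] for _, tag in tagged if tag and tag[0] in COARSE]


def content_features(seq):
    """Relative frequency of each category (one possible reading of 'content')."""
    counts = Counter(seq)
    total = len(seq)
    return {c: (counts[c] / total if total else 0.0) for c in CATEGORIES}


def position_features(seq):
    """Mean relative position (0 = first, 1 = last) of each category.

    Categories that never occur map to None. This is only one possible
    encoding of word-order position.
    """
    span = max(len(seq) - 1, 1)
    positions = {c: [] for c in CATEGORIES}
    for i, c in enumerate(seq):
        positions[c].append(i / span)
    return {c: (sum(p) / len(p) if p else None) for c, p in positions.items()}


if __name__ == "__main__":
    # Hypothetical tagged sentence; the particle tagged "u" is discarded.
    tagged = [("我", "r"), ("喜欢", "v"), ("红", "a"), ("苹果", "n"), ("的", "u")]
    seq = coarse_grain(tagged)
    print(seq)
    print(content_features(seq))
    print(position_features(seq))
```

Either feature dictionary could then be passed to any standard classifier; the abstract does not say which classifier the study used.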

     
