Comparing TransE and TransH Algorithms in Spatial Address Representation Learning: A Case Study of Tianhe District, Guangzhou City

WANG Xin, LI Weihong, TONG Haoxin

Citation: WANG Xin, LI Weihong, TONG Haoxin. Comparing TransE and TransH Algorithms in Spatial Address Representation Learning: A Case Study of Tianhe District, Guangzhou City[J]. Journal of South China Normal University (Natural Science Edition), 2020, 52(4): 86-94. DOI: 10.6054/j.jscnun.2020065

Funding: Science and Technology Planning Project of Guangdong Province (2017B030305005)

Corresponding author: LI Weihong, Professor, Email: hongweili9981@163.com

  • Chinese Library Classification (CLC) number: TP182; P28

  • Abstract: This study integrates geographic knowledge into spatial addresses and investigates knowledge representation learning methods that fuse spatial and semantic information. A spatial address dataset is trained on two classical translation models, TransE and TransH, and the models are compared through triple classification and an evaluation of the distances between entity vectors. The results show that: (1) in the representation learning task for address entities, TransH clearly outperforms TransE at modeling complex relations; (2) adding spatial relations to semantic knowledge effectively addresses two problems: address entities that are semantically similar but spatially distant, and address entities that are spatially close but semantically dissimilar. Fusing semantic and spatial relations can reveal more valuable information, supports further completion of geographic knowledge graphs, and offers a methodological reference for representation learning on geographic knowledge graphs.
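    The two scoring functions compared in the abstract can be sketched as follows. This is a minimal illustration of the standard TransE and TransH formulations in plain Python, not the authors' training code; the function names, the toy vectors, and the choice of the L2 norm are assumptions made for the example.

```python
import math


def norm(v):
    """Euclidean (L2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def transe_distance(h, r, t):
    """TransE: a true triple should satisfy h + r ≈ t, so score by ||h + r - t||."""
    return norm([hi + ri - ti for hi, ri, ti in zip(h, r, t)])


def project(v, w):
    """Project v onto the hyperplane whose normal is w (w is normalised here)."""
    n = norm(w)
    unit = [x / n for x in w]
    dot = sum(a * b for a, b in zip(v, unit))
    return [a - dot * b for a, b in zip(v, unit)]


def transh_distance(h, d_r, t, w_r):
    """TransH: project h and t onto the relation hyperplane, then translate by d_r."""
    h_perp = project(h, w_r)
    t_perp = project(t, w_r)
    return norm([a + b - c for a, b, c in zip(h_perp, d_r, t_perp)])


# Toy 3-d vectors (hypothetical values, chosen so the arithmetic is easy to follow).
h = [1.0, 0.0, 2.0]
r = [0.0, 1.0, 0.0]
w_r = [0.0, 0.0, 1.0]      # hyperplane normal for the relation
t1 = [1.0, 1.0, 5.0]       # two different tails for the same head and relation
t2 = [1.0, 1.0, -4.0]

print(transe_distance(h, r, t1), transe_distance(h, r, t2))
print(transh_distance(h, r, t1, w_r), transh_distance(h, r, t2, w_r))
```

    In this toy case the two tails differ only along the hyperplane normal, so TransH gives both a distance of zero while TransE cannot fit both at once. That is the usual intuition for why TransH copes better with one-to-many relations.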
  • Figure 1. Core idea of the TransE model

    Figure 2. Core idea of the TransH model

    Figure 3. The address knowledge graph

    Figure 4. Training flow of the TransE and TransH models

    Figure 5. Vector distances between entities in the TransE model

    Figure 6. Vector distances between entities in the TransH model

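    Table 1. Examples of address hierarchy relations

    | Sample No. | Original address | Address hierarchy |
    |---|---|---|
    | 1 | 广东省广州市天河区石牌街道办 (Shipai Subdistrict Office, Tianhe District, Guangzhou, Guangdong Province) | Province - City - District - Street |
    | 2 | 广州市天河区林和街天誉社区居委会 (Tianyu Community Residents' Committee, Linhe Street, Tianhe District, Guangzhou) | City - District - Township - Village |
    | 3 | 广东省广州市中山大道华南师范大学 (South China Normal University, Zhongshan Avenue, Guangzhou, Guangdong Province) | Province - City - Street - Point of interest |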
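    One way to turn a hierarchical address like the samples above into knowledge-graph triples is to link each address element to its parent level. The sketch below assumes the address has already been segmented by hand (the segmentation step is not shown), and the relation name `located_in` is a hypothetical placeholder, not the relation vocabulary used in the paper.

```python
def address_to_triples(levels, relation="located_in"):
    """Build (head, relation, tail) triples from a segmented address.

    levels: list of (level_name, entity) pairs ordered from coarse to fine.
    Each finer element becomes the head and its parent the tail.
    """
    triples = []
    for (_, parent), (_, child) in zip(levels, levels[1:]):
        triples.append((child, relation, parent))
    return triples


# Sample 1, segmented by hand for illustration.
sample = [
    ("province", "广东省"),
    ("city", "广州市"),
    ("district", "天河区"),
    ("street", "石牌街道办"),
]

for triple in address_to_triples(sample):
    print(triple)
```

    A four-level address yields three parent-child triples. Spatial relations such as adjacency would need coordinates or a gazetteer and are outside this sketch.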

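    Table 2. Data scale

    | Category | Type | Count |
    |---|---|---|
    | Entity | Street (街道) | 40 |
    | Entity | Neighborhood or village committee (居村委) | 206 |
    | Entity | Information point (信息点) | 118,807 |
    | Relation | Street relation | 118,807 |
    | Relation | Neighborhood or village committee relation | 118,807 |
    | Relation | Adjacency relation (临近关系) | 1,183,155 |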

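    Table 3. Vector distances between entities in a triple

    | Measure | TransE | TransH |
    |---|---|---|
    | Vector distance between entities in positive triples (h + r - t) | 0.531 | 0.525 |
    | Vector distance between entities in negative triples (h′ + r - t′) | 1.388 | 1.379 |
    | Difference between negative and positive distances | 0.857 | 0.854 |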
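    The last row of Table 3 is the negative-triple distance minus the positive-triple distance. The sketch below reproduces that arithmetic from the reported values and shows how such distances enter a margin-based ranking loss, the usual training objective for translation models. The margin of 1.0 is an assumption for illustration, not a setting reported here, and the loss is applied to the reported distances only to show the formula.

```python
def distance_gap(pos, neg):
    """Gap between negative-triple and positive-triple distances, to 3 decimals."""
    return round(neg - pos, 3)


def margin_ranking_loss(pos, neg, margin=1.0):
    """Hinge loss max(0, margin + d(positive) - d(negative)).

    The loss is zero once negatives are at least `margin` farther than positives.
    """
    return max(0.0, margin + pos - neg)


# (positive distance, negative distance) as reported in Table 3.
reported = {"TransE": (0.531, 1.388), "TransH": (0.525, 1.379)}

for model, (pos, neg) in reported.items():
    gap = distance_gap(pos, neg)
    loss = round(margin_ranking_loss(pos, neg), 3)
    print(model, "gap:", gap, "loss at margin 1.0:", loss)
```

    A larger gap means positive and negative triples are easier to separate with a distance threshold, which is what triple classification relies on.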

Publication history
  • Received: 2019-12-23
  • Published online: 2021-03-21
  • Issue date: 2020-08-24
