Comparing TransE and TransH Algorithms in Spatial Address Representation Learning: A Case Study of Tianhe District, Guangzhou City
Abstract: This study integrates geographic knowledge into spatial addresses and examines knowledge representation learning methods that fuse spatial and semantic information. A spatial address dataset is trained on two classical translation models, TransE and TransH, and the two models are compared through triple classification and an evaluation of the distances between entity vectors. The results show that: (1) in the representation learning task for address entities, TransH is markedly better than TransE at modeling complex relations; (2) adding spatial relations on top of semantic knowledge effectively addresses two problems: address entities that are semantically similar but spatially distant, and address entities that are spatially close but semantically dissimilar. Fusing semantic and spatial relations can reveal more valuable information, supports the completion of geographic knowledge graphs, and offers a methodological reference for representation learning on geographic knowledge graphs.
Keywords: geographic knowledge graph; spatial address; knowledge representation learning; TransE; TransH
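The difference between the two translation models compared in the abstract can be illustrated with a minimal sketch of their distance functions. This is not the authors' implementation: the L2 norm, the toy three-dimensional vectors, and the helper names (`transe_distance`, `transh_distance`, `project`) are assumptions made for illustration. The toy case is a many-to-one relation, assumed here to be two points of interest that belong to the same street.

```python
import math

def norm(v):
    """Euclidean (L2) length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transe_distance(h, r, t):
    """TransE: a triple is plausible when h + r is close to t."""
    return norm([hi + ri - ti for hi, ri, ti in zip(h, r, t)])

def project(v, w):
    """Project v onto the hyperplane whose normal vector is w."""
    length = norm(w)
    unit = [x / length for x in w]
    scale = dot(unit, v)
    return [vi - scale * ui for vi, ui in zip(v, unit)]

def transh_distance(h, d_r, t, w_r):
    """TransH: project h and t onto the relation hyperplane, then translate by d_r."""
    h_p = project(h, w_r)
    t_p = project(t, w_r)
    return norm([a + b - c for a, b, c in zip(h_p, d_r, t_p)])

# Hypothetical toy embeddings (not taken from the paper).
poi_a = [0.0, 0.0, 0.5]      # head 1
poi_b = [0.0, 0.0, -0.5]     # head 2, a different entity
street = [1.0, 0.0, 0.0]     # shared tail
r = [1.0, 0.0, 0.0]          # translation vector, lies in the hyperplane
w = [0.0, 0.0, 1.0]          # hyperplane normal used by TransH only

transe = [transe_distance(p, r, street) for p in (poi_a, poi_b)]
transh = [transh_distance(p, r, street, w) for p in (poi_a, poi_b)]
print("TransE distances:", transe)
print("TransH distances:", transh)
```

In this sketch TransE cannot place both heads exactly at `street - r` because they are distinct vectors, whereas TransH maps both onto the same point of the hyperplane, so both triples reach distance zero. This is the usual argument for why TransH handles one-to-many and many-to-one relations better.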
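Table 1 Examples of address hierarchy relations

| Sample No. | Original address | Address hierarchy |
|---|---|---|
| 1 | 广东省广州市天河区石牌街道办 | Province - City - District - Subdistrict |
| 2 | 广州市天河区林和街天誉社区居委会 | City - District - Township - Village |
| 3 | 广东省广州市中山大道华南师范大学 | Province - City - Street - Point of interest |

Table 2 Data scale (unit: records)

| Category | Type | Count |
|---|---|---|
| Entity | Street | 40 |
| Entity | Residents'/villagers' committee | 206 |
| Entity | Information point | 118 807 |
| Relation | Street relation | 118 807 |
| Relation | Residents'/villagers' committee relation | 118 807 |
| Relation | Adjacency relation | 1 183 155 |

Table 3 Vector distance between the entities of a triple

| Measure | TransE | TransH |
|---|---|---|
| Vector distance between entities in positive triples (h + r - t) | 0.531 | 0.525 |
| Vector distance between entities in negative triples (h′ + r - t′) | 1.388 | 1.379 |
| Difference between negative and positive distances | 0.857 | 0.854 |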
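The last row of Table 3 can be reproduced from the two rows above it, and the same distances suggest how triple classification by thresholding works. The sketch below is illustrative only: the midpoint threshold and the `classify` helper are assumptions, since the page does not state how the decision threshold was chosen.

```python
# Mean distances as reported in Table 3.
table3 = {
    "TransE": {"positive": 0.531, "negative": 1.388},
    "TransH": {"positive": 0.525, "negative": 1.379},
}

def distance_gap(row):
    """Negative-triple distance minus positive-triple distance, to 3 decimals."""
    return round(row["negative"] - row["positive"], 3)

def classify(distance, threshold):
    """Label a triple as valid when its distance falls below the threshold."""
    return distance < threshold

gaps = {model: distance_gap(row) for model, row in table3.items()}

# Hypothetical threshold: the midpoint between the two mean distances.
thresholds = {
    model: (row["positive"] + row["negative"]) / 2
    for model, row in table3.items()
}

for model, row in table3.items():
    print(model, "gap:", gaps[model], "threshold:", round(thresholds[model], 4))
    print("  positive mean accepted:", classify(row["positive"], thresholds[model]))
    print("  negative mean accepted:", classify(row["negative"], thresholds[model]))
```

In practice a threshold of this kind is usually tuned per relation on a validation set rather than fixed at a midpoint.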