An Analysis of the Factors in Total Water Consumption Based on Random Forest Regression Algorithm: A Case Study of Guangdong Province
Abstract: A hierarchical evaluation system is constructed, comprising four factors (population, water resources, technology and economy) and nine elements (total resident population, population density, total water resources, rainfall, water consumption per 10 000 yuan of GDP, water consumption per 10 000 yuan of industrial added value, and the gross product of the primary, secondary and tertiary industries). The entropy method and the random forest regression algorithm are used to analyze the factors influencing total water consumption, taking the 21 prefecture-level cities of Guangdong Province as a case study. Three main results are obtained. First, at the element level, the total resident population, the gross product of the tertiary industry and the gross product of the primary industry are the main elements influencing total water consumption in Guangdong Province, while rainfall has the least influence on the total water consumption of the prefecture-level cities. Second, at the factor level, the four factors rank in descending order of influence as follows: economic, population, water resources and technology. Third, combining the element-level and factor-level analyses, the most influential elements within the population, water resources, technology and economic factors are, respectively, the total resident population, total water resources, water consumption per 10 000 yuan of industrial added value and the gross product of the tertiary industry.
Keywords:
- entropy
- random forest regression algorithm
- total water consumption
- factors
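The random forest regression step named in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn's `RandomForestRegressor` and its impurity-based `feature_importances_` as the ranking criterion; the abstract does not state which implementation or importance measure the authors used. The data are synthetic stand-ins, not the Guangdong data, and the helper `rank_elements`, the coefficients and the sample size are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Element codes follow Table 1; the values generated below are synthetic.
ELEMENTS = ["A11", "A12", "A21", "A22", "A31", "A32", "A41", "A42", "A43"]


def rank_elements(X, y, names, n_trees=200, seed=0):
    """Fit a random forest regressor and rank predictors by
    impurity-based importance, largest first (hypothetical helper)."""
    model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(names[i], float(importances[i])) for i in order]


# Synthetic stand-in data: 200 samples, 9 predictors (NOT the study data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, len(ELEMENTS)))
# Arbitrary response: driven mostly by column 0, weakly by column 8, plus noise.
y = 3.0 * X[:, 0] + 1.0 * X[:, 8] + rng.normal(scale=0.1, size=200)

ranking = rank_elements(X, y, ELEMENTS)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances sum to 1, so each score can be read as a share of the total. With only 21 cities, as in the case study, such rankings can be unstable, so repeating the fit over several seeds would be a reasonable check.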
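Table 1 Factors influencing total water consumption

| Factor layer | Element layer | Description |
| --- | --- | --- |
| Population factor (A1) | Total resident population (A11, 10 000 persons) | Positive element; the total resident population of a region during the statistical period. |
| | Population density (A12, persons·km^-2) | Positive element; the population per unit area of a region during the statistical period. |
| Water resources factor (A2) | Total water resources (A21, 10^8 m^3) | Positive element; the total water resources of a region during the statistical period. |
| | Rainfall (A22, mm) | Positive element; the amount of precipitation in a region during the statistical period. |
| Technology factor (A3) | Water consumption per 10 000 yuan of GDP (A31, m^3 per 10 000 yuan) | Negative element; the water used to produce the same amount of GDP. |
| | Water consumption per 10 000 yuan of industrial added value (A32, m^3 per 10 000 yuan) | Negative element; the water used to produce the same amount of industrial added value. |
| Economic factor (A4) | Gross product of the primary industry (A41, 10 000 yuan) | Positive element; the gross product of the primary industry of a region during the statistical period. |
| | Gross product of the secondary industry (A42, 10 000 yuan) | Positive element; the gross product of the secondary industry of a region during the statistical period. |
| | Gross product of the tertiary industry (A43, 10 000 yuan) | Positive element; the gross product of the tertiary industry of a region during the statistical period. |

Note: A31 and A32 are negative elements, i.e., the larger the element value, the smaller the factor-layer value; all others are positive elements, i.e., the larger the element value, the larger the factor-layer value. The direction of each element is marked so that the entropy method can be used below to quantify the factor layer from the element-layer values.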
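Table 2 The entropy value and weight of each element

| Factor | Element | Entropy value | Weight/% |
| --- | --- | --- | --- |
| Population factor (A1) | A11 | 0.8717 | 35.64 |
| | A12 | 0.7684 | 64.36 |
| Water resources factor (A2) | A21 | 0.8723 | 61.78 |
| | A22 | 0.9210 | 38.22 |
| Technology factor (A3) | A31 | 0.9540 | 50.69 |
| | A32 | 0.9552 | 49.31 |
| Economic factor (A4) | A41 | 0.8941 | 14.89 |
| | A42 | 0.7381 | 36.86 |
| | A43 | 0.6570 | 48.26 |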
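The weights in Table 2 appear consistent with the usual entropy-weight rule applied within each factor, w_j = (1 - e_j) / Σ_k (1 - e_k), where e_j is the entropy value of element j. That rule is an assumption here, since this section does not state the formula. The sketch below applies it to the entropy values copied from Table 2; the function name `entropy_weights` is hypothetical.

```python
# Entropy values e_j copied from Table 2, grouped by factor.
ENTROPY = {
    "A1": {"A11": 0.8717, "A12": 0.7684},
    "A2": {"A21": 0.8723, "A22": 0.9210},
    "A3": {"A31": 0.9540, "A32": 0.9552},
    "A4": {"A41": 0.8941, "A42": 0.7381, "A43": 0.6570},
}


def entropy_weights(entropies):
    """Assumed rule: w_j = (1 - e_j) / sum_k (1 - e_k) within one factor.
    Returns the weights in percent (hypothetical helper)."""
    divergence = {name: 1.0 - e for name, e in entropies.items()}
    total = sum(divergence.values())
    return {name: 100.0 * d / total for name, d in divergence.items()}


# Compute the weights factor by factor, so each factor's weights sum to 100%.
weights = {factor: entropy_weights(elems) for factor, elems in ENTROPY.items()}
for factor, w in weights.items():
    print(factor, {name: round(value, 2) for name, value in w.items()})
```

Because the published entropy values are rounded to four decimals, the recomputed weights can differ from the tabulated ones in the second decimal place. A lower entropy value means the element varies more across cities, so it receives a larger weight within its factor.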