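Studies of population distribution and related topics often face the problem that small-scale population data are partly missing. Taking Hefeng County, Hubei Province, as a case study, this paper first analyzes the relationship between land use and population distribution. Then, considering global versus local and linear versus nonlinear regression, it uses land-use types to simulate the population of administrative villages with missing data by three methods: geographically weighted regression (GWR), the grid method, and the BP neural network method. The simulations are compared and validated for accuracy from several perspectives. The results show that: (1) among the land-use types, cultivated land, woodland, urban, village, industrial and mining land, and transportation land are the main factors affecting village-level population distribution in the study area; (2) for the 30 surveyed villages, the error of the total population simulated by each of the three methods is less than 3%, and a village-by-village comparison of simulated and actual values shows that the BP neural network method simulates the village-level population distribution best, followed by the grid method, with the GWR method performing worst; (3) the population distribution of the villages in the study area shows a fairly high positive spatial correlation, and the population densities of the townships are not spatially independent but tightly clustered.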
Population data are often missing for small areas such as administrative villages, a problem that recurs in studies of population distribution and related research. In this context, we took Hefeng County in Hubei Province as the study area and analyzed the correlation between land-use type indices and population density. The village-level population distribution was simulated using the Geographically Weighted Regression (GWR) method, the grid method, and the BP neural network method. Then, from the global versus local and linear versus nonlinear perspectives, the simulation accuracy for villages with missing population data was assessed by cross-validation between the simulated and the actual population. The results show that: (1) among all land-use types, the main factors affecting population distribution are farmland, woodland, urban, village, industrial and mining land, and transportation land; (2) for all three simulation methods, the error of the simulated total population is less than 3% for the 30 surveyed villages. Comparing the ratio of the estimated to the actual population in each village, with a tolerance of 10%, the reliability of the GWR method is 43.33%, that of the grid method is 76.67%, and that of the BP neural network method is 86.67%. The BP neural network method is therefore the best of the three for the study area, and the grid method outperforms the GWR method. In addition, the prediction accuracy of nonlinear regression is higher than that of linear regression; (3) the spatial distribution of population in the study area shows a high positive spatial correlation, with "high-high" agglomeration as the main type; the population densities of the townships in the county are not spatially independent but strongly agglomerated.
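The two accuracy measures quoted in result (2) can be illustrated with a short sketch. It is not the authors' code: the function names `total_error` and `reliability` and the four village values are hypothetical, and it assumes that "reliability" means the share of villages whose estimated-to-actual ratio lies within 10% of 1. Under that reading, the reported 43.33%, 76.67% and 86.67% would correspond to 13, 23 and 26 of the 30 villages.

```python
def total_error(simulated, actual):
    """Relative error of the simulated total population."""
    return abs(sum(simulated) - sum(actual)) / sum(actual)


def reliability(simulated, actual, tolerance=0.10):
    """Share of villages whose simulated/actual ratio is within 1 +/- tolerance.

    Assumed reading of the abstract's "reliability"; the paper may define it
    differently.
    """
    hits = sum(
        1 for s, a in zip(simulated, actual)
        if abs(s / a - 1.0) <= tolerance
    )
    return hits / len(actual)


# Hypothetical village populations, for illustration only (not from the paper).
actual = [1000, 800, 1200, 500]
simulated = [1050, 700, 1190, 540]

print(f"total error: {total_error(simulated, actual):.2%}")
print(f"reliability: {reliability(simulated, actual):.2%}")
```

The example also shows why both measures are reported: over- and under-estimates in individual villages partly cancel in the total, so a small total error does not by itself imply good village-level estimates.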