Five machine learning models, namely k-nearest neighbors (KNN), multivariate adaptive regression splines (MARS), support vector machine (SVM), multinomial log-linear model (MLM), and artificial neural network (ANN), were used to correct two commonly used daily precipitation datasets over the Tibetan Plateau: ITPCAS (Institute of Tibetan Plateau Research, Chinese Academy of Sciences) and CMORPH (Climate Prediction Center morphing technique). The correction relates daily precipitation to environmental factors (elevation, slope, aspect, vegetation) and meteorological factors (air temperature, relative humidity, wind speed). Five-fold cross-validation shows that KNN achieves the highest correction accuracy of the five models. Error analysis at three validation stations (Tanggula, Xidatan, and Wudaoliang) and spatial analysis of annual precipitation over the plateau both show that KNN corrects CMORPH significantly, whereas it improves ITPCAS only to some extent and only in local areas. The spatial distribution of precipitation in ITPCAS and in KNN-corrected ITPCAS is more accurate than that in KNN-corrected CMORPH. Principal component analysis indicates that the precipitation correction results from the combined effects of meteorological and environmental factors.
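To make the evaluation procedure concrete, the following is a minimal sketch of a KNN-based correction assessed by five-fold cross-validation. It is an illustration under stated assumptions, not the implementation used in the study: the data are synthetic, only three of the predictors are mimicked (elevation, air temperature, relative humidity), and the function names (`make_synthetic`, `cross_validate`, `knn_predict`), the choice of k = 5 neighbours, and the form of the artificial bias are all hypothetical.

```python
import math
import random


def fit_scaler(rows):
    """Column means and standard deviations, computed on the training fold only."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def scale(rows, means, stds):
    """Standardize predictors so that no single factor dominates the distance."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def knn_predict(train_x, train_y, query, k):
    """Average the target over the k training samples nearest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def cross_validate(x, y, k=5, folds=5, seed=0):
    """Return one RMSE per fold; every sample is used for testing exactly once."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    scores = []
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        means, stds = fit_scaler([x[i] for i in train])
        train_x = scale([x[i] for i in train], means, stds)
        train_y = [y[i] for i in train]
        test_x = scale([x[i] for i in test], means, stds)
        pred = [knn_predict(train_x, train_y, q, k) for q in test_x]
        scores.append(rmse(pred, [y[i] for i in test]))
    return scores


def make_synthetic(n=300, seed=42):
    """Synthetic stand-in data (assumption): gauge 'truth' and a biased gridded product."""
    rng = random.Random(seed)
    x, y, raw = [], [], []
    for _ in range(n):
        elev = rng.uniform(3000, 5500)   # elevation, m
        temp = rng.uniform(-15, 15)      # air temperature, deg C
        hum = rng.uniform(20, 90)        # relative humidity, %
        truth = max(0.0, 0.08 * hum - 0.001 * (elev - 3000)
                    + 0.1 * temp + rng.gauss(0, 0.3))
        product = max(0.0, 1.5 * truth + 0.0008 * (elev - 3000)
                      + rng.gauss(0, 0.3))
        x.append([elev, temp, hum, product])
        y.append(truth)
        raw.append(product)
    return x, y, raw


x, y, raw = make_synthetic()
scores = cross_validate(x, y, k=5, folds=5)
print("RMSE of uncorrected product:", round(rmse(raw, y), 3))
print("Five-fold RMSE after KNN correction:", [round(s, 3) for s in scores])
```

The scaler is fitted on the training folds only, so the held-out fold does not influence the distance metric. On real data the predictor list would also include slope, aspect, vegetation, and wind speed, and the target would be station gauge precipitation.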