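Five meteorological factors that influence precipitation were selected: surface temperature, atmospheric pressure, wind speed, atmospheric precipitable water, and dew point temperature. Using the corresponding ECMWF data for 2000-2014 (15 years), precipitation was predicted with two data mining methods, the C4.5 decision tree and random forest. Random forest reached an average prediction accuracy of 84.36% and the C4.5 decision tree 82.63%, both far higher than the 75.11% average prediction accuracy of the information-entropy-based SLIQ. The average prediction accuracy of random forest is 1.73 percentage points higher than that of the C4.5 decision tree, but its model-building time is far longer than that of C4.5 and grows as the dataset grows.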
Precipitation prediction is a complex problem. Precipitation varies with temperature, pressure, wind, humidity, dew point, and other factors. Using the corresponding atmospheric data for 2000 to 2014 downloaded from ECMWF, this paper applies the C4.5 decision tree and random forest (RF) to predict precipitation. Their accuracies reach 82.63% and 84.36% respectively, both much higher than that of SLIQ (Supervised Learning In Quest). However, building the RF model takes much more time than building the C4.5 model, and that time increases with the size of the dataset.
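To illustrate the kind of split criterion a C4.5-style tree applies to a continuous meteorological attribute, the following is a minimal sketch of the gain ratio for a binary threshold split. It is not the paper's implementation: the function names, the toy dew point values, and the rain labels are hypothetical, and taking midpoints between consecutive distinct values as candidate thresholds is a simplification.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    weights = [len(left) / n, len(right) / n]
    # Information gain: entropy before the split minus weighted entropy after it.
    info_gain = entropy(labels) - (
        weights[0] * entropy(left) + weights[1] * entropy(right)
    )
    # Split information penalises very unbalanced partitions.
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Return (threshold, gain ratio) for the best midpoint candidate."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(
        ((t, gain_ratio(values, labels, t)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Hypothetical toy data: dew point (degrees C) and whether it rained (1) or not (0).
dew_point = [2.0, 4.0, 6.0, 8.0, 12.0, 14.0, 16.0, 18.0]
rained = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, ratio = best_threshold(dew_point, rained)
print(threshold, ratio)
```

A random forest would build many such trees on bootstrap samples with random attribute subsets and combine their votes, which is consistent with the longer model-building time reported above.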