To address the shortcomings of neural network algorithms in PM2.5 concentration prediction, such as a tendency to over-fit, complex network structures, and low learning efficiency, the random forest regression (RFR) algorithm was introduced. By analyzing 22 characteristic factors covering meteorological conditions, air pollutant concentrations, and season, and by tuning the parameters to their optimal combination, a novel PM2.5 concentration prediction model named RFRP was designed. Historical meteorological data for Xi'an from 2013 to 2016 were collected to verify the effectiveness of the model. The experimental results show that the RFRP model not only predicts the PM2.5 concentration effectively but also improves running efficiency without sacrificing prediction accuracy. Its average run time is 0.281 s, about 5.88% of that of the back propagation neural network (BP-NN) prediction model (the English abstract as printed gives 5.58%).
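For readers unfamiliar with the technique, the general idea of random forest regression combined with a small parameter search can be illustrated with the minimal pure-Python sketch below. It is not the RFRP model itself: the data are synthetic, the three features are made-up stand-ins for the 22 factors, and all function names and parameter values (`fit_forest`, `grid_search`, `n_trees`, `depth`, `min_leaf`, `n_try`) are assumptions chosen for illustration only.

```python
import random
import statistics


def sse(ys):
    """Sum of squared errors of ys around their mean."""
    m = statistics.fmean(ys)
    return sum((y - m) ** 2 for y in ys)


def build_tree(rows, ys, depth, min_leaf, n_try, rng):
    """Grow one regression tree: a float leaf or (feature, threshold, left, right)."""
    if depth == 0 or len(rows) < 2 * min_leaf:
        return statistics.fmean(ys)
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):  # random feature subset
        for t in sorted({r[f] for r in rows})[1:]:    # candidate thresholds
            left = [y for r, y in zip(rows, ys) if r[f] < t]
            right = [y for r, y in zip(rows, ys) if r[f] >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # no admissible split was found
        return statistics.fmean(ys)
    _, f, t = best
    lo = [(r, y) for r, y in zip(rows, ys) if r[f] < t]
    hi = [(r, y) for r, y in zip(rows, ys) if r[f] >= t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo],
                       depth - 1, min_leaf, n_try, rng),
            build_tree([r for r, _ in hi], [y for _, y in hi],
                       depth - 1, min_leaf, n_try, rng))


def predict_tree(node, row):
    """Walk from the root down to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] < t else right
    return node


def fit_forest(rows, ys, n_trees=10, depth=4, min_leaf=3, n_try=2, seed=0):
    """Train each tree on a bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx], [ys[i] for i in idx],
                                 depth, min_leaf, n_try, rng))
    return forest


def predict_forest(forest, row):
    """The forest prediction is the average of the tree predictions."""
    return statistics.fmean([predict_tree(tree, row) for tree in forest])


def mse(forest, rows, ys):
    """Mean squared error of the forest on (rows, ys)."""
    return statistics.fmean([(predict_forest(forest, r) - y) ** 2
                             for r, y in zip(rows, ys)])


def make_synthetic(n, seed):
    """Synthetic stand-in data: 3 made-up features, NOT the paper's 22 factors."""
    rng = random.Random(seed)
    rows = [[rng.random() for _ in range(3)] for _ in range(n)]
    ys = [50 * r[0] + 5 * r[1] + rng.gauss(0, 1) for r in rows]
    return rows, ys


def grid_search(train, valid, grid):
    """Return (validation MSE, params) for the best parameter combination."""
    results = []
    for params in grid:
        forest = fit_forest(*train, **params)
        results.append((mse(forest, *valid), params))
    return min(results, key=lambda item: item[0])


if __name__ == "__main__":
    train, valid = make_synthetic(120, seed=1), make_synthetic(60, seed=2)
    grid = [{"n_trees": n, "depth": d} for n in (5, 10) for d in (2, 4)]
    best_mse, best_params = grid_search(train, valid, grid)
    print(best_params, round(best_mse, 2))
```

The exhaustive grid search here only stands in for the idea of selecting an optimal parameter combination; how the RFRP parameters were actually tuned is not described in this abstract.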