Traditional Bayesian network models suffer from low prediction accuracy when sample data are insufficient. To address this, a similarity measure was introduced, and an improved Bayesian network model for predicting daily mean PM2.5 concentration, corrected with the Jaccard similarity coefficient, is proposed. When the traditional model has no corresponding output, the improved model applies the similarity principle to select historical samples similar to the forecast day and estimates the forecast-day PM2.5 concentration from them. Using 2013 monitoring data from three air quality monitoring sites in Changsha City as a case study, prediction experiments were conducted with both the improved and the traditional models at each site for typical months of different seasons. The results showed that the improved model achieved higher prediction accuracy than the traditional model, to varying degrees, in May, November, and February. Within the same month, prediction performance did not differ significantly among sites; across months it differed markedly, with accuracy decreasing in the order August, May, November, February. The study confirms that introducing a sample similarity measure is a feasible and effective way to improve the accuracy of traditional Bayesian network models in air quality prediction.
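The fallback step described above, in which historical samples similar to the forecast day are selected and used to estimate PM2.5, might be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not specify how samples are encoded or how the estimate is formed, so the representation of each day as a set of discretized feature labels, the 0.5 similarity threshold, the use of a plain average, and the names `jaccard` and `estimate_from_similar` are all assumptions.

```python
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B| of two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(union)


def estimate_from_similar(forecast_features, history, threshold=0.5):
    """Average the PM2.5 values of historical days similar to the forecast day.

    history: list of (feature_set, pm25) pairs.
    threshold: assumed cutoff, not taken from the paper.
    Returns None when no historical day reaches the threshold.
    """
    similar = [
        pm25
        for features, pm25 in history
        if jaccard(forecast_features, features) >= threshold
    ]
    return mean(similar) if similar else None


# Hypothetical discretized samples; the values are illustrative, not real data.
history = [
    ({"wind:low", "humidity:high", "temp:mild"}, 80.0),
    ({"wind:low", "humidity:high", "temp:cold"}, 100.0),
    ({"wind:high", "humidity:low", "temp:hot"}, 30.0),
]
forecast_day = {"wind:low", "humidity:high", "temp:mild"}
estimate = estimate_from_similar(forecast_day, history)
```

In this sketch the third historical day shares no labels with the forecast day and is excluded, so the estimate is drawn only from the first two. A similarity-weighted average would be an equally plausible design; the abstract does not say which the authors used.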