The conventional k-nearest neighbor (k-NN) method predicts with a single fixed k value and therefore cannot account for the differing characteristics of individual instances, so its overall prediction accuracy is hard to guarantee. To address this problem, an ensemble k-NN prediction model based on bagging is proposed, and on this basis the Bgk-NN prediction method with attribute selection is implemented. The method trains a set of individualized base k-NN predictors, each of which independently predicts the value of an unknown instance, and the median of these predicted values is taken as the final result of the ensemble. Bgk-NN is applicable to datasets of all types, including those with discrete-valued and continuous-valued attributes. Experiments on standard datasets show that Bgk-NN achieves clearly higher prediction accuracy than the traditional k-NN method.
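The ensemble idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how each base model picks its k value or its attribute subset, so the sketch assumes both are drawn at random per bootstrap sample. The function names, the parameters `k_choices` and `attr_fraction`, the mixed-attribute distance, and the toy data are all hypothetical.

```python
import random
import statistics


def distance(a, b, attrs):
    """Mixed distance over the chosen attributes (an assumed metric):
    squared difference for numeric values, 0/1 mismatch for discrete ones."""
    total = 0.0
    for i in attrs:
        if isinstance(a[i], (int, float)) and isinstance(b[i], (int, float)):
            total += (a[i] - b[i]) ** 2
        else:
            total += 0.0 if a[i] == b[i] else 1.0
    return total ** 0.5


def knn_predict(X, y, query, k, attrs):
    """Base predictor: mean target of the k training rows nearest to query."""
    order = sorted(range(len(X)), key=lambda i: distance(X[i], query, attrs))
    nearest = order[:k]
    return sum(y[i] for i in nearest) / len(nearest)


def train_bagged_knn(X, y, n_models=25, k_choices=(1, 3, 5),
                     attr_fraction=0.7, seed=0):
    """Build the model set: each model gets its own bootstrap sample,
    attribute subset, and k (random choices are an assumption here)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    n_attrs = max(1, round(attr_fraction * d))
    models = []
    for _ in range(n_models):
        rows = [rng.randrange(n) for _ in range(n)]    # sample with replacement
        attrs = sorted(rng.sample(range(d), n_attrs))  # attribute selection
        models.append({
            "X": [X[i] for i in rows],
            "y": [y[i] for i in rows],
            "k": rng.choice(k_choices),
            "attrs": attrs,
        })
    return models


def predict(models, query):
    """Each base model predicts independently; the median is the ensemble output."""
    votes = [knn_predict(m["X"], m["y"], query, m["k"], m["attrs"])
             for m in models]
    return statistics.median(votes)


# Toy data with one continuous and one discrete attribute (invented values).
X = [(0.1, "a"), (0.2, "a"), (0.3, "a"), (0.8, "b"), (0.9, "b"), (1.0, "b")]
y = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]
models = train_bagged_knn(X, y)
print(predict(models, (0.25, "a")))
```

Taking the median rather than the mean keeps a few poorly matched base models (for example, those whose attribute subset happens to be uninformative for a given instance) from pulling the combined prediction far off.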