Because samples occupy regions of the feature space with different characteristics, and because different classification algorithms make complementary predictions, this paper combines the theory of telecom customer churn prediction with multi-classifier dynamic ensemble theory and cost-sensitive learning theory. It establishes a profit function for multi-classifier ensemble prediction of telecom customer churn and proposes a new churn prediction model based on dynamic multi-classifier selection and cost-sensitive optimized ensembling. First, the training samples are clustered into multiple partitions using the K-means clustering algorithm. Next, customer churn prediction sub-classifiers are built on the samples of each partition using the NaiveBayes, multilayer perceptron, and J48 algorithms. Finally, an Improved Artificial Fish-school Algorithm (IAFSA) performs a cost-sensitive optimized integration of the sub-classifiers in each partition. Experimental results show that the classification performance of the proposed model exceeds not only that of the three single models built on the whole training set, but also that of the ensemble obtained by using IAFSA to optimize the integration of those three single models.
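The last two steps of the pipeline (routing a sample to its partition, then combining that partition's sub-classifiers under a profit objective) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the profit values in `PROFIT`, the function names, the 0.5 decision threshold, and above all the optimizer, which is a plain random search standing in for IAFSA. The sub-classifier outputs are assumed to be churn probabilities already produced by the three models.

```python
import random

# Hypothetical profit per (true label, predicted label); 1 = churner.
# These values are illustrative only and do not come from the paper.
PROFIT = {(1, 1): 10.0, (1, 0): -20.0, (0, 1): -2.0, (0, 0): 0.0}

def nearest_partition(x, centroids):
    """Dynamic selection: index of the cluster centroid closest to sample x."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centroids[k])),
    )

def ensemble_predict(probs, weights, threshold=0.5):
    """Weighted average of the sub-classifiers' churn probabilities."""
    score = sum(w * p for w, p in zip(weights, probs)) / sum(weights)
    return 1 if score >= threshold else 0

def profit(weights, prob_rows, labels):
    """Total profit of the weighted ensemble on one partition's samples."""
    return sum(
        PROFIT[(y, ensemble_predict(p, weights))]
        for p, y in zip(prob_rows, labels)
    )

def optimise_weights(prob_rows, labels, n_models=3, iters=500, seed=0):
    """Random search over ensemble weights (a simple stand-in for IAFSA)."""
    rng = random.Random(seed)
    best_w = [1.0 / n_models] * n_models
    best = profit(best_w, prob_rows, labels)
    for _ in range(iters):
        w = [rng.random() + 1e-9 for _ in range(n_models)]
        value = profit(w, prob_rows, labels)
        if value > best:
            best_w, best = w, value
    return best_w, best

# Toy partition: columns are the three sub-classifiers' churn probabilities.
prob_rows = [[0.9, 0.1, 0.2], [0.1, 0.8, 0.9], [0.8, 0.3, 0.1], [0.2, 0.9, 0.7]]
labels = [1, 0, 1, 0]
weights, best_profit = optimise_weights(prob_rows, labels)
```

In a full pipeline, one weight vector would be optimized per partition, and a new sample would be scored with the weights of the partition returned by `nearest_partition`.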