The area under the receiver operating characteristic (ROC) curve (AUC) is commonly used to measure a classifier's overall performance across the entire range of class prior distributions. The original boosting algorithm optimizes classification accuracy, but is not optimal under the AUC measure. An improved boosting algorithm that optimizes the AUC is proposed: by introducing a data rebalancing operation into the original boosting iterations, the optimization objective of the weak learning algorithm is shifted from accuracy to the AUC. Experimental results show that, compared with the original boosting algorithm, the new algorithm achieves better performance under the AUC measure.
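The idea can be illustrated with a minimal sketch. The abstract does not specify the form of the rebalancing operation, so the code below assumes one plausible reading: before each round, the sample weights are rescaled so that the positive and negative classes each carry half of the total weight. Under such weights, the weighted error of a hard classifier equals one minus its balanced accuracy, (TPR + TNR) / 2, which is also the AUC of that classifier's single-point ROC curve. The decision-stump weak learner, the AdaBoost-style update, and all function names (`auc`, `rebalance`, `best_stump`, `boost`, `score`) are illustrative assumptions, not the authors' implementation.

```python
import math

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rebalance(weights, labels):
    """Assumed rebalance step: rescale so each class has total weight 0.5.
    Requires at least one example of each class (labels are +1 / -1)."""
    w_pos = sum(w for w, y in zip(weights, labels) if y == 1)
    w_neg = sum(w for w, y in zip(weights, labels) if y == -1)
    return [0.5 * w / (w_pos if y == 1 else w_neg) for w, y in zip(weights, labels)]

def best_stump(xs, labels, weights):
    """Weak learner: 1-D threshold stump with the lowest weighted error."""
    values = sorted(set(xs))
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for sign in (1, -1):
            # The stump predicts `sign` when x > t and `-sign` otherwise.
            err = sum(w for x, y, w in zip(xs, labels, weights)
                      if (sign if x > t else -sign) != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best

def boost(xs, labels, rounds=10, balanced=True):
    """AdaBoost-style loop; with balanced=True the weights are rebalanced each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        if balanced:
            weights = rebalance(weights, labels)
        err, t, sign = best_stump(xs, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # clamp to keep alpha finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, sign))
        # Standard exponential reweighting: misclassified examples gain weight.
        weights = [w * math.exp(-alpha * y * (sign if x > t else -sign))
                   for x, y, w in zip(xs, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def score(ensemble, x):
    """Real-valued ensemble output, used to rank examples for the AUC."""
    return sum(alpha * (sign if x > t else -sign) for alpha, t, sign in ensemble)
```

To compare the two variants on a dataset, one could train with `boost(xs, labels, balanced=True)` and `boost(xs, labels, balanced=False)`, then evaluate `auc([score(model, x) for x in xs], labels)` for each. Whether the rebalanced variant ranks better depends on the data and the weak learner; the sketch makes no claim about reproducing the reported results.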