The thresholds commonly used with Fisher linear discriminants (FLDs) classify imbalanced datasets poorly. Taking imbalanced datasets as the application setting, this paper studies how different thresholds affect the classification performance of FLDs. We argue that FLD performance is affected mainly by the imbalance between the distribution regions of the classes rather than by the imbalance in sample sizes. We therefore propose several empirical thresholds and select the optimised threshold from among them according to classification accuracy. Extensive experiments show that the proposed threshold selection method effectively improves the classification performance of FLDs on imbalanced datasets.
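The idea of choosing among several candidate thresholds by classification accuracy can be illustrated with a minimal sketch. The abstract does not list the empirical thresholds themselves, so the four candidates below (`midpoint`, `size_weighted`, `spread_weighted`, `range_midpoint`) are hypothetical stand-ins and not the paper's definitions. The function names, the small ridge term, the synthetic data, and the use of training accuracy as the selection criterion are likewise assumptions made only for illustration.

```python
import numpy as np

def fld_direction(X0, X1):
    """Fisher direction w = Sw^{-1} (m1 - m0) for two classes."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Small ridge term (an assumption) keeps Sw safely invertible.
    Sw = Sw + 1e-8 * np.eye(Sw.shape[0])
    return np.linalg.solve(Sw, m1 - m0)

def candidate_thresholds(p0, p1):
    """Hypothetical candidate thresholds on the projected axis.

    p0, p1 are the 1-D projections of class 0 and class 1.
    These are illustrative choices, not the thresholds from the paper.
    """
    mu0, mu1 = p0.mean(), p1.mean()
    s0, s1 = p0.std(), p1.std()
    n0, n1 = len(p0), len(p1)
    return {
        # Midpoint of the projected class means.
        "midpoint": (mu0 + mu1) / 2,
        # Mean of all projected samples (pulled toward the larger class).
        "size_weighted": (n0 * mu0 + n1 * mu1) / (n0 + n1),
        # Splits the gap between the means in proportion to class spreads.
        "spread_weighted": (s1 * mu0 + s0 * mu1) / (s0 + s1),
        # Midpoint between the top of class 0 and the bottom of class 1.
        "range_midpoint": (p0.max() + p1.min()) / 2,
    }

def select_threshold(X, y):
    """Return the direction, the best threshold, its name, and all scores."""
    w = fld_direction(X[y == 0], X[y == 1])
    proj = X @ w
    cands = candidate_thresholds(proj[y == 0], proj[y == 1])
    # With w = Sw^{-1}(m1 - m0), class 1 projects above class 0,
    # so a sample is assigned to class 1 when its projection exceeds t.
    scores = {name: float(np.mean((proj > t).astype(int) == y))
              for name, t in cands.items()}
    best = max(scores, key=scores.get)
    return w, cands[best], best, scores

# Synthetic imbalanced data: a small, widely spread class 0
# and a large, compact class 1 (an assumed toy setup).
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 3.0, size=(50, 2))
X1 = rng.normal(4.0, 0.5, size=(500, 2))
X = np.vstack([X0, X1])
y = np.r_[np.zeros(50, dtype=int), np.ones(500, dtype=int)]

w, t, name, scores = select_threshold(X, y)
print(name, t, scores)
```

In this toy setup the two classes differ both in sample size and in the region they occupy, so comparing the `size_weighted` and `spread_weighted` scores gives a rough feel for the distinction the abstract draws. Selecting on training accuracy is the simplest reading of "based on classification accuracy"; a held-out split could be substituted without changing the structure of the sketch.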