Base classifiers in a ternary Error-Correcting Output Codes (ECOC) matrix carry no prior information about the classes that are ignored in their binary splits. To address this problem, this paper proposes a matrix recoding method based on the Receiver Operating Characteristic (ROC) curve. First, a pair of thresholds that defines a reject region is found from the ROC curve, which yields the optimal classifiers. The optimal classifiers are then applied to the training samples of the ignored classes, so that the classical two-valued output becomes a three-valued output, and the "0" entries of the initial coding matrix are recoded accordingly. In the decoding phase, the classical Hamming distance decoding is used to classify unknown samples. The method avoids retraining the base classifiers and can be applied to any ternary ECOC matrix, which makes it general and practical. Experiments on synthetic and UCI public datasets show that the method is simple and efficient: without increasing training time, it improves both decoding speed and accuracy, and thus the overall classification performance.
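
The recoding and decoding steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not confirm: the threshold pair of each classifier is taken as given (the ROC-based search is not shown), a "0" entry is recoded by a majority vote over the ternary outputs on the ignored class's training samples, and the Hamming distance is computed in the commonly used form sum((1 - a*b) / 2), so a remaining "0" contributes 0.5. All function names, scores, and thresholds are hypothetical.

```python
from collections import Counter


def ternary_output(score, t_low, t_high):
    """Map a real-valued classifier score to +1, -1, or 0 (reject region)."""
    if score >= t_high:
        return 1
    if score <= t_low:
        return -1
    return 0


def recode_zeros(matrix, scores_by_entry, thresholds):
    """Return a copy of `matrix` whose 0 entries are recoded.

    matrix: one codeword (list of +1/-1/0) per class.
    scores_by_entry: maps (class_index, classifier_index) to the scores
        that classifier assigns to training samples of the ignored class.
    thresholds: one (t_low, t_high) pair per classifier (assumed given).
    """
    recoded = [row[:] for row in matrix]
    for (i, j), scores in scores_by_entry.items():
        if matrix[i][j] != 0 or not scores:
            continue  # only originally ignored entries with data are recoded
        t_low, t_high = thresholds[j]
        votes = Counter(ternary_output(s, t_low, t_high) for s in scores)
        # Assumption: the majority ternary output becomes the new code symbol.
        recoded[i][j] = votes.most_common(1)[0][0]
    return recoded


def hamming_decode(matrix, outputs):
    """Return the index of the codeword closest to `outputs`."""
    distances = [
        sum((1 - a * b) / 2 for a, b in zip(row, outputs)) for row in matrix
    ]
    return distances.index(min(distances))


# Hypothetical one-vs-one matrix for three classes and three classifiers.
matrix = [[1, 1, 0], [-1, 0, 1], [0, -1, -1]]
thresholds = [(-0.5, 0.5)] * 3
scores_by_entry = {
    (0, 2): [0.1, -0.2, 0.3],   # all scores fall in the reject region
    (1, 1): [0.9, 0.7, -0.1],   # mostly above the upper threshold
    (2, 0): [-0.8, -0.9, 0.6],  # mostly below the lower threshold
}

recoded = recode_zeros(matrix, scores_by_entry, thresholds)
predicted = hamming_decode(recoded, [-1, 1, 1])
```

In this toy run, two of the three zeros are replaced by non-zero symbols and the third stays "0" because the classifier rejects all samples of that class. No base classifier is retrained: the recoding only reuses scores from classifiers that already exist.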