Conventional classification algorithms minimize the 0-1 loss and implicitly assume that every classification result is trusted and accepted. This assumption does not hold in fields such as medical/fault diagnosis and fraud/intrusion detection. To address this, the binary classification problem with asymmetric (class-dependent) reject cost (BCP-CRC) is formulated and then simplified. On this basis, a cost-sensitive classification algorithm based on support vector machines (CSVM-CRC) is designed. The algorithm consists of four steps: training an SVM classifier, computing the posterior probability of each sample, estimating the classification reliability of each sample, and determining the optimal reject threshold. Experiments on 10 benchmark datasets show that the CSVM-CRC algorithm effectively reduces the average cost.
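The last two steps (estimating reliability and choosing the reject threshold) can be illustrated with a minimal Python sketch. It is not the paper's formulation, which the abstract does not give. It assumes that posterior probabilities P(y = +1 | x) are already available from a trained SVM, that reliability is the posterior of the predicted class, and that the threshold is chosen by exhaustive search on a labeled validation set. The function names and the cost parameters `c_fn`, `c_fp`, `r_pos`, and `r_neg` are hypothetical.

```python
from typing import Sequence, Tuple


def reliability(p_pos: float) -> float:
    """Assumed reliability measure: posterior probability of the predicted class."""
    return max(p_pos, 1.0 - p_pos)


def average_cost(p_pos: Sequence[float], y: Sequence[int], threshold: float,
                 c_fn: float, c_fp: float, r_pos: float, r_neg: float) -> float:
    """Average cost when samples with reliability below `threshold` are rejected.

    Labels are +1 / -1. Rejecting a sample costs r_pos or r_neg depending on
    its true class (the class-dependent reject cost); an accepted error costs
    c_fn (missed positive) or c_fp (false alarm); a correct decision costs 0.
    """
    total = 0.0
    for p, label in zip(p_pos, y):
        if reliability(p) < threshold:
            total += r_pos if label == 1 else r_neg
        else:
            predicted = 1 if p >= 0.5 else -1
            if predicted != label:
                total += c_fn if label == 1 else c_fp
    return total / len(y)


def best_reject_threshold(p_pos: Sequence[float], y: Sequence[int],
                          c_fn: float, c_fp: float,
                          r_pos: float, r_neg: float) -> Tuple[float, float]:
    """Search the observed reliabilities for the threshold with the lowest cost."""
    # 0.5 rejects nothing, because reliability is never below 0.5.
    candidates = sorted({0.5} | {reliability(p) for p in p_pos})
    best_t, best_cost = 0.5, float("inf")
    for t in candidates:
        cost = average_cost(p_pos, y, t, c_fn, c_fp, r_pos, r_neg)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost


# Illustrative validation data (made up for this sketch, not from the paper).
posteriors = [0.98, 0.85, 0.6, 0.3, 0.2, 0.02]
labels = [1, 1, -1, 1, -1, -1]
threshold, cost = best_reject_threshold(posteriors, labels,
                                        c_fn=10.0, c_fp=2.0,
                                        r_pos=1.0, r_neg=0.5)
print(f"threshold={threshold:.2f}, average cost={cost:.2f}")
```

In this toy example the two least reliable samples are exactly the misclassified ones, so rejecting them costs less than accepting their errors. The candidate grid never rejects the most reliable sample; a reject-all option would need one extra candidate above the maximum reliability.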