In binary classification, some training samples are difficult to obtain because of their private nature, so the training set is often small and classification algorithms cannot learn a good data pattern. To address this problem, this paper combines the Information Bottleneck (IB) method with the specific properties of the problem and proposes BCOC-IB, a new binary classification algorithm based on one-class learning. In the learning phase, the algorithm uses a one-class IB algorithm to learn the data pattern; in the classification phase, it uses a binary classification strategy to classify the test data. Experimental results show that, when few training samples are available, BCOC-IB achieves higher classification accuracy and lower time complexity than the comparison algorithms.