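Existing hyperplane-based one-class classifiers do not consider the global and the local information of the target data at the same time. To address this, the within-class scatter is added to the One-Class SVM (OCSVM) algorithm to reflect the global information of the target data, yielding the Structured OCSVM (SOCSVM). The resulting classifier combines global and local learning, and it also provides a unified framework for embedding prior information about the intrinsic structure of the data into many SVM algorithms. To further improve computational efficiency, a linear programming algorithm is proposed on the basis of the quadratic programming solution of SOCSVM by minimizing the functional distance from the mean of the target data to the hyperplane; this also removes SOCSVM's need to use the origin as the representative of the negative class. Experimental results on artificial and real data sets verify the effectiveness of SOCSVM, which embeds the structural information of the target data, and of its linear programming algorithm.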
To distinguish the target class from outliers accurately, a One-Class Classifier (OCC) should take into account prior knowledge of the target class. However, One-Class SVM (OCSVM), the state-of-the-art OCC, neglects the distribution information of the data when finding the optimal hyperplane. Structured OCSVM (SOCSVM), the OCC proposed here, alleviates this problem by embedding the within-class scatter matrix of the target data into OCSVM. As a result, SOCSVM not only overcomes this disadvantage of OCSVM, but also provides a unified framework for incorporating the intrinsic structure of the data into existing SVM algorithms. Moreover, to improve the efficiency of SOCSVM, a linear programming algorithm called SlpOCSVM is proposed to replace the quadratic programming solver for SOCSVM. By minimizing the functional distance from the mean of the data to the hyperplane, the optimal hyperplane is automatically drawn to the position of the minimum positive half-space, without using the origin as a representative of the outliers. Experimental results on a toy problem and real data sets demonstrate the advantages of SOCSVM and its linear programming algorithm.
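
As a rough illustration of what embedding the within-class scatter matrix into OCSVM might look like, the sketch below first recalls the standard OCSVM primal and then shows one plausible structured variant. The abstract does not state the exact objective, so the second problem is an assumption: the regularizer $w^{\top}(I + \lambda S_w)w$, the trade-off parameter $\lambda \ge 0$, and the $1/n$ normalization of $S_w$ are hypothetical choices made for illustration, written for the linear (input-space) case.

\[
\begin{aligned}
\text{(OCSVM)}\quad & \min_{w,\,\xi,\,\rho}\ \tfrac{1}{2}\, w^{\top} w \;+\; \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i \;-\; \rho
&& \text{s.t.}\ \ w^{\top} x_i \ge \rho - \xi_i,\ \ \xi_i \ge 0, \\[4pt]
\text{(assumed structured form)}\quad & \min_{w,\,\xi,\,\rho}\ \tfrac{1}{2}\, w^{\top}\!\left(I + \lambda S_w\right) w \;+\; \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i \;-\; \rho
&& \text{s.t.}\ \ w^{\top} x_i \ge \rho - \xi_i,\ \ \xi_i \ge 0, \\[4pt]
& S_w = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)(x_i - \mu)^{\top},
\qquad \mu = \frac{1}{n}\sum_{i=1}^{n} x_i .
\end{aligned}
\]

Under this assumed form, setting $\lambda = 0$ recovers the standard OCSVM, while $\lambda > 0$ penalizes directions $w$ along which the target data are widely scattered. In the same notation, the functional distance from the data mean to the hyperplane that the abstract mentions would be $w^{\top}\mu - \rho$.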