Some current multi-label classification algorithms are, in essence, a cascade of two classification techniques: the first stage builds a label ranking system, and the second stage detects the relevant labels, which can also further improve classification performance. To compare different label detection techniques, we collect and implement four general approaches: linear regression thresholding, multi-output linear regression, logistic regression, and the discrete Bayes rule. Using the k-nearest neighbor algorithm as the baseline method, we conduct an experimental comparison on ten benchmark data sets. The experimental results show that, in terms of both computational time and classification performance, multi-output linear regression is the technique to recommend.
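As an illustration of the two-stage cascade, the following is a minimal sketch and not the implementation evaluated here. It assumes that the ranking stage scores each label by the fraction of the k nearest training neighbors carrying it, and that the detection stage is a least-squares map from those scores to the label indicators, cut at 0.5. The function names, the toy data, the value of k, and the 0.5 cut-off are all hypothetical choices made for the example.

```python
import numpy as np

def knn_label_scores(X_train, Y_train, X_query, k=2, exclude_self=False):
    """Stage 1 (assumed form): score each label by its share among the k nearest neighbors."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    if exclude_self:
        # When scoring the training set itself, do not let a point vote for itself.
        np.fill_diagonal(d, np.inf)
    idx = np.argsort(d, axis=1, kind="stable")[:, :k]
    return Y_train[idx].mean(axis=1)

def add_bias(S):
    """Append a constant column so the linear map has an intercept."""
    return np.hstack([S, np.ones((S.shape[0], 1))])

def fit_detector(S, Y):
    """Stage 2 (assumed form): multi-output least squares from scores to label indicators."""
    W, *_ = np.linalg.lstsq(add_bias(S), Y, rcond=None)
    return W

def detect_labels(S, W, threshold=0.5):
    """Mark a label as relevant when its regression output reaches the threshold."""
    return (add_bias(S) @ W >= threshold).astype(int)

# Hypothetical toy data: two well-separated clusters, one label each.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
Y_train = np.array([[1, 0], [1, 0], [1, 0],
                    [0, 1], [0, 1], [0, 1]])

S_train = knn_label_scores(X_train, Y_train, X_train, k=2, exclude_self=True)
W = fit_detector(S_train, Y_train)

X_query = np.array([[0.5, 0.5], [5.5, 5.5]])
S_query = knn_label_scores(X_train, Y_train, X_query, k=2)
pred = detect_labels(S_query, W)
print(pred)
```

A single least-squares solve fits all labels at once, which is one plausible reason a multi-output linear detector is cheap relative to fitting a separate model per label.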