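The large volume of microblog advertisements hampers the use of microblog data analysis models. To address the problem of recognizing microblog advertisement text, a graph-based semi-supervised label propagation algorithm is used to guide the computer in automatically identifying advertisements from large amounts of unstructured microblog text. Evaluation on experimental data shows that, when few labeled samples are available, the graph-based semi-supervised label propagation algorithm achieves better performance than the supervised support vector machine and naive Bayes algorithms.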
The many advertisements on microblogs interfere with the use of microblog data analysis models. To address microblog advertisement text recognition, this paper investigates a graph-based semi-supervised learning algorithm, namely label propagation, for recognizing microblog advertisements in a large number of microblog texts. Experimental results on large-scale data show that this method outperforms supervised learning algorithms, such as support vector machines and naive Bayes, when only very few labeled examples are available.
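To make the idea of graph-based label propagation concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify the text features, the graph construction, or the propagation variant, so everything here is an assumption: posts are represented as whitespace-tokenized bags of words, edge weights are cosine similarities, labeled nodes are clamped to their known class, and each unlabeled node repeatedly takes the similarity-weighted average of its neighbours' label distributions. The function names and the four toy posts are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Assumed feature representation: lowercase whitespace tokens with counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def label_propagation(texts, labels, n_classes=2, iterations=50):
    """labels[i] is a class index for labeled posts, or None for unlabeled ones."""
    vecs = [bag_of_words(t) for t in texts]
    n = len(texts)
    # Fully connected similarity graph without self-loops (an assumption).
    W = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # One-hot distributions for labeled nodes, uniform for unlabeled nodes.
    F = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         if labels[i] is not None else [1.0 / n_classes] * n_classes
         for i in range(n)]
    for _ in range(iterations):
        new_F = []
        for i in range(n):
            total = sum(W[i])
            if labels[i] is not None or total == 0.0:
                # Clamp labeled nodes; leave isolated nodes unchanged.
                new_F.append(F[i][:])
                continue
            # Similarity-weighted average of the neighbours' distributions.
            new_F.append([sum(W[i][j] * F[j][c] for j in range(n)) / total
                          for c in range(n_classes)])
        F = new_F
    # Predicted class is the most probable one for each node.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Hypothetical toy data: class 1 = advertisement, class 0 = ordinary post.
texts = [
    "buy cheap phones discount sale now",             # labeled as advertisement
    "had lunch with friends at the park today",       # labeled as ordinary
    "huge discount sale buy shoes now",               # unlabeled
    "walking in the park with friends this morning",  # unlabeled
]
labels = [1, 0, None, None]
predicted = label_propagation(texts, labels)
print(predicted)
```

With only one labeled example per class, the two unlabeled posts receive their labels purely through their similarity to the labeled ones, which is the low-label regime the abstract describes. A real system would likely need word segmentation for Chinese text and a sparser graph (for example, k nearest neighbours) to scale to large collections.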