A Study and Analysis of Typical Semi-Supervised Classification Algorithms
  • ISSN: 1673-629X
  • Journal: 《计算机技术与发展》 (Computer Technology and Development)
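  • Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliation: School of Computer Science and School of Software, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210000, China
  • Funding: National Natural Science Foundation of China (61300165); Specialized Research Fund for the Doctoral Program of Higher Education, New Teacher category (20133223120009); Nanjing University of Posts and Telecommunications Introduced Talent Fund (NY213033)
Authors: Meng Yan (孟岩), Wang Yunyun (汪云云)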
Chinese abstract (translated):

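Many semi-supervised classification algorithms have been proposed in recent years. In real learning tasks, however, researchers find it hard to decide which one to choose, and no guidance exists on this question. Semi-supervised classification algorithms can be categorized by the data distribution assumption they adopt. Accordingly, building on a comparative analysis of typical algorithms that adopt different assumptions, and taking the least squares method (Least Squares, LS) as the baseline, this study compares the classification performance of four methods: the Transductive Support Vector Machine (TSVM), based on the cluster assumption; Laplacian Regularized Least Squares Classification (LapRLSC), based on the manifold assumption; SemiBoost, which uses both assumptions; and Implicitly Constrained Least Squares (ICLS), which makes no assumption. The conclusions are as follows. When the distribution of the data samples is known, a method that uses the corresponding assumption can ensure high classification accuracy. When there is no prior knowledge of the data distribution and the number of samples is limited, TSVM can reach high classification accuracy. When sample labels are hard to obtain and classification safety is emphasized, ICLS is the preferred choice, and LapRLSC is also a good option.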

English abstract:

A large number of semi-supervised classification algorithms have been proposed recently. However, it is hard to decide which one to use in real learning tasks, and the literature offers no related guidance. Therefore, empirical comparisons of several typical algorithms have been performed to provide some useful suggestions. Semi-supervised classification algorithms can be categorized by the data distribution assumption they adopt, so typical algorithms that adopt different assumptions have been contrasted. Specifically, they are the Transductive Support Vector Machine (TSVM), which uses the cluster assumption; Laplacian Regularized Least Squares Classification (LapRLSC), which uses the manifold assumption; SemiBoost, which uses both assumptions; and Implicitly Constrained Least Squares (ICLS), which uses no assumption, with supervised Least Squares classification (LS) as the baseline. It is concluded that when the data distribution is given, the semi-supervised classification algorithm that adopts the corresponding assumption can lead to the best performance. Without any prior knowledge about the data distribution, TSVM can be a good choice when the given labeled samples are extremely limited. When the labeled samples are not so scarce and learning safety is emphasized, ICLS is recommended, and LapRLSC is another good choice.
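
To make the contrast between the LS baseline and a manifold-assumption method concrete, the following is a minimal sketch of a linear Laplacian-regularized least squares classifier. It is not the authors' implementation: the function names, the dense RBF graph, the absence of a bias term, and the parameters `lam_ridge`, `lam_manifold`, and `sigma` are all assumptions chosen for illustration. It minimizes ||X_lab w - y||^2 + lam_ridge ||w||^2 + lam_manifold w^T X^T L X w, where L is a graph Laplacian built from labeled and unlabeled points together.

```python
import numpy as np

def rbf_graph_laplacian(X, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W from a dense RBF affinity matrix."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return np.diag(W.sum(axis=1)) - W

def fit_linear_laprls(X_lab, y_lab, X_unlab,
                      lam_ridge=1e-2, lam_manifold=1.0, sigma=1.0):
    """Closed-form linear solution (hypothetical helper, labels in {-1, +1}).

    The manifold term penalizes predictions that differ between nearby
    points in the graph over all (labeled and unlabeled) samples.
    """
    X_all = np.vstack([X_lab, X_unlab])
    L = rbf_graph_laplacian(X_all, sigma)
    d = X_lab.shape[1]
    A = (X_lab.T @ X_lab
         + lam_ridge * np.eye(d)
         + lam_manifold * (X_all.T @ L @ X_all))
    return np.linalg.solve(A, X_lab.T @ y_lab)

def predict(w, X):
    """Sign of the linear score, mapped to {-1, +1}."""
    return np.where(X @ w >= 0, 1, -1)
```

Setting `lam_manifold=0.0` drops the unlabeled data from the objective and reduces the fit to a ridge-regularized least squares classifier, which plays the role of the supervised baseline in this sketch.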

Journal information
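  • 《计算机技术与发展》 (Computer Technology and Development)
  • Chinese Science and Technology Core Journal
  • Supervising authority: Shaanxi Provincial Department of Industry and Information Technology
  • Sponsor: Shaanxi Computer Society
  • Editor-in-chief: Wang Shouzhi (王守智)
  • Address: No. 99, South Section of Yanta Road, Xi'an
  • Postal code: 710054
  • Email: ctad@vip.163.com
  • Phone: 029-85522163
  • International standard serial number: ISSN 1673-629X
  • Domestic serial number: CN 61-1450/TP
  • Postal distribution code: 52-127
  • Awards:
  • Outstanding journal for implementation of the "CAJ-CD Standard"
  • Indexed in domestic and international databases:
  • Chinese Science and Technology Core Journal
  • Citation count: 21263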