
An Improved Co-training Algorithm Based on the Conditional Value of Samples

ISSN: 0254-4156
Journal: Acta Automatica Sinica (《自动化学报》)
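Classification: TP [Automation and Computer Technology]
Author affiliation: [1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
Funding: Supported by the National Natural Science Foundation of China (61173087, 61073128) and the Natural Science Foundation of Heilongjiang Province (F201021)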

Keywords: machine learning, semi-supervised learning, co-training, informative example, conditional value

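Abstract (translated from the Chinese):

Co-training is a mainstream semi-supervised learning algorithm. In it, the classifiers under two views iteratively select new examples from the unlabeled set for each other, so that each updates the other's training set. Co-training uses the classifier's posterior probability output as the strategy for selecting new examples, which ignores how valuable an example is to the current classifier. To address this problem, this paper proposes an improved co-training style algorithm, CVCOT (conditional value-based co-training), which optimizes co-training with a selection strategy based on the conditional value of examples. By defining the conditional value of unlabeled examples, the classifier under each view selects new examples according to that value and uses them to update the training set. This strategy both ensures the label reliability of the newly added examples and gives priority to adding highly valuable, informative examples to the training set, which effectively refines the classifiers. Experimental results on UCI data sets and a web page classification application show that CVCOT achieves good classification performance and learning efficiency.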

English abstract:

Co-training is one of the major semi-supervised learning methods. It iteratively trains two classifiers under two different views and uses the predictions of either classifier on the unlabeled examples to augment the training set of the other. In each round of co-training, newly added examples are selected according to the classifier's posterior probability output, which neglects the examples' value with respect to the current classifier. This paper proposes an improved co-training style algorithm, termed CVCOT (conditional value-based co-training), which employs a conditional value-based strategy for selecting candidate training examples. Specifically, the conditional value of unlabeled examples in the co-training process is defined and computed, and it is then used by the classifier under each view to augment the training set of the other. The new strategy not only guarantees the reliability of the pseudo-labels but also tends to add more informative examples with higher values to the training sets, so the classifier under either view is refined. Experiments on UCI data sets and an application to the web page classification task indicate that CVCOT achieves better classification performance and learning efficiency.
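The co-training loop summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the definition of the conditional value, so the sketch uses the baseline posterior-confidence selection and only exposes a `score` parameter where a value-based criterion could be substituted. The names `CentroidClassifier`, `confidence_score`, and `co_train`, the toy nearest-centroid model, and the sample data are all assumptions made for illustration.

```python
import math


class CentroidClassifier:
    """Toy single-view classifier (an assumption): nearest class centroid."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_proba(self, x):
        # Softmax over negative squared distances to each class centroid.
        scores = {c: -sum((a - b) ** 2 for a, b in zip(x, m))
                  for c, m in self.centroids.items()}
        top = max(scores.values())
        exps = {c: math.exp(s - top) for c, s in scores.items()}
        total = sum(exps.values())
        return {c: e / total for c, e in exps.items()}


def confidence_score(clf, x):
    """Baseline selection score: (largest posterior, predicted label)."""
    proba = clf.predict_proba(x)
    label = max(proba, key=proba.get)
    return proba[label], label


def co_train(view1, view2, labels, unlabeled, rounds=5, per_round=2,
             score=confidence_score):
    """Each view's classifier pseudo-labels examples for the other view.

    `unlabeled` is a list of (view-1 features, view-2 features) pairs.
    `score` is a hypothetical hook; a value-based criterion would go here.
    """
    train = [(list(view1), list(labels)), (list(view2), list(labels))]
    pool = list(unlabeled)
    clfs = [CentroidClassifier(), CentroidClassifier()]
    for _ in range(rounds):
        for k in (0, 1):
            clfs[k].fit(*train[k])
        if not pool:
            break
        for k in (0, 1):  # classifier k augments the training set of 1 - k
            scored = sorted(((score(clfs[k], pair[k]), pair) for pair in pool),
                            key=lambda item: item[0][0], reverse=True)
            for (_, label), pair in scored[:per_round]:
                train[1 - k][0].append(pair[1 - k])
                train[1 - k][1].append(label)
                pool.remove(pair)
    for k in (0, 1):  # final refit so the last additions are used
        clfs[k].fit(*train[k])
    return clfs


# Made-up two-class data: one labeled example per class, six unlabeled pairs.
view1 = [[0.0, 0.2], [4.0, 3.8]]
view2 = [[0.1, -0.1], [3.9, 4.2]]
labels = [0, 1]
unlabeled = [([0.3, -0.2], [0.2, 0.1]), ([3.7, 4.1], [4.3, 3.9]),
             ([-0.4, 0.1], [0.0, 0.5]), ([4.2, 4.4], [3.6, 4.0]),
             ([0.5, 0.4], [-0.3, 0.2]), ([3.5, 3.6], [4.1, 4.4])]
clfs = co_train(view1, view2, labels, unlabeled)
```

Keeping the selection rule behind a single `score` function means the baseline and a value-based variant would differ only in how candidates are ranked, which is the part of co-training the abstract says CVCOT changes.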

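Projects with papers in the same journal

Weakly supervised online learning methods and their application to visual object tracking

Journal papers: 4

Research on natural scene text retrieval methods based on the Constellation model

Journal papers: 12; conference papers: 4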

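Journal papers from the same project

A semantic-level text-collaborative image recognition method

An improved ML-kNN method for multi-label document classification

Video object tracking via joint sparse representation of multimodal features

Research on scene text image clustering based on random projection

An improved co-training algorithm based on the conditional value of samples

A Novel Inductive Semi-supervised SVM with Graph-based Self-Training

Combining example selection with instance selection to speed up multiple-instance learning

An ultrasound image classification method combining local features with multiple-instance learning

CoSTra: Confidence-based Self-training

A locally weighted Citation-kNN algorithm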

An improved text image segmentation method based on spectral clustering

Journal information

Acta Automatica Sinica (《自动化学报》)
Chinese Science and Technology Core Journal

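Supervising body: Chinese Academy of Sciences
Sponsors: Chinese Association of Automation; Institute of Automation, Chinese Academy of Sciences
Editor-in-chief: Wang Feiyue (王飞跃)
Address: 16 Donghuangchenggen North Street, Beijing
Postal code: 100717
Email: aas@ia.ac.cn
Telephone: 010-64019820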

International standard serial number: ISSN 0254-4156
Domestic serial number: CN 11-2109/TP
Postal distribution code: 2-180

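Awards:
National Outstanding Journal Award, 1997; second prize of the Chinese Academy of Sciences Outstanding Journal Award, 1985, 1990, 1996, and 2000; National Periodical Award, 2002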

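Indexed in domestic and international databases:
Mathematical Reviews (online edition, USA); Zentralblatt MATH (Germany); Scopus (Netherlands); Engineering Index (USA); Japan Science and Technology Agency database (Japan); Chinese Science and Technology Core Journals (China); Peking University Core Journals (China; 2000, 2004, 2008, 2011, and 2014 editions)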

Citations: 27550