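Capped Robust Nonnegative Matrix Factorization Algorithm
  • ISSN: 0469-5097
  • Journal: 《南京大学学报:自然科学版》 (Journal of Nanjing University: Natural Science Edition)
  • Date: 0
  • Classification: TP181 [Automation and Computer Technology: Control Science and Engineering; Automation and Computer Technology: Control Theory and Control Engineering]
  • Author affiliation: School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
  • Funding: National Natural Science Foundation of China (61370129, 61375062); Fundamental Research Funds for the Central Universities (2014JBM029); CCF-Tencent RAGR (20150116); Ministry of Education-China Mobile Research Fund (MCM20513)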
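Abstract (translated from Chinese):

Nonnegative matrix factorization (NMF) has been widely applied in many fields, but it is easily affected by outliers. Among the improvements proposed for this problem, robust nonnegative matrix factorization (RNMF), which uses the L2,1 norm, achieves good results, but it does not adapt well to changes in the proportion of outliers in a dataset. To address this shortcoming, this paper proposes capped robust nonnegative matrix factorization (CRNMF), which introduces a denoising rate ε into the objective function to reduce the influence of outliers on the overall algorithm. The main steps are: at each iterative update of the factorization, compute the error between the input data and its reconstruction from the factors; truncate to zero the error of every data point whose error exceeds the preset parameter ε; repeat until convergence. The ε truncation reduces the influence of outliers on the basis matrix F and the coefficient matrix G. The paper gives an algorithmic description of CRNMF and reports experiments on synthetic and real datasets. The experiments show that, compared with traditional NMF and RNMF, the proposed algorithm improves clustering accuracy to some extent, reduces the impact of outliers on clustering accuracy, and improves the robustness of the algorithm.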
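
The abstract does not state the objective functions, so the following is only a sketch of how the three methods are usually compared. The NMF and RNMF lines are the standard Frobenius-norm and L2,1-norm objectives. The CRNMF line is an assumed reading of "truncate the error to zero when it exceeds ε" and may differ from the paper's actual formulation. The symbols X (data matrix with columns x_i) and g_i (the i-th column of G) are notation introduced here, not taken from the page.

```latex
% NMF: squared error, so a large residual is penalised quadratically
\min_{F \ge 0,\; G \ge 0} \; \|X - FG\|_F^2 \;=\; \sum_{i=1}^{n} \|x_i - F g_i\|_2^2

% RNMF: L2,1 norm, each point's residual enters without squaring
\min_{F \ge 0,\; G \ge 0} \; \|X - FG\|_{2,1} \;=\; \sum_{i=1}^{n} \|x_i - F g_i\|_2

% CRNMF (assumed form): a point whose residual exceeds \varepsilon contributes nothing
\min_{F \ge 0,\; G \ge 0} \; \sum_{i=1}^{n} \|x_i - F g_i\|_2 \cdot
  \mathbf{1}\!\left[\, \|x_i - F g_i\|_2 \le \varepsilon \,\right]
```

Under this reading, ε acts as a per-point residual threshold: lowering it discards more points as outliers, raising it toward infinity recovers RNMF.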

English abstract:

Nonnegative Matrix Factorization (NMF) has been widely applied in various areas, but it is easily influenced by outliers. To address this problem, researchers have proposed Robust Nonnegative Matrix Factorization (RNMF), which uses the L2,1 norm to approximate the normal points as closely as possible and to reduce the residual of the outliers by using its absolute value rather than its square. However, RNMF is still sensitive to the proportion of outliers: on some datasets RNMF handles outliers well, but on others its performance is not satisfactory. Every real dataset has its own structure, that is, it contains a different proportion of outliers, and this limits RNMF in practical applications. In this article, we present a Capped Robust Nonnegative Matrix Factorization algorithm (CRNMF), obtained by adding a denoising rate ε to the objective function of RNMF. To control outliers better, the algorithm is designed for situations in which the outlier ratio differs across real datasets. The main idea of CRNMF is to evaluate the residual of each data point from the input data and the factors during the iterative procedure. If the residual is larger than the given denoising rate ε, the residual is set to 0, i.e., the corresponding data point is treated as an outlier and excluded from the computation. By introducing the ε truncation, the algorithm reduces the influence of outliers on the matrices F and G. This paper describes CRNMF and reports experiments on real-world and synthetic datasets. The experimental results show that, compared with traditional NMF and RNMF, the proposed algorithm can improve clustering accuracy, reduce the impact of outliers, and thereby improve robustness to some extent.
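
The iterative procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the commonly cited multiplicative updates for L2,1-norm RNMF and sets the per-point weight to zero whenever the residual exceeds ε. The function name `capped_rnmf`, the `warmup` phase (a few uncapped iterations so that a random start does not flag every point), and the `tiny` stabiliser are all hypothetical choices introduced here.

```python
import numpy as np

def capped_rnmf(X, k, eps, n_iter=200, warmup=20, seed=0, tiny=1e-10):
    """Sketch of capped robust NMF: X (p x n, nonnegative) ~ F (p x k) @ G (k x n).

    Columns of X are data points. Returns F, G and a boolean mask that is
    True for points whose final residual is at most eps.
    """
    rng = np.random.default_rng(seed)
    p, n = X.shape
    F = rng.random((p, k)) + tiny
    G = rng.random((k, n)) + tiny

    for it in range(n_iter):
        # Per-point reconstruction error ||x_i - F g_i||_2.
        residual = np.linalg.norm(X - F @ G, axis=0)

        # Assumption: no capping during warm-up; afterwards points with
        # residual > eps are treated as outliers.
        if it < warmup:
            keep = np.ones(n, dtype=bool)
        else:
            keep = residual <= eps

        # RNMF weight 1/residual for kept points, 0 for truncated points,
        # so truncated points do not influence the basis matrix F.
        d = np.where(keep, 1.0 / np.maximum(residual, tiny), 0.0)

        XD = X * d   # scales column i of X by d[i]
        GD = G * d   # scales column i of G by d[i]
        F *= (XD @ G.T) / (F @ (GD @ G.T) + tiny)

        # In the RNMF update for G the weight d[i] cancels column by column,
        # so the unweighted form is used; truncated points still get a code
        # and can re-enter if their residual later drops below eps.
        G *= (F.T @ X) / (F.T @ F @ G + tiny)

    residual = np.linalg.norm(X - F @ G, axis=0)
    return F, G, residual <= eps
```

A call such as `F, G, keep = capped_rnmf(X, k=3, eps=1.0)` returns the two factors plus the inlier mask. Note that ε has to be chosen on the scale of the residuals: if it lies below every residual once the warm-up ends, all points are dropped and F collapses to zero, while `eps=np.inf` reduces the sketch to plain RNMF.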

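Journal information
  • 《南京大学学报:自然科学版》 (Journal of Nanjing University: Natural Science Edition)
  • Chinese Science and Technology Core Journal
  • Supervising body: Ministry of Education of the People's Republic of China
  • Sponsor: Nanjing University
  • Editor-in-chief: Gong Changde (龚昌德)
  • Address: Editorial Office of the Journal of Nanjing University (Natural Science Edition), 22 Hankou Road, Nanjing
  • Postal code: 210093
  • Email: xbnse@netra.nju.edu.cn
  • Telephone: 025-83592704
  • International standard serial number: ISSN 0469-5097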
  • Domestic unified serial number: CN 32-1169/N
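  • Postal distribution code: 28-25
  • Awards:
  • Chinese Natural Science Core Journal; "Double Effect" journal of the China Periodical Matrix
  • Indexed in domestic and international databases:
  • Chemical Abstracts (USA, online edition), Mathematical Reviews (USA, online edition), Zentralblatt MATH (Germany), Chinese Science and Technology Core Journals, Peking University Chinese Core Journals (2000, 2004, 2008, 2011, and 2014 editions)
  • Citations: 9316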