Robust estimation is commonly used to handle outliers in industrial process data. The Fast-MCD algorithm has two drawbacks: its initial values are chosen at random, and for large samples the number of subsets must be set by hand. To address these, an improved robust estimation algorithm based on fuzzy clustering is proposed, in which the cluster centers and the number of clusters serve as the basis for choosing the initial values and the number of subsets of Fast-MCD, respectively. This improves computational efficiency and makes the partitioning of large samples more reasonable. The method is applied to temperature-conductivity modeling data for sodium aluminate solution, and the simulation results show that it identifies the outliers and thereby removes the undue influence of irregular data on soft-sensor modeling. Compared with Fast-MCD, the proposed method converges faster and is computationally more efficient.
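The idea of seeding an MCD-type estimator from fuzzy cluster centers can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the fuzzifier m = 2, the subset fraction 0.75, the rule "start from the h points nearest to each center", and the synthetic two-dimensional data are all assumptions made for illustration, and the use of the cluster count to partition large samples is not shown.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # memberships of each point sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def mahalanobis_sq(X, loc, cov):
    """Squared Mahalanobis distance of every row of X to (loc, cov)."""
    diff = X - loc
    return np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)

def c_steps(X, subset, h, max_iter=50):
    """Concentration steps: refit on the h points closest to the current fit."""
    prev_det = np.inf
    for _ in range(max_iter):
        loc = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        det = np.linalg.det(cov)
        if det >= prev_det or np.isclose(det, prev_det):
            break                                 # determinant stopped decreasing
        prev_det = det
        subset = np.argsort(mahalanobis_sq(X, loc, cov))[:h]
    return loc, cov, det

def fcm_initialised_mcd(X, n_clusters, h_frac=0.75):
    """Run C-steps once per fuzzy cluster center and keep the smallest determinant."""
    n, p = X.shape
    h = max(int(h_frac * n), (n + p + 1) // 2)
    centers, _ = fuzzy_c_means(X, n_clusters)
    best = None
    for center in centers:
        # Assumption: the initial subset is the h points nearest (Euclidean) to the center.
        start = np.argsort(np.linalg.norm(X - center, axis=1))[:h]
        loc, cov, det = c_steps(X, start, h)
        if best is None or det < best[2]:
            best = (loc, cov, det)
    return best[0], best[1]

# Synthetic data (not the paper's measurements): 95 points near a line plus 5 planted outliers.
rng = np.random.default_rng(1)
t = rng.normal(0.0, 1.0, 95)
inliers = np.column_stack([t, 0.8 * t + rng.normal(0.0, 0.2, 95)])
planted = rng.normal([3.0, -3.0], 0.1, size=(5, 2))
X = np.vstack([inliers, planted])

loc, cov = fcm_initialised_mcd(X, n_clusters=2)
# 97.5% chi-square quantile; this closed form holds for 2 degrees of freedom only.
cutoff = -2.0 * np.log(1.0 - 0.975)
flags = mahalanobis_sq(X, loc, cov) > cutoff
print("planted outliers flagged:", flags[-5:])
```

The sketch uses the raw covariance of the selected subset without a consistency correction or reweighting step, so it tends to flag some regular points as well. The Euclidean start rule also assumes the variables are on comparable scales, so real temperature and conductivity readings would likely need standardizing first.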