To address the difficulty of choosing the parameter h of the Minimum Covariance Determinant (MCD) estimator when detecting outliers (gross errors) in normally distributed samples, this paper proposes an improved algorithm, Modified MCD (M-MCD). M-MCD first determines the optimal parameter h by adaptive iteration, taking as its objective the minimum deviation between the standard deviation of the robust squared Mahalanobis distances computed from the MCD estimate and the theoretical standard deviation of the squared Mahalanobis distances of the population. Then, with this optimal h, it identifies outliers by testing the MCD-based robust Mahalanobis distances against the chi-squared distribution. A series of simulation experiments shows that the outlier detection results of MCD depend heavily on the parameter h, whereas M-MCD finds the optimal h through adaptive iteration, detects outliers well, and outperforms MCD.
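
The two stages described in the abstract can be written compactly. What follows is only a sketch in notation assumed here, not taken from the paper: n samples x_i of dimension p, MCD location and scatter estimates \hat{\mu}_h and \hat{\Sigma}_h obtained with subset size h, s_h the sample standard deviation of the robust squared distances, and a significance level \alpha. The one external fact used is that, for p-variate normal data, the squared Mahalanobis distance follows a chi-squared distribution with p degrees of freedom, whose standard deviation is \sqrt{2p}. The abstract does not say which samples enter s_h or how the iteration over h proceeds, so both are left open.

```latex
% Robust squared Mahalanobis distance of sample x_i under the MCD estimate with subset size h
d_i^2(h) = (x_i - \hat{\mu}_h)^{\top} \hat{\Sigma}_h^{-1} (x_i - \hat{\mu}_h), \qquad i = 1, \dots, n

% For x ~ N_p(\mu, \Sigma): d^2 ~ \chi^2_p, hence E[d^2] = p and sd[d^2] = \sqrt{2p}.
% Stage 1 (assumed form of the objective): pick the h whose robust distances best match that spread
h^{*} = \arg\min_{h} \left| s_h - \sqrt{2p} \right|

% Stage 2: decision rule at significance level \alpha (assumed notation)
x_i \text{ is flagged as an outlier if } d_i^2(h^{*}) > \chi^2_{p,\,1-\alpha}
```

For example, with p = 2 the target standard deviation is \sqrt{4} = 2, and the candidate values of h would usually lie between roughly n/2 and n, the usual range for MCD subset sizes.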