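Kernel principal component analysis (KPCA), when applied to nonlinear problems, is sensitive to interference points, and its principal components in feature space lack a clear physical meaning. To address these shortcomings, an improved fuzzy KPCA (Improved Fuzzy Kernel Principal Component Analysis, IFKPCA) algorithm is proposed. It weights each sample point and uses a distance-based feature kernel function and a radial basis kernel function to relate the reconstruction error in feature space to the error in input space. Simulation experiments were run with the algorithm on two data sets, without and with interference. Principal components were also extracted from drug metabolism data. The results show that IFKPCA weakens the influence of interference points on the sample distribution and exhibits good robustness; the distance-based feature kernel function depends strongly on the sample distribution, whereas the radial basis kernel function is robust to it. The drug metabolism application further demonstrates the effectiveness and feasibility of IFKPCA.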
When applied to nonlinear problems, kernel principal component analysis (KPCA) is sensitive to interference points, and the principal components in feature space lack a clear physical meaning. An improved fuzzy KPCA (IFKPCA) algorithm is proposed. By using a distance-based kernel function and a radial basis function, the reconstruction error in feature space is linked to the error in input space. Two experiments were carried out on data sets without and with interference. The algorithm was also applied to principal component extraction from drug metabolism data. The results show that IFKPCA weakens the effect of interference points on the sample distribution and is robust; the distance-based kernel function depends on the distribution of the data set, while the radial basis function is robust to it. The drug metabolism results further demonstrate the effectiveness and feasibility of IFKPCA.
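The idea of weighting each sample before extracting kernel principal components can be illustrated with a minimal sketch. This is not the IFKPCA algorithm itself, whose membership function and kernel details are not given in the abstract. The names `fuzzy_weights` and `weighted_kpca`, the distance-from-median membership rule, and the parameters `gamma` and `scale` are all assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Radial basis kernel: k(x, y) = exp(-gamma * ||x - y||^2)
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fuzzy_weights(X, scale=None):
    # ASSUMED membership rule (not from the paper): the weight decays with
    # the distance of a sample from the coordinate-wise median of the data.
    d = np.linalg.norm(X - np.median(X, axis=0), axis=1)
    if scale is None:
        scale = np.median(d) + 1e-12
    u = np.exp(-(d / scale) ** 2)
    return u / u.sum()  # normalized so the weights sum to 1

def weighted_kpca(X, w, gamma, n_components):
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    # Weighted centering in feature space: subtract the weighted mean.
    H = np.eye(n) - np.outer(np.ones(n), w)
    Kc = H @ K @ H.T
    # Eigenproblem of the weighted covariance, expressed through the kernel.
    s = np.sqrt(w)
    M = s[:, None] * Kc * s[None, :]
    evals, evecs = np.linalg.eigh(M)
    order = np.argsort(evals)[::-1][:n_components]
    lam, B = evals[order], evecs[:, order]
    # Projections of the training samples onto the unit-norm components.
    Z = Kc @ (s[:, None] * B) / np.sqrt(lam)
    return lam, Z

# Illustrative data: a Gaussian cloud plus one interference point.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
X[0] = [15.0, 15.0]
w = fuzzy_weights(X)
lam, Z = weighted_kpca(X, w, gamma=0.5, n_components=2)
```

Under this assumed membership rule the interference point receives a near-zero weight, so it contributes little to the weighted mean and covariance in feature space. For the radial basis kernel, the identity ||φ(x) − φ(y)||² = 2 − 2k(x, y) makes feature-space distance a monotone function of input-space distance, which is one way an error in feature space can be related to an error in input space.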