The classical fuzzy integral (FI) projects data from a high-dimensional space onto a one-dimensional space along straight lines determined by a linear integrand, so it cannot cover the irregularly distributed data found in real problems. This paper proposes an extension of the fuzzy integral that uses a Gaussian function as the integrand, called the Gaussian fuzzy integral (GFI). Because the curve of a Gaussian function follows the shape of a normal distribution, projecting along Gaussian curves rather than straight lines covers a wider range of the data, and a new classifier is built on this basis. In the experiments, the classifier is applied to several benchmark datasets to verify its performance. The results show that GFI makes better use of the characteristics of the fuzzy integral and achieves clearly better classification accuracy than the classical FI. Finally, GFI is applied to hepatitis B virus (HBV) gene data for cancer diagnosis. All cases come from the Wales Hospital of Hong Kong and comprise confirmed cancer patients and suspected ones, and the aim is to determine the true status of these cases. The results show that GFI attains high testing sensitivity. Medical researchers care more about this index than about accuracy, because the goal is to miss as few real patients as possible.
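To make the distinction between sensitivity and accuracy concrete, the following is a minimal sketch of how the two indices are computed from predicted and true labels. It uses the standard definition, sensitivity = TP / (TP + FN). The function names and the label vectors are hypothetical illustrations and are not taken from the HBV data or from the paper's experiments.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, fp, tn) for a binary labelling."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1   # real patient, detected
        elif t == positive:
            fn += 1   # real patient, missed
        elif p == positive:
            fp += 1   # non-patient flagged as patient
        else:
            tn += 1   # non-patient, correctly cleared
    return tp, fn, fp, tn


def sensitivity(y_true, y_pred, positive=1):
    """Fraction of real patients that the classifier detects."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred, positive)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true")
    return tp / (tp + fn)


def accuracy(y_true, y_pred, positive=1):
    """Fraction of all cases that are labelled correctly."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fn + fp + tn)


# Hypothetical labels (1 = patient, 0 = non-patient), for illustration only.
y_true   = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred_a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # finds every patient, two false alarms
y_pred_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # no false alarms, misses two patients

for name, y_pred in (("A", y_pred_a), ("B", y_pred_b)):
    print(name, "sensitivity:", sensitivity(y_true, y_pred),
          "accuracy:", accuracy(y_true, y_pred))
```

In this made-up example both classifiers label eight of ten cases correctly, yet classifier B misses half of the real patients. This is the kind of difference that accuracy alone hides and that sensitivity exposes.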