From a discrete perspective, the effectiveness of functional adaptive-weight clustering depends on the optimal choice of basis functions, for which no objective, unified criterion currently exists. Building on the Karhunen-Loeve expansion theorem for stochastic processes, this paper extends functional adaptive-weight clustering analysis to a continuous perspective. Compared with existing functional data clustering methods, the extended model has three core advantages: (1) the Karhunen-Loeve expansion maps the infinite-dimensional function space to a multivariate statistical space, which avoids the subjective arbitrariness of manually selecting basis functions; (2) an adaptive-weight distance is reconstructed according to the importance of the random variables and used as the similarity measure between functions, with a theoretical basis that establishes its necessity and rationality; (3) while fully preserving the information in the original data, classical finite-dimensional multivariate analysis methods can be applied to solve the infinite-dimensional functional clustering problem. Empirical tests show that the new model reduces the computational cost of the clustering process and significantly improves classification accuracy, robustness, and general applicability.
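The pipeline summarized above (Karhunen-Loeve scores first, then a weighted distance on those scores) can be illustrated with a minimal sketch. It is not the paper's implementation. The function names `kl_scores` and `weighted_kmeans`, the synthetic curves, the uniform sampling grid, and in particular the choice of weights as eigenvalue shares are all assumptions made for illustration, since the abstract does not state the exact weight formula.

```python
import numpy as np

def kl_scores(curves, t, n_components):
    """Empirical Karhunen-Loeve scores for curves sampled on a uniform grid t."""
    dt = t[1] - t[0]
    centered = curves - curves.mean(axis=0)
    # Sample covariance function evaluated on the grid.
    cov = centered.T @ centered / (curves.shape[0] - 1)
    # Eigen-decomposition of the discretised covariance operator.
    eigvals, eigvecs = np.linalg.eigh(cov * dt)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = eigvals[order]
    # Rescale eigenvectors so each eigenfunction has unit L2 norm on the grid.
    eigfuns = eigvecs[:, order] / np.sqrt(dt)
    # Scores approximate the integral of (x - mean) * eigenfunction.
    scores = centered @ eigfuns * dt
    return scores, eigvals

def weighted_kmeans(scores, weights, k, n_iter=50, seed=0):
    """Lloyd's k-means under a weighted Euclidean distance on the scores."""
    rng = np.random.default_rng(seed)
    # A weighted Euclidean distance equals the plain one after rescaling by sqrt(w).
    z = scores * np.sqrt(weights)
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Keep the previous center if a cluster becomes empty.
        new_centers = np.array([
            z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic demo data (hypothetical): two groups of noisy curves.
rng = np.random.default_rng(42)
t = np.linspace(0.0, 1.0, 50)
base = np.sin(2 * np.pi * t)
curves = np.vstack([
    base + 0.1 * rng.standard_normal((20, t.size)),
    -base + 0.1 * rng.standard_normal((20, t.size)),
])

scores, eigvals = kl_scores(curves, t, n_components=3)
# Assumed weighting: share of variance carried by each retained component.
weights = eigvals / eigvals.sum()
labels = weighted_kmeans(scores, weights, k=2)
```

Under these assumptions the clustering step only ever sees a small score matrix (here 40 by 3), which is the sense in which a finite-dimensional method stands in for the infinite-dimensional problem. Any other weight rule can be substituted by changing the single line that defines `weights`.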