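To address the slow data processing of existing PCA methods when reducing the dimensionality of large data sets, a method that fits the variance contribution rate function by curve fitting is designed and implemented, and it is applied to intrinsic dimension estimation for plant leaves. To improve the accuracy of intrinsic dimension estimation, an estimation method that combines a rough estimate of the intrinsic dimension interval with an exact determination is proposed. To verify the effectiveness of the algorithm, recognition tests were carried out on 150 samples from 5 classes of plant leaves. The experimental results show that the proposed method achieves classification performance close to that of PCA while its recognition time is far shorter, which indicates that applying the method to intrinsic dimension estimation of high-dimensional data sets is effective and feasible.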
The commonly used PCA method is slow when reducing the dimensionality of high-dimensional data. To solve this problem, an improved PCA method based on curve fitting of the variance contribution data is presented and applied to the classification of leaves. To improve the precision of intrinsic dimension estimation, a combined method of "rough estimation of the dimension range and accurate measurement" is used. To test the performance of the presented method, leaves of 5 kinds (Ginkgo, Ligustrum lucidum, Acer mono, wintersweet and Diospyros lotus) are used in the experiment, with 150 samples in total. Experimental results show that the proposed method obtains a similar classification result while its recognition time is much smaller than that of the commonly used PCA method, which shows that the proposed method is effective and feasible for high-dimensional data reduction.
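The two-stage idea summarized above (a rough estimate of the dimension range from a fitted variance contribution curve, followed by an exact check inside that range) can be illustrated with a minimal sketch. The abstract does not give the fitting model, the threshold, or the sampling scheme, so everything below is an assumption made for illustration: the saturating exponential model `1 - a*exp(-b*k)`, the 0.95 threshold, the sampled component counts `sample_ks`, the `margin` parameter, and all function names are hypothetical and are not taken from the paper. For simplicity the sketch also starts from a full eigenvalue spectrum, so it shows only the logic of the estimate and not the speed-up.

```python
import numpy as np

def cumulative_contribution(eigvals):
    """Cumulative variance contribution rate for k = 1..d components."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return np.cumsum(eigvals) / eigvals.sum()

def fit_contribution_curve(ks, rates):
    """Fit rate(k) ~ 1 - a*exp(-b*k) (assumed model) by a linear
    least-squares fit of log(1 - rate) against k."""
    ks = np.asarray(ks, dtype=float)
    rates = np.asarray(rates, dtype=float)
    slope, intercept = np.polyfit(ks, np.log(1.0 - rates), 1)
    return np.exp(intercept), -slope  # a, b

def estimate_intrinsic_dim(eigvals, threshold=0.95,
                           sample_ks=(1, 2, 4, 8), margin=3):
    """Rough interval from the fitted curve, then an exact check inside it."""
    rates = cumulative_contribution(eigvals)
    d = len(rates)
    # Use only a few sampled points for the fit (hypothetical sampling scheme).
    ks = [k for k in sample_ks if k <= d and rates[k - 1] < 1.0]
    a, b = fit_contribution_curve(ks, rates[np.array(ks) - 1])
    # Stage 1: solve 1 - a*exp(-b*k) = threshold for k, then widen by margin.
    k_hat = np.log(a / (1.0 - threshold)) / b
    lo = max(1, int(np.floor(k_hat)) - margin)
    hi = min(d, int(np.ceil(k_hat)) + margin)
    # Stage 2: smallest k in [lo, hi] whose exact rate reaches the threshold.
    for k in range(lo, hi + 1):
        if rates[k - 1] >= threshold:
            return k, (lo, hi)
    return hi, (lo, hi)  # threshold not reached inside the interval

# Synthetic, geometrically decaying spectrum used only for demonstration.
eigvals = 0.5 ** np.arange(20)
k, interval = estimate_intrinsic_dim(eigvals, threshold=0.95)
print(k, interval)
```

In this sketch the fitted curve only narrows the search; the returned dimension always comes from the exact contribution rates inside the interval, so a poor fit widens or misplaces the interval rather than silently changing the criterion.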