To cluster the structured data in a graph database effectively, the proposed algorithm first mines deep features from the individual graph samples: it constructs an association matrix that captures the hierarchical connection relationships between nodes and combines it with the Laplacian matrix to perform spectral feature analysis. It then builds a similarity matrix with a Gaussian kernel function, normalizing the similarities to the range 0 to 1 to simplify later processing. Finally, it combines graph partitioning with the k-means algorithm to split the similarity matrix into k parts, yielding k clusters. Extensive experiments show that the improved Laplacian matrix gives a finer division of the internal structure of the samples and improves the quality of sample preprocessing. The minimum ratio cut algorithm, while preserving accuracy, converts the NP-hard problem into one that can be solved in polynomial time, which improves the efficiency of the algorithm.
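The three stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the improved Laplacian or the association matrix, so the sketch substitutes the standard Laplacian L = D - A as the spectral feature. The minimum ratio cut is approximated with the usual spectral relaxation (eigenvectors of the Laplacian of the similarity matrix followed by k-means). All function names, the zero-padding of spectra, the farthest-point initialization, and the value of `sigma` are assumptions made for illustration.

```python
import numpy as np


def laplacian_spectrum(adj, size):
    """Sorted eigenvalues of the standard Laplacian L = D - A, zero-padded to `size`.

    Assumption: stands in for the paper's improved Laplacian features.
    """
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    eig = np.sort(np.linalg.eigvalsh(lap))
    out = np.zeros(size)
    out[: len(eig)] = eig
    return out


def gaussian_similarity(features, sigma=1.0):
    """Gaussian kernel similarity; every entry lies in (0, 1], diagonal is 1."""
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def kmeans(points, k, iters=100):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from each point to its nearest already-chosen center.
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cluster_graphs(adjs, k, sigma=1.0):
    """Cluster graph samples: spectral features -> similarity -> relaxed ratio cut."""
    size = max(len(a) for a in adjs)
    feats = np.array([laplacian_spectrum(a, size) for a in adjs])
    sim = gaussian_similarity(feats, sigma)
    # Relaxed ratio cut: eigenvectors of the k smallest eigenvalues of D - W.
    lap = np.diag(sim.sum(axis=1)) - sim
    _, vecs = np.linalg.eigh(lap)  # eigenvalues are returned in ascending order
    return kmeans(vecs[:, :k], k)


# Usage: two sparse graphs (path, cycle) and two dense graphs (K5, K5 minus an edge).
path = np.diag(np.ones(4), 1)
path = path + path.T
cycle = path.copy()
cycle[0, 4] = cycle[4, 0] = 1.0
complete = np.ones((5, 5)) - np.eye(5)
near_complete = complete.copy()
near_complete[0, 1] = near_complete[1, 0] = 0.0

labels = cluster_graphs([path, cycle, complete, near_complete], k=2, sigma=2.0)
print(labels)
```

The relaxation is what makes the partitioning step tractable: the discrete ratio cut objective is replaced by an eigenvalue problem, and k-means then converts the continuous eigenvector coordinates back into k discrete clusters.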