High-order co-clustering is generally converted into the problem of a consistent ensemble of multiple pairwise two-order co-clustering results: a weighted linear combination of the two-order objective functions serves as the high-order objective function, and the clustering result is obtained by alternating iteration. However, existing algorithms still preset the weights according to expert experience, and automatically determining the optimal weights of the linear combination remains a classic difficult problem. For star-structured high-order heterogeneous data, this paper proposes an ideal-point-based consistent ensemble strategy that determines the weights automatically, where the ideal point is the point in the objective space formed by the optimal values of the individual two-order co-clustering objective functions. Taking the relative distance between each two-order co-clustering result and its ideal result as the measure of clustering quality resolves the incommensurability of the two-order clustering qualities, so that the relative distance between the high-order objective function value and the ideal point is ultimately minimized. Because the ideal-point-based strategy can solve the consistent ensemble problem for a variety of star-structured high-order co-clustering algorithms, it has a degree of generality. Experimental results show that the method effectively improves the clustering performance of five classic high-order co-clustering algorithms.
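To illustrate the relative-distance idea, the following minimal Python sketch compares two candidate results against an ideal point. It is only a sketch under stated assumptions: the function name `relative_distance`, the Euclidean form of the distance, the nonzero ideal values, and all numeric values are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
import math


def relative_distance(values, ideal):
    """Distance from objective values to the ideal point, with each
    coordinate scaled by its own ideal value (assumed nonzero).

    The Euclidean form is an assumption made for illustration only.
    """
    if len(values) != len(ideal):
        raise ValueError("values and ideal must have the same length")
    total = 0.0
    for f, f_star in zip(values, ideal):
        if f_star == 0:
            raise ValueError("ideal values must be nonzero for relative scaling")
        # Relative deviation of one two-order objective from its own optimum.
        total += ((f - f_star) / f_star) ** 2
    return math.sqrt(total)


# Hypothetical ideal point: the best value of each two-order objective
# when it is optimized on its own. The two objectives have very different scales.
ideal = [10.0, 2000.0]

# Two hypothetical joint clustering results (objective values to be minimized).
candidate_a = [15.0, 2010.0]   # 50% and 0.5% above the respective optima
candidate_b = [10.5, 2100.0]   # 5% and 5% above the respective optima

# Raw excess over the ideal point is dominated by the large-scale objective.
raw_excess_a = sum(f - s for f, s in zip(candidate_a, ideal))
raw_excess_b = sum(f - s for f, s in zip(candidate_b, ideal))

# Relative distances put both objectives on a common, unit-free scale.
dist_a = relative_distance(candidate_a, ideal)
dist_b = relative_distance(candidate_b, ideal)

print("raw excess:", raw_excess_a, raw_excess_b)
print("relative distance:", dist_a, dist_b)
```

In this toy case the unscaled sum prefers `candidate_a`, while the relative distance prefers `candidate_b`, which is balanced across both objectives. This is the sense in which scaling by the ideal values makes the two-order clustering qualities comparable without hand-set weights.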