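To study the partitioning of dynamic heterogeneous information networks, an evolutionary clustering algorithm for dynamic heterogeneous information networks with a star schema is proposed that exploits the sparsity of such networks. First, from the perspective of compatibility, the heterogeneous information network is transformed into several compatible bipartite graphs, and temporally smoothed bipartite graphs are constructed so that each expresses the relations among nodes at a given time and at earlier times. Then, random projection and a linear-time solver are used to quickly compute an approximate commute time embedding of each temporally smoothed bipartite graph, yielding multiple indicator subsets that indicate the target dataset. Finally, the weighted sum of the distances between all indicator data that indicate the same target object and the centers of the clusters carrying the same label is computed, and the k-means method determines the cluster to which the target object belongs. Validation shows that the algorithm partitions dynamic heterogeneous information networks with relatively high accuracy and relatively fast computation.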
To cluster dynamic heterogeneous information networks, this paper proposes a fast evolutionary clustering algorithm for dynamic heterogeneous information networks with a star schema that exploits the sparsity of such networks. First, the heterogeneous information network is transformed into multiple compatible bipartite graphs from the point of view of compatibility, and a temporal smoothing bipartite graph is constructed for each so that it represents the relations between nodes at a given time and at the times before it. Next, the approximate commute time embedding of each temporal smoothing bipartite graph is computed via random projection and a linear-time solver, which yields multiple embedding subsets for the target dataset. Finally, for each target object, the sum of the weighted distances between all indicators of that object in the embedding subsets and the centers of the clusters with the same label is computed, and k-means assigns the object to a cluster. Validation shows that the proposed algorithm partitions dynamic heterogeneous information networks with high accuracy and fast computation.
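
The three steps summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not specify: the temporal smoothing is taken to be an exponential blend S_t = alpha * W_t + (1 - alpha) * S_{t-1}, the linear-time solver is replaced by a plain conjugate gradient iteration, distances are squared Euclidean, and every function name and parameter (`smooth_bipartite`, `commute_time_embedding`, `assign_clusters`, `alpha`, `k`, `view_weights`) is hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def smooth_bipartite(snapshots, alpha=0.5):
    """Blend bipartite weight matrices over time (assumed rule, not from the paper):
    S_t = alpha * W_t + (1 - alpha) * S_{t-1}."""
    smoothed = [row[:] for row in snapshots[0]]
    for current in snapshots[1:]:
        smoothed = [[alpha * w + (1 - alpha) * s for w, s in zip(w_row, s_row)]
                    for w_row, s_row in zip(current, smoothed)]
    return smoothed

def laplacian_matvec(edges, x):
    """Multiply the graph Laplacian by x using only the edge list (sparse)."""
    y = [0.0] * len(x)
    for u, v, w in edges:
        d = w * (x[u] - x[v])
        y[u] += d
        y[v] -= d
    return y

def solve_laplacian(edges, b, tol=1e-20, max_iter=1000):
    """Conjugate gradient for L z = b; a stand-in for the linear-time solver."""
    x = [0.0] * len(b)
    r = b[:]
    p = r[:]
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs < tol:  # squared residual norm is small enough
            break
        Ap = laplacian_matvec(edges, p)
        curvature = dot(p, Ap)
        if curvature <= 0.0:
            break
        step = rs / curvature
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * ai for ri, ai in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def commute_time_embedding(weights, k=16, seed=0):
    """Embed the nodes of a weighted bipartite graph in k dimensions so that
    squared Euclidean distances approximate commute times."""
    rng = random.Random(seed)
    n_rows, n_cols = len(weights), len(weights[0])
    edges = [(i, n_rows + j, w)
             for i, row in enumerate(weights)
             for j, w in enumerate(row) if w > 0]
    n = n_rows + n_cols
    volume = 2.0 * sum(w for _, _, w in edges)  # sum of weighted degrees
    coords = []
    for _ in range(k):
        # One row of Q * W^(1/2) * B, with Q entries drawn from +-1/sqrt(k).
        y = [0.0] * n
        for u, v, w in edges:
            q = rng.choice((-1.0, 1.0)) / math.sqrt(k)
            y[u] += q * math.sqrt(w)
            y[v] -= q * math.sqrt(w)
        z = solve_laplacian(edges, y)
        coords.append([math.sqrt(volume) * zi for zi in z])
    return [list(point) for point in zip(*coords)]  # one k-vector per node

def assign_clusters(views, centers, view_weights):
    """views[g][i]: vector of target object i in bipartite graph g;
    centers[g][c]: center of cluster c in graph g.
    Each object goes to the label with the smallest weighted distance sum."""
    labels = []
    for i in range(len(views[0])):
        costs = [sum(beta * sq_dist(view[i], cen[c])
                     for beta, view, cen in zip(view_weights, views, centers))
                 for c in range(len(centers[0]))]
        labels.append(costs.index(min(costs)))
    return labels
```

In a full k-means loop under these assumptions, `assign_clusters` would alternate with recomputing the per-graph centers from the current labels until the labels stop changing. The first `n_rows` vectors returned by `commute_time_embedding` correspond to the row-side nodes of the bipartite graph, which is where the target objects would sit if the weight matrix is laid out that way.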