A stochastic block model can generate networks with a wide variety of structures (called general communities, including traditional communities, bipartite structures, and hierarchical structures), and it can also detect general communities in a network according to the principle of stochastic equivalence. However, the simple stochastic block model has many problems in modeling the network generation process and in model learning, so it fits real networks poorly. Its extension, the GSB (general stochastic block) model, detects general communities based on the idea of link communities, but its time complexity limits its application to medium and large networks. To explore the latent structure of networks of different scales without any prior knowledge, a fast algorithm based on the GSB model, FGSB, is designed to detect general communities more quickly. FGSB learns the network structure parameters dynamically during iteration. It reorganizes the parameters of the GSB model to eliminate unnecessary ones, which reduces storage, and it prunes the parameters of converged nodes and edges to reduce the computation in each iteration, which saves running time. FGSB has the same structure detection ability as the algorithm for solving the GSB model, but it uses less storage and less running time. Experiments on synthetic and real-world networks of different scales show that, at approximately the same accuracy, FGSB runs faster than the algorithm for solving the GSB model and can detect general communities in large networks.
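As an illustration of the first claim only, the following minimal sketch samples graphs from a simple stochastic block model and shows how the block probability matrix alone decides whether the result has traditional communities or a bipartite structure. It is not the GSB or FGSB algorithm; the function names `sample_sbm` and `within_block_fraction`, the block sizes, and the probability values are all assumptions chosen for the example.

```python
import random


def sample_sbm(block_sizes, block_probs, seed=0):
    """Sample an undirected graph from a simple stochastic block model.

    block_sizes[r] is the number of nodes in block r (illustrative values).
    block_probs[r][s] is the probability of an edge between a node in
    block r and a node in block s.
    Returns (labels, edges), where labels[i] is the block of node i.
    """
    rng = random.Random(seed)
    labels = [r for r, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Nodes in the same block share the same connection probabilities,
            # which is what stochastic equivalence means here.
            if rng.random() < block_probs[labels[i]][labels[j]]:
                edges.append((i, j))
    return labels, edges


def within_block_fraction(labels, edges):
    """Fraction of edges whose two endpoints lie in the same block."""
    if not edges:
        return 0.0
    same = sum(1 for i, j in edges if labels[i] == labels[j])
    return same / len(edges)


# Traditional communities: dense inside blocks, sparse between them.
assortative = [[0.5, 0.02], [0.02, 0.5]]
# Bipartite structure: no edges inside blocks, dense between them.
bipartite = [[0.0, 0.5], [0.5, 0.0]]

labels_a, edges_a = sample_sbm([30, 30], assortative)
labels_b, edges_b = sample_sbm([30, 30], bipartite)

print("assortative within-block fraction:", within_block_fraction(labels_a, edges_a))
print("bipartite within-block fraction:", within_block_fraction(labels_b, edges_b))
```

With the assortative matrix most edges fall inside a block, while with the bipartite matrix none do, even though the same sampling procedure is used in both cases. This is the sense in which a single block model covers several kinds of general community.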