This paper studies the problem of mining top-k dense subgraphs from uncertain graphs. Because uncertainty is inherent in graph data, the definitions of dense subgraphs and the mining algorithms developed for deterministic graphs do not apply to uncertain graphs. The paper therefore proposes the concept of expected density on uncertain graphs and gives a method for computing it in polynomial time. Based on this definition, it defines a partial order over the induced subgraphs of an uncertain graph and uses that order to organize the induced subgraphs into a search (enumeration) tree. It is rigorously proved that this tree covers every induced subgraph of the uncertain graph exactly once. On this tree, the paper presents DS, a branch-and-bound search algorithm that mines the top-k dense subgraphs exactly. It also introduces the notion of disjoint top-k dense subgraphs and gives LS, a heuristic approximation algorithm based on beam search. Experiments on multiple datasets indicate that DS is efficient and scalable and can be used to process large-scale graph data, and that LS quickly finds disjoint top-k dense subgraphs with good approximation quality.
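To illustrate why an expected density can be computable in polynomial time, the following minimal sketch may help. It rests on assumptions that the abstract does not state: edges exist independently with known probabilities, and the density of an induced subgraph is its number of edges divided by its number of vertices. The paper's actual definition may differ. Under these assumptions, linearity of expectation reduces the expected density to the sum of the edge probabilities divided by the vertex count, so there is no need to enumerate the exponentially many possible worlds. The function names and the example graph are hypothetical.

```python
from itertools import product


def expected_density(vertices, edge_prob):
    """Expected |E|/|V| of the induced subgraph, by linearity of expectation.

    Assumes independent edges and density = edges / vertices (an assumption
    made for this sketch, not a definition taken from the paper).
    """
    vs = set(vertices)
    if not vs:
        return 0.0
    total = sum(p for (u, v), p in edge_prob.items() if u in vs and v in vs)
    return total / len(vs)


def expected_density_bruteforce(vertices, edge_prob):
    """Same quantity, computed by enumerating every possible world (exponential)."""
    vs = set(vertices)
    if not vs:
        return 0.0
    edges = [(e, p) for e, p in edge_prob.items() if e[0] in vs and e[1] in vs]
    result = 0.0
    for world in product([0, 1], repeat=len(edges)):
        prob = 1.0
        for present, (_, p) in zip(world, edges):
            prob *= p if present else 1.0 - p
        # Density of this world, weighted by the probability of the world.
        result += prob * sum(world) / len(vs)
    return result


# Hypothetical uncertain graph: each edge maps to its existence probability.
edge_prob = {("a", "b"): 0.9, ("b", "c"): 0.5, ("a", "c"): 0.2, ("c", "d"): 0.7}
fast = expected_density({"a", "b", "c"}, edge_prob)
slow = expected_density_bruteforce({"a", "b", "c"}, edge_prob)
```

Both functions return the same value for the same input. The first runs in time linear in the number of edges, while the second grows exponentially with the number of edges in the induced subgraph.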