Following the success of frequent itemset and frequent sequence mining, data mining research has naturally turned to structured pattern mining, in particular frequent subgraph mining. Such patterns are needed in many applications, including chemistry, biology, computer networks, and the World Wide Web. This paper proposes a new algorithm, GraphGen, for mining frequent subgraphs. GraphGen avoids the costly computations of graph mining by extending frequent subtrees. The best existing frequent subgraph mining algorithm has time complexity O(n^3·2^n), where n is the number of frequent edges in the graph dataset. GraphGen has time complexity O(2^n·n^2.5/log n), an improvement by a factor of O(√n·log n) over the best existing algorithm. Experimental results confirm this theoretical analysis.
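The stated improvement factor can be checked directly from the two complexity bounds. The short derivation below is a sketch, assuming the exponent of n in GraphGen's bound is 2.5, which is the reading consistent with the stated speedup:

```latex
% Assumption: GraphGen's bound is read as O(2^n n^{2.5} / \log n).
% Ratio of the best existing bound to GraphGen's bound:
\[
  \frac{n^{3}\cdot 2^{n}}{2^{n}\cdot n^{2.5}/\log n}
  \;=\; n^{3-2.5}\,\log n
  \;=\; \sqrt{n}\,\log n .
\]
```

The 2^n terms cancel, so both algorithms remain exponential in the number of frequent edges; the gain is confined to the polynomial factor.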