Traditional Apriori-based frequent subgraph mining algorithms generate a large number of redundant subgraphs. To address this problem, a new frequent subgraph mining algorithm, GAI, is proposed. A three-level index structure, MADI, is introduced to store information about the graph set and thereby reduce the number of scans over it. Frequent subgraphs are constructed by extending an ETree, and candidate subgraphs are stored in tables. This avoids both the generation of redundant subgraphs during extension and scans of the entire database, which simplifies support computation and improves the efficiency of graph and subgraph isomorphism queries. Experimental results show that GAI achieves higher mining efficiency than the Apriori algorithm.
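The idea of using an index to avoid full database scans during support computation can be illustrated with a minimal sketch. The abstract does not specify the layout of the three-level MADI index or of the ETree, so the code below is not GAI itself: it assumes a plain inverted index from labeled edges to graph ids as a stand-in, and the names `edge_key`, `build_edge_index`, `candidate_graph_ids`, and `frequent_edges` are hypothetical. Intersecting the id sets yields only an upper bound on a candidate's support; a subgraph isomorphism test would still be needed on the surviving graphs.

```python
from collections import defaultdict


def edge_key(lu, le, lv):
    """Canonical key for an undirected labeled edge (endpoint labels sorted)."""
    a, b = sorted((lu, lv))
    return (a, le, b)


def build_edge_index(graphs):
    """Map each labeled edge to the set of graph ids that contain it.

    `graphs` is a dict: graph id -> list of (u_label, edge_label, v_label).
    The database is read once here (assumed stand-in for the MADI index).
    """
    index = defaultdict(set)
    for gid, edges in graphs.items():
        for lu, le, lv in edges:
            index[edge_key(lu, le, lv)].add(gid)
    return index


def frequent_edges(index, min_support):
    """Labeled edges contained in at least `min_support` graphs."""
    return {k for k, gids in index.items() if len(gids) >= min_support}


def candidate_graph_ids(index, candidate_edges):
    """Ids of graphs containing every labeled edge of the candidate.

    Only these graphs can contain the candidate, so only they need an
    isomorphism check; the rest of the database is never touched.
    """
    id_sets = [index.get(edge_key(*e), set()) for e in candidate_edges]
    return set.intersection(*id_sets) if id_sets else set()


# Toy graph set: each graph is given as a list of labeled edges.
graphs = {
    0: [("C", "-", "C"), ("C", "-", "O")],
    1: [("C", "-", "O"), ("O", "-", "H")],
    2: [("C", "-", "C"), ("C", "-", "O"), ("O", "-", "H")],
}
index = build_edge_index(graphs)
survivors = candidate_graph_ids(index, [("C", "-", "O"), ("O", "-", "H")])
print(sorted(survivors))  # graphs that may contain the C-O-H candidate
```

In this sketch the candidate's support is bounded by `len(survivors)`, so a candidate whose bound falls below the minimum support can be discarded without any isomorphism test.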