To address the large computational redundancy that arises when mining high expected weighted itemsets from uncertain datasets, a two-stage, Apriori-like algorithm based on a downward closure property is proposed. Traditional level-wise mining algorithms rely only on an upper bound for each itemset, which handles high expected weights poorly. A downward closure property is therefore designed to supplement the level-wise algorithm, and its proof is given; this property effectively reduces the number of candidate itemsets that must be processed without sacrificing accuracy. The mining process is organized in two stages. In the first stage, a level-wise search obtains a set of candidate high expected weighted itemsets. In the second stage, the database is scanned again to obtain the expected weight of each candidate itemset, which completes the mining process. Comparative simulation results on standard datasets show that the proposed algorithm substantially improves computational efficiency while preserving accuracy.
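The two-stage process described above can be illustrated with a minimal Python sketch. The abstract does not state its formulas, so every definition here is an assumption made for illustration: the expected weighted support of an itemset is taken as its average item weight times the sum, over the transactions that contain it, of the product of its item probabilities; the upper bound is taken as the sum, over the same transactions, of the largest item weight times the largest item probability in each transaction. Under these assumed definitions the upper bound never increases when an itemset grows, which is the downward closure behaviour that level-wise pruning needs. The function names `upper_bound`, `expected_weighted_support`, `stage_one`, and `stage_two` are hypothetical, as is the toy data.

```python
from itertools import combinations
from math import prod


def contains(transaction, itemset):
    """True if every item of the itemset appears in the transaction."""
    return all(item in transaction for item in itemset)


def upper_bound(itemset, db, weights):
    """Assumed upper bound: per-transaction max weight times max probability."""
    return sum(
        max(weights[i] for i in t) * max(t.values())
        for t in db
        if contains(t, itemset)
    )


def expected_weighted_support(itemset, db, weights):
    """Assumed measure: average item weight times expected support."""
    avg_weight = sum(weights[i] for i in itemset) / len(itemset)
    exp_support = sum(prod(t[i] for i in itemset) for t in db if contains(t, itemset))
    return avg_weight * exp_support


def stage_one(db, weights, threshold):
    """Level-wise search that keeps itemsets whose upper bound meets the threshold."""
    level = {frozenset([i]) for i in weights}
    survivors = []
    k = 1
    while level:
        kept = {c for c in level if upper_bound(c, db, weights) >= threshold}
        survivors.extend(kept)
        k += 1
        # Join step: unions of two kept itemsets that have exactly k items.
        joined = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step (downward closure): every (k-1)-subset must have been kept.
        level = {
            c for c in joined
            if all(frozenset(s) in kept for s in combinations(c, k - 1))
        }
    return survivors


def stage_two(candidates, db, weights, threshold):
    """Second database scan: compute the actual measure and filter candidates."""
    result = {}
    for c in candidates:
        value = expected_weighted_support(c, db, weights)
        if value >= threshold:
            result[c] = value
    return result


# Toy uncertain database (hypothetical): each transaction maps item -> probability.
db = [
    {"A": 0.9, "B": 0.8},
    {"A": 0.5, "C": 0.4},
    {"B": 0.7, "C": 0.9},
]
weights = {"A": 0.2, "B": 0.8, "C": 0.5}  # hypothetical item weights
threshold = 0.2 * len(db)  # assumed: minimum ratio times database size

candidates = stage_one(db, weights, threshold)
result = stage_two(candidates, db, weights, threshold)
for itemset, value in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), round(value, 4))
```

On this toy data the first stage keeps five candidates (the three single items plus {A, B} and {B, C}), and the candidate {A, B, C} is never evaluated because its subset {A, C} was already pruned. The second scan then retains only {B} and {C}, which shows why the upper bound alone overestimates and a verification pass is needed.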