Frequent pattern mining and existing correlated pattern mining cannot completely eliminate suspicious cross-support patterns or suspicious patterns that contain negatively correlated items. To address this, the new problem of mining associated and item-item positively correlated frequent patterns is posed, together with a solution. A novel correlation interest measure, all-item-confidence, is presented, and its properties, including suitable upper and lower bounds and anti-monotonicity, are discussed. All-item-confidence is used to describe the item-item positive correlation of a pattern, which effectively filters out suspicious patterns containing negatively correlated items; meanwhile, all-confidence is used to describe the association of a pattern, which removes suspicious cross-support patterns. The relevant definitions are then given, and two mining algorithms, ItemCoMine_AP and ItemCoMine_CT, are proposed. The performance of the algorithms, the pruning effect of the measures, and the results of applying them to a real retail dataset were tested. The experimental results show that both algorithms perform well, that all-confidence and all-item-confidence have a clear pruning effect on suspicious patterns, and that the mined associated and item-item positively correlated patterns have good application value.
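As a rough illustration of the two filters named in the abstract, the sketch below computes all-confidence in its commonly used form (the support of a pattern divided by the largest support of any single item in it) and applies a pairwise check for positive correlation. The abstract does not give the formula for all-item-confidence, so the pairwise check here is a stand-in assumption based on lift (joint support greater than the product of the item supports), not the paper's measure. The function names and the toy transactions are likewise hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def all_confidence(itemset, transactions):
    """Support of the pattern divided by the largest single-item support.

    A low value signals a cross-support pattern: a rare item combined
    with a very frequent one.
    """
    max_item_sup = max(support([i], transactions) for i in itemset)
    if max_item_sup == 0:
        return 0.0
    return support(itemset, transactions) / max_item_sup


def pairwise_positively_correlated(itemset, transactions):
    """Stand-in check (an assumption, not the paper's all-item-confidence):
    every pair of items must co-occur more often than independence predicts."""
    for a, b in combinations(sorted(itemset), 2):
        joint = support([a, b], transactions)
        expected = support([a], transactions) * support([b], transactions)
        if joint <= expected:
            return False
    return True


# Toy retail data: bread and milk sell together, caviar is rare,
# and tea is never bought together with bread.
transactions = (
    [{"bread", "milk"}] * 6
    + [{"bread", "milk", "caviar"}, {"bread"}, {"milk"}, {"tea"}]
)

for pattern in ({"bread", "milk"}, {"bread", "caviar"}, {"bread", "tea"}):
    print(
        sorted(pattern),
        round(all_confidence(pattern, transactions), 3),
        pairwise_positively_correlated(pattern, transactions),
    )
```

In this toy data, {bread, caviar} passes the pairwise check but has a low all-confidence, while {bread, tea} fails the pairwise check, which is why the abstract pairs an association measure with an item-item correlation measure rather than relying on either alone.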