Improved Pattern Tree for Incremental Frequent-Pattern Mining
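  • ISSN: 1006-4982
  • Journal: 《天津大学学报:英文版》 (Transactions of Tianjin University)
  • Date: 0
  • Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology] TP311.12 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliation: [1] School of Mechanical Engineering, Tianjin University, Tianjin 300072, China
  • Funding: Supported by National Natural Science Foundation of China (No. 50975193) and Specialized Research Fund for Doctoral Program of Higher Education of China (No. 20060056016).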
Abstract:

Based on an analysis of existing prefix-tree data structures, an improved pattern tree is introduced for processing new transactions. It first stores transactions in a lexicographic-order tree and then restructures the tree by sorting each path in frequency-descending order. Updating the improved pattern tree requires neither rescanning the entire updated database nor constructing a new tree. Tests were performed on the synthetic dataset T10I4D100K, which contains 100,000 transactions and 870 items. Experimental results show that the smaller the minimum support threshold, the greater the speedup of the improved pattern tree over CanTree for all datasets. As the minimum support threshold increased from 2% to 3.5%, the runtime of the improved pattern tree decreased from 452.71 s to 186.26 s, while the runtime required by CanTree decreased from 1,367.03 s to 432.19 s. When the database was updated, the execution time of the improved pattern tree consisted of the construction of the original improved pattern tree and the reconstruction of the initial tree. In this setting, the runtime was about 15% shorter than that of CanTree. As the number of transactions increased, the runtime of the improved pattern tree was about 25% shorter than that of FP-tree. The improved pattern tree also required less memory than CanTree.
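
The two-phase idea in the abstract (store transactions in lexicographic order, then reorder each path by descending item frequency) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Node`, `insert`, `add_transactions`, `paths`, and `restructure` are hypothetical, and for simplicity the sketch rebuilds a second tree from the stored paths rather than sorting paths in place. It only shows why a new batch can be absorbed without rescanning earlier transactions.

```python
from collections import Counter


class Node:
    """Prefix-tree node: an item label, a support count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def insert(root, items, count=1):
    """Insert one ordered item sequence, adding `count` to every node on its path."""
    node = root
    for item in items:
        node = node.children.setdefault(item, Node(item))
        node.count += count


def add_transactions(root, freq, transactions):
    """Phase 1: store each transaction in lexicographic item order.

    The order does not depend on item frequencies, so later batches can be
    inserted without touching or rescanning earlier transactions.
    """
    for transaction in transactions:
        items = sorted(set(transaction))
        freq.update(items)
        insert(root, items)


def paths(node, prefix=()):
    """Yield (path, n) where n transactions end exactly at that path."""
    for child in node.children.values():
        path = prefix + (child.item,)
        ended_here = child.count - sum(c.count for c in child.children.values())
        if ended_here > 0:
            yield path, ended_here
        yield from paths(child, path)


def restructure(root, freq):
    """Phase 2 (simplified): re-sort every stored path by descending frequency.

    Assumption: ties are broken lexicographically; a new tree is built here
    instead of restructuring in place.
    """
    new_root = Node()
    for path, count in paths(root):
        ordered = sorted(path, key=lambda item: (-freq[item], item))
        insert(new_root, ordered, count)
    return new_root


# Initial database (toy data, not from the paper).
root, freq = Node(), Counter()
add_transactions(root, freq, [["a", "b"], ["b", "c"], ["b"], ["a", "c", "b"]])
restructured = restructure(root, freq)

# Incremental batch: only the new transactions are read.
add_transactions(root, freq, [["c"], ["c", "a"], ["c"]])
updated = restructure(root, freq)
```

In the toy data, `b` is the most frequent item after the first batch, so every restructured path starts at `b`; after the second batch `c` overtakes it and the restructured paths are reordered accordingly, while the lexicographic tree itself was only appended to.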

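Journal information
  • 《天津大学学报:英文版》 (Transactions of Tianjin University)
  • Supervising authority: Ministry of Education of the People's Republic of China
  • Sponsor: Tianjin University
  • Editor-in-chief: 龚克 (Gong Ke)
  • Address: East Annex, Teaching Building 19, Tianjin University, 92 Weijin Road, Nankai District, Tianjin
  • Postal code: 300072
  • Email: trans@tju.edu.cn
  • Telephone: 022-27400281
  • International standard serial number: ISSN 1006-4982
  • Domestic serial number: CN 12-1248/T
  • Postal distribution code: 6-128
  • Awards: Tianjin first-class journal; indexed by more than ten domestic and international indexing services
  • Indexed in: Referativnyi Zhurnal (Russia), Chemical Abstracts (online edition, USA), Scopus abstract and citation database (Netherlands), Engineering Index (USA), Royal Society of Chemistry abstracts (UK)
  • Citations: 153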