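Frequent Pattern Mining over Uncertain Data Streams Based on a Probability Decay Window Model
  • Journal: 计算机研究与发展 (Journal of Computer Research and Development)
  • Time: 0
  • Pages: 4157-4163
  • Language: Chinese
  • Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliations: [1] School of Information Management, Jiangxi University of Finance and Economics, Nanchang 330032; [2] Jiangxi Provincial Key Laboratory of Data and Knowledge Engineering for Universities, Nanchang 330032; [3] Jiangxi Ganfu Plain Water Conservancy Project Administration, Nanchang 330201
  • Funding: National Natural Science Foundation of China (60863016); Natural Science Foundation of Jiangxi Province (2008GQS0019); Key Science and Technology Fund of the Jiangxi Provincial Department of Education (GJJ10694, GJJ12259); Youth Science Fund of the Jiangxi Provincial Department of Education (GJJ10119); Jiangxi Province Advantageous Science and Technology Innovation Team Construction Program (20113BCB24008)
  • Related project: Adaptive and dynamic recovery strategies for real-time databases in embedded mobile computing environments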
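Abstract (translated from Chinese):

To account for the uncertainty of uncertain data streams, this paper designs a new probability frequent pattern tree, the PFP-tree, and a probability frequent pattern mining method based on it, PFP-growth. PFP-growth works on transactional uncertain data streams under a probability decay window model and discovers probability frequent patterns by computing the expected support of each probabilistic data item. Its main features are: because data items arriving at different times within a window contribute differently, expected support is computed under a probability decay window model to improve mining accuracy; an item index table and a transaction index table are maintained to speed up retrieval on the frequent pattern tree; nodes that cannot become frequent patterns are pruned to reduce the storage and retrieval overhead of the pattern tree; and every node keeps a linked list of transaction probability information to support data items that have different probabilities in different transactions. Experimental results show that PFP-growth performs well in processing time and memory usage while preserving the accuracy of the mined patterns.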
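The abstract does not give the decay function, so the following is only a minimal sketch of how an expected support under a time-based decay window could be computed. It assumes an exponential decay factor (`decay`), a window stored as `(arrival_time, {item: probability})` pairs, and independent item probabilities within a transaction. The function name and all parameters are hypothetical and are not taken from the paper.

```python
from math import prod


def decayed_expected_support(itemset, window, now, decay=0.9):
    """Sum, over the transactions in the window, of the probability that the
    transaction contains every item in `itemset`, weighted by a decay that
    shrinks with the transaction's age.

    Assumptions (not from the paper): exponential decay `decay ** age`,
    and independent item probabilities inside one transaction.
    """
    total = 0.0
    for arrival, transaction in window:
        if all(item in transaction for item in itemset):
            # Probability that all items of the itemset occur together.
            joint = prod(transaction[item] for item in itemset)
            # Older transactions contribute less to the expected support.
            total += (decay ** (now - arrival)) * joint
    return total


# Hypothetical window: three transactions arriving at times 0, 1 and 2.
example_window = [
    (0, {"a": 0.5, "b": 0.8}),
    (1, {"a": 1.0}),
    (2, {"a": 0.5, "b": 0.5}),
]
print(decayed_expected_support({"a", "b"}, example_window, now=2, decay=0.5))
```

A pattern would then be reported as frequent when this value reaches a chosen minimum expected support. With `decay=1.0` the sketch reduces to the ordinary (undecayed) expected support.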

English abstract:

In recent years, large amounts of uncertain data have emerged from the wide use of new technologies such as wireless sensor networks and radio frequency identification. To account for the uncertainty of uncertain data streams, this paper proposes a new probability frequent pattern tree, the PFP-tree, and a probability frequent pattern mining method, PFP-growth. PFP-growth uses a transactional uncertain data stream model and a time-based probability decay window model to find probability frequent patterns by calculating expected supports. The main characteristics of PFP-growth are: 1) because items arriving at different times within a window may contribute differently to the expected supports, a time-based probability decay window model is used to improve mining precision; 2) an item index table and a transaction index table are designed to speed up retrieval on the PFP-tree; 3) a pruning algorithm deletes nodes that cannot become frequent patterns, which greatly reduces both time and space overhead; 4) a transaction probability list is kept for every node, so that a data item may have different probabilities in different transactions. Experimental results show that PFP-growth not only achieves a higher mining precision but also needs less processing time and storage space than existing methods.
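The abstract describes the PFP-tree only at a high level. As an illustration of characteristics 2) and 4), the sketch below shows one way a prefix-tree node could carry a per-transaction probability list while an item index table points at every node holding a given item. The class, the function, and the choice to order items by name are hypothetical and are not the paper's actual data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    item: Optional[str]
    children: Dict[str, "Node"] = field(default_factory=dict)
    # One (transaction id, probability) entry per transaction passing through
    # this node, so the same item can carry different probabilities.
    tx_probs: List[Tuple[int, float]] = field(default_factory=list)


def insert_transaction(root, tid, transaction, item_index):
    """Insert one uncertain transaction ({item: probability}) into the tree.

    Items are ordered by name here purely for simplicity (an assumption);
    `item_index` maps each item to the list of nodes that hold it.
    """
    node = root
    for item, prob in sorted(transaction.items()):
        child = node.children.get(item)
        if child is None:
            child = Node(item)
            node.children[item] = child
            item_index.setdefault(item, []).append(child)
        child.tx_probs.append((tid, prob))
        node = child


tree_root = Node(item=None)
index_table: Dict[str, List[Node]] = {}
insert_transaction(tree_root, 1, {"a": 0.9, "b": 0.4}, index_table)
insert_transaction(tree_root, 2, {"a": 0.5, "c": 0.7}, index_table)
```

Both transactions share the node for `a`, which records a separate probability for each of them. Looking up `index_table["a"]` reaches that node directly without walking the tree from the root.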
