Concept-lattice-based mining of large-scale data and rules suffers from an exponential explosion in the number of concept nodes. To address this problem, this paper introduces a concept coverage density function and a concept lattice measurement model, which are used to reduce the lattice by removing redundant concepts. The reduced lattice, called the marked-concept lattice, has linear space complexity. Three reduction algorithms are given: direct construction, synchronous construction, and node extraction. Time and space complexity analysis and simulation experiments show that the proposed method greatly reduces the size of the concept lattice and thereby significantly improves the efficiency of lattice construction and rule mining. Marked concepts also carry meaningful information of their own, which makes them well suited to Web service relationship mining.
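The exponential growth in concept nodes that motivates the reduction can be illustrated on a small formal context. The following is a minimal sketch and does not reproduce the paper's coverage function or any of its three algorithms. It counts formal concepts by brute force on a standard worst-case context, in which object i has every attribute except attribute i. The helper names `count_concepts` and `contranominal` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def count_concepts(context, attributes):
    """Count formal concepts by collecting the distinct closed attribute sets (intents).

    `context` maps each object to the set of attributes it has.
    Brute force over all attribute subsets, so only suitable for tiny contexts.
    """
    intents = set()
    attrs = sorted(attributes)
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            b = set(subset)
            # Extent: all objects that have every attribute in b.
            extent = [g for g, has in context.items() if b <= has]
            # Intent: attributes shared by all objects in the extent
            # (all attributes when the extent is empty).
            intent = set(attributes)
            for g in extent:
                intent &= context[g]
            intents.add(frozenset(intent))
    return len(intents)


def contranominal(n):
    """Illustrative worst-case context: object i has every attribute except i."""
    return {i: {j for j in range(n) if j != i} for i in range(n)}


if __name__ == "__main__":
    for n in range(1, 7):
        # The concept count doubles with each added object/attribute pair.
        print(n, count_concepts(contranominal(n), set(range(n))))
```

In this context every attribute subset is closed, so n objects and n attributes yield 2^n concepts. This is the kind of growth that a lattice with linear space complexity is meant to avoid.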