Building on a study of methods for extracting user access pattern rules from Web logs, this paper proposes a cloud-theory-based method for extracting qualitative rules from Web logs. The method analyses the time factors that influence a user's degree of interest, uses the cloud model to represent "soft thresholds" for support and confidence in association rule mining, and applies the cloud transform to partition the dwell time on each page into qualitative concepts, which overcomes the problem of overly hard boundaries. Compared with traditional approaches, the rules mined by this method take a qualitative, multi-condition, multi-rule form based on time concepts, and can flexibly reflect the regularities in Web users' access patterns.
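To make the idea of a cloud-based "soft threshold" concrete, the following is a minimal sketch of the standard forward normal cloud generator, in which a qualitative concept is described by an expectation Ex, an entropy En and a hyper-entropy He. It is not the paper's implementation: the function names, the dwell-time concepts and all numeric parameters are hypothetical, chosen only for illustration. Sampling the threshold from the cloud is one possible reading of "soft threshold", and the concepts here are set by hand rather than derived by a cloud transform.

```python
import math
import random


def sample_entropy(en, he, rng):
    """Draw a perturbed entropy En' ~ N(En, He^2), redrawing an exact zero."""
    en_prime = rng.gauss(en, he)
    while en_prime == 0.0:
        en_prime = rng.gauss(en, he)
    return abs(en_prime)


def normal_cloud_drops(ex, en, he, n, rng):
    """Forward normal cloud generator: return n (value, membership) drops."""
    drops = []
    for _ in range(n):
        en_prime = sample_entropy(en, he, rng)
        x = rng.gauss(ex, en_prime)
        mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops


def membership(x, ex, en, he, rng):
    """Random membership degree of a given value x in the concept (Ex, En, He)."""
    en_prime = sample_entropy(en, he, rng)
    return math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))


# Hypothetical dwell-time concepts in seconds: name -> (Ex, En, He).
# In the paper these would come from a cloud transform of the observed data.
DWELL_CONCEPTS = {
    "short": (5.0, 3.0, 0.3),
    "medium": (30.0, 10.0, 1.0),
    "long": (120.0, 40.0, 4.0),
}


def classify_dwell_time(seconds, rng, concepts=DWELL_CONCEPTS):
    """Assign a dwell time to the concept in which its membership is highest."""
    return max(
        concepts,
        key=lambda name: membership(seconds, *concepts[name], rng),
    )


def passes_soft_threshold(value, ex, en, he, rng):
    """Compare a support or confidence value with a threshold drawn from a cloud.

    Assumption: the threshold is a single cloud drop, so values near Ex may
    pass on one draw and fail on another, instead of facing a hard cut-off.
    """
    threshold, _ = normal_cloud_drops(ex, en, he, 1, rng)[0]
    return value >= threshold


if __name__ == "__main__":
    rng = random.Random(0)
    print(classify_dwell_time(28.0, rng))
    print(passes_soft_threshold(0.31, 0.30, 0.02, 0.002, rng))
```

Because the entropy is itself perturbed by He, a dwell time lying between two concepts, or a support value close to Ex, can be assigned differently on different draws, which is the softened boundary the abstract refers to.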