This paper proposes MSP, a multidimensional sequential pattern mining algorithm that combines maximal frequent patterns, pattern similarity, and attribute description. The algorithm has three steps. First, it mines the maximal frequent sequential patterns in the data set, and each such pattern becomes a pattern class. Second, it compares the item sequence of each record with each pattern class, testing for containment and similarity. Third, it extracts the attributes associated with each pattern class according to a set of rules and reports the mining result as multidimensional sequence rules, with attributes as the antecedent and a pattern class as the consequent. Analysis of the algorithm shows that it is effective and scales well.
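The three steps can be illustrated with a minimal Python sketch. It is not the MSP implementation: the abstract does not specify the mining procedure, the similarity measure, or the attribute extraction rules, so the sketch substitutes simple stand-ins. It assumes sequences of single items, a level-wise enumeration for step 1, a similarity defined as the longest common subsequence length divided by the pattern length for step 2, and a plain confidence threshold for step 3. All function names and the parameters `min_support`, `min_similarity`, and `min_confidence` are hypothetical.

```python
from collections import Counter


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def maximal_frequent_patterns(sequences, min_support):
    """Step 1 (assumed level-wise search): frequent subsequences, maximal only."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    items = sorted({x for s in sequences for x in s})

    def support(p):
        return sum(is_subsequence(p, s) for s in sequences)

    level = [(x,) for x in items if support((x,)) >= min_support]
    frequent = []
    while level:
        frequent.extend(level)
        # Grow every frequent pattern by one appended item.
        level = [p + (x,) for p in level for x in items
                 if support(p + (x,)) >= min_support]
    # A pattern is maximal if no other frequent pattern contains it.
    return [p for p in frequent
            if not any(p != q and is_subsequence(p, q) for q in frequent)]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def assign_classes(sequences, patterns, min_similarity):
    """Step 2 (assumed measure): best pattern class per sequence, or None."""
    def similarity(p, s):
        return lcs_length(p, s) / len(p)  # 1.0 means s contains p

    labels = []
    for s in sequences:
        best = max(patterns, key=lambda p: similarity(p, s), default=None)
        ok = best is not None and similarity(best, s) >= min_similarity
        labels.append(best if ok else None)
    return labels


def extract_rules(attributes, labels, min_confidence):
    """Step 3 (assumed rule): (attribute, value) -> pattern class, by confidence."""
    value_counts, joint_counts = Counter(), Counter()
    for attrs, label in zip(attributes, labels):
        for pair in attrs.items():
            value_counts[pair] += 1
            if label is not None:
                joint_counts[pair, label] += 1
    return {key: n / value_counts[key[0]] for key, n in joint_counts.items()
            if n / value_counts[key[0]] >= min_confidence}


def msp_sketch(records, min_support, min_similarity, min_confidence):
    """Run the three steps on records of (attribute dict, item sequence)."""
    attributes = [attrs for attrs, _ in records]
    sequences = [seq for _, seq in records]
    patterns = maximal_frequent_patterns(sequences, min_support)
    labels = assign_classes(sequences, patterns, min_similarity)
    return patterns, labels, extract_rules(attributes, labels, min_confidence)
```

Each record pairs an attribute dictionary with an item sequence, for example `({"group": "student"}, ("a", "b", "c"))`. The returned rules map an `((attribute, value), pattern_class)` pair to its confidence, which mirrors the attribute-antecedent, pattern-consequent rule form described above. The level-wise search is exponential in the worst case and is chosen only for brevity.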