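Frequent patterns are patterns that occur frequently in a data set, and they play a very important role in data mining. For the stellar spectrum classification task, a classification rule mining method based on a classification pattern tree is proposed, building on frequent patterns. First, because the attributes of stellar spectra occur in the database with different frequencies and therefore differ in their importance for classification, a new tree structure, the classification pattern tree, is proposed, together with the related concepts and its construction method, SSCPTC. Then the feature information of the stellar spectra is mapped onto the classification pattern tree, and classification rules are extracted by traversing the tree with a combination of top-down and bottom-up modes; the concept of pattern capability is also introduced to adjust the number of classification rules and to improve the efficiency of constructing the tree. Finally, SDSS stellar spectra provided by the National Astronomical Observatory are used as experimental data to verify the correctness of the method, which also achieves relatively high classification accuracy.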
Frequent patterns, which appear frequently in a data set, play an important role in data mining. For the stellar spectrum classification task, a classification rule mining method based on a classification pattern tree is presented on the basis of frequent patterns. The procedure is as follows. Firstly, a new tree structure, i.e., the classification pattern tree, is introduced, motivated by the fact that stellar spectral attributes occur in the database with different frequencies and therefore differ in their importance for classification. The related concepts and the construction method of the classification pattern tree are also described in this paper. Then, the characteristics of the stellar spectra are mapped onto the classification pattern tree. Two traversal modes, top-down and bottom-up, are combined to traverse the classification pattern tree and extract the classification rules. Meanwhile, the concept of pattern capability is introduced to adjust the number of classification rules and improve the construction efficiency of the classification pattern tree. Finally, the SDSS (Sloan Digital Sky Survey) stellar spectral data provided by the National Astronomical Observatory are used to verify the correctness of the method. The results show that the method achieves high classification accuracy.
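The general idea of a frequency-ordered pattern tree carrying class information can be illustrated with a minimal sketch. This is not the paper's construction method: the abstract does not give the tree definition, the traversal details, or the definition of pattern capability. The sketch assumes an FP-tree-style prefix tree in which each node stores class counts, performs only a simple top-down walk, and uses plain support and confidence thresholds as a stand-in filter. All names (`Node`, `build_tree`, `extract_rules`, `min_support`, `min_confidence`) and the toy attribute labels are hypothetical.

```python
from collections import Counter


class Node:
    """One attribute value on a path, with counts of the classes passing through it."""

    def __init__(self, item=None):
        self.item = item
        self.children = {}
        self.class_counts = Counter()


def build_tree(records, min_support=2):
    """Build a prefix tree from (set_of_items, class_label) records.

    Items are ordered by descending global frequency (ties broken by name),
    so that frequently occurring attributes sit closer to the root.
    Items occurring fewer than min_support times are dropped.
    """
    freq = Counter(item for items, _ in records for item in items)
    root = Node()
    for items, label in records:
        kept = sorted(
            (i for i in items if freq[i] >= min_support),
            key=lambda i: (-freq[i], i),
        )
        node = root
        for item in kept:
            node = node.children.setdefault(item, Node(item))
            node.class_counts[label] += 1
    return root


def extract_rules(root, min_support=2, min_confidence=0.8):
    """Walk the tree top-down and emit (path, label, support, confidence) rules.

    A path becomes a rule when enough records pass through its last node and
    a single class dominates them. This threshold filter is an assumption
    made for this sketch, not the paper's pattern capability measure.
    """
    rules = []
    stack = [(root, ())]
    while stack:
        node, prefix = stack.pop()
        for item, child in node.children.items():
            path = prefix + (item,)
            total = sum(child.class_counts.values())
            label, hits = child.class_counts.most_common(1)[0]
            if total >= min_support and hits / total >= min_confidence:
                rules.append((path, label, total, hits / total))
            stack.append((child, path))
    return sorted(rules)


# Toy, made-up records: discretized spectral features and a class label.
records = [
    ({"H_strong", "Ca_weak"}, "A"),
    ({"H_strong", "Ca_weak"}, "A"),
    ({"H_strong", "Ca_strong"}, "F"),
    ({"Ca_strong", "TiO"}, "M"),
    ({"Ca_strong", "TiO"}, "M"),
]
for path, label, support, confidence in extract_rules(build_tree(records)):
    print(" & ".join(path), "=>", label, f"(support={support}, confidence={confidence:.2f})")
```

Raising `min_support` or `min_confidence` in this sketch shrinks the rule set, which is loosely analogous to the role the abstract assigns to pattern capability in controlling the number of rules.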