Lazy associative classification (LAC) mines class association rules from the features of each test instance and usually achieves high accuracy. However, its accuracy is highly sensitive to the low-quality features present in some data sets, and it typically takes a long time to classify all test instances. To address these problems, this paper proposes lazy associative classification based on information entropy (ELAC). ELAC uses information entropy to measure the quality of attribute values and selects only the best k attribute values of each test instance. This yields a small training subset that is highly relevant to the test instance, from which high-quality rules are mined efficiently. Experiments show that, compared with LAC, ELAC improves classification accuracy and significantly reduces running time.
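The selection step can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: the quality of an attribute value is taken to be the entropy of the class distribution among the training instances that contain it (lower is better), and the training subset keeps every instance that shares at least one selected value with the test instance. The function names `value_entropy`, `select_best_k`, and `project`, and the toy data, are hypothetical. Rule mining on the resulting subset is omitted.

```python
import math
from collections import Counter


def value_entropy(train, attr, value):
    """Entropy (in bits) of the class labels of training rows where row[attr] == value.

    Assumption: this is the quality measure; the abstract does not give the exact formula.
    """
    labels = [label for row, label in train if row.get(attr) == value]
    if not labels:
        # Value never seen in training: treat it as the least informative.
        return float("inf")
    total = len(labels)
    return sum((c / total) * math.log2(total / c) for c in Counter(labels).values())


def select_best_k(train, instance, k):
    """Keep the k attribute values of the test instance with the lowest entropy."""
    ranked = sorted(
        instance.items(),
        key=lambda item: value_entropy(train, item[0], item[1]),
    )
    return dict(ranked[:k])


def project(train, selected):
    """Training subset: rows sharing at least one selected attribute value (assumed rule)."""
    return [
        (row, label)
        for row, label in train
        if any(row.get(attr) == value for attr, value in selected.items())
    ]


# Toy data: each training example is (attribute dict, class label).
train = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rain", "windy": "yes"}, "stay"),
    ({"outlook": "rain", "windy": "no"}, "play"),
]
instance = {"outlook": "sunny", "windy": "yes"}

selected = select_best_k(train, instance, k=1)
subset = project(train, selected)
print(selected, len(subset))
```

With k smaller than the number of attributes, the subset passed to the rule miner shrinks, which is where the abstract's claimed reduction in running time would come from. How to choose k is not stated in the abstract and is left open here.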