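This paper first reviews the current state of research on granular computing and two shortcomings of the original covering algorithm: the high probability that test samples are rejected during recognition, and the problem of determining the class of a test sample when the resulting covers overlap. It then applies granular computing theory based on quotient space to improve and optimize the covering algorithm with respect to the first shortcoming, namely by processing the rejected samples a second time. By changing the granularity at which the problem is handled, so that the covering granularity moves from coarse to fine, the rejected samples are recognized progressively; in the finer space the number of rejected samples falls and the recognition rate rises. Finally, the optimized covering algorithm is applied to a Chinese text database that has already been preprocessed. The experimental results show that the optimized method reduces the number of rejected test samples, lowers the recognition error rate, and effectively improves the accuracy of the results.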
The author analyzes two shortcomings of the covering algorithm: the high rate of rejected samples, and the question of which class a sample belongs to when it falls in the intersection of several covers. The author then applies granular computing theory based on quotient space to improve and optimize the covering algorithm with respect to the first shortcoming, that is, to classify the rejected samples of the covering algorithm a second time. As the granularity used to classify the samples changes from coarse to fine, the rejected samples are classified gradually, and the classification accuracy improves because fewer samples are rejected at the finer granularity. The optimized covering algorithm is applied to a Chinese text database that has been segmented into words. The computer experiments show that this method reduces the number of rejected samples and improves the accuracy on the test samples by decreasing the test error rate.
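The coarse-to-fine handling of rejected samples can be illustrated with a minimal sketch. It assumes that each cover is a sphere given by a center, a radius, and a class label, and that a sample is rejected when no cover contains it. The names `covering_label` and `classify_coarse_to_fine`, the toy covers, and the nearest-center rule for overlapping covers are hypothetical choices made for illustration only; the abstract does not specify how the finer covers are constructed, and the overlap problem is the second shortcoming, which is not addressed here.

```python
import math

def covering_label(x, covers):
    """Return the label of the nearest cover that contains x, or None if x is rejected.

    Each cover is a (center, radius, label) tuple; the nearest-center rule for
    overlapping covers is a placeholder assumption, not the paper's method.
    """
    best = None
    for center, radius, label in covers:
        d = math.dist(x, center)
        if d <= radius and (best is None or d < best[0]):
            best = (d, label)
    return None if best is None else best[1]

def classify_coarse_to_fine(x, levels):
    """Try each granularity level in order, coarsest first.

    A sample rejected at one level is passed to the next, finer level;
    it stays rejected only if every level rejects it.
    Returns (label, level_index), or (None, None) for a final rejection.
    """
    for depth, covers in enumerate(levels):
        label = covering_label(x, covers)
        if label is not None:
            return label, depth
    return None, None

# Toy data (assumed): one coarse cover set and one finer set for second-pass processing.
coarse = [((0.0, 0.0), 1.0, "A"), ((5.0, 5.0), 1.0, "B")]
fine = [((2.0, 0.0), 0.8, "A"), ((3.5, 4.0), 0.9, "B")]
levels = [coarse, fine]
```

In this toy setting, a sample such as `(2.3, 0.1)` lies outside both coarse covers and would be rejected by a single pass, but it is picked up by a cover at the finer level, which is the effect the abstract describes as a reduction in rejected samples.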