Nearest neighbor search is widely used in classification problems because of its high classification accuracy and good generalization performance. However, existing nearest neighbor classification algorithms share a weakness: the computational cost of classification grows significantly as the sample set grows. To overcome this shortcoming, this paper proposes an improved nearest neighbor classification method based on a new idea. When classifying a query point, the method does not necessarily find the nearest neighbor itself; it only needs to determine the class label of the nearest neighbor, which greatly reduces the computation required by the nearest neighbor search. The method is tested on the classic two spirals problem (TSP) and several other examples with respect to three aspects: classification ability, classification speed, and learning performance. It is also compared with the classic k-d tree (k-dimensional binary tree) nearest neighbor search method and the condensed nearest neighbor rule. The results show that, in terms of overall performance, the proposed method is competitive.
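The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of the underlying idea, not the paper's method. It assumes one simple way to learn the nearest neighbor's label without locating the nearest neighbor: precompute, for every sample, the distance to its closest sample of a different class, and stop the scan as soon as the query falls within half that distance of any sample. All function names (`two_spirals`, `enemy_distances`, `label_only`) and the spiral parameters are hypothetical.

```python
import math
import random


def two_spirals(n_per_class, turns=2.0, noise=0.0, seed=0):
    """Generate 2-D points on two interleaved spirals with labels 0 and 1."""
    rng = random.Random(seed)
    points, labels = [], []
    for i in range(n_per_class):
        t = (i + 1) / n_per_class              # radius, in (0, 1]
        angle = 2.0 * math.pi * turns * t
        for label in (0, 1):
            a = angle + math.pi * label        # second spiral is rotated by 180 degrees
            x = t * math.cos(a) + rng.gauss(0.0, noise)
            y = t * math.sin(a) + rng.gauss(0.0, noise)
            points.append((x, y))
            labels.append(label)
    return points, labels


def brute_force_label(query, points, labels):
    """Baseline: scan every sample and return the label of the exact nearest neighbor."""
    best = min(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return labels[best]


def enemy_distances(points, labels):
    """For each sample, the distance to the closest sample of a different class."""
    return [
        min(math.dist(p, q) for q, lq in zip(points, labels) if lq != lp)
        for p, lp in zip(points, labels)
    ]


def label_only(query, points, labels, enemy):
    """Return (label of the nearest neighbor, number of samples inspected).

    Early exit at sample p when dist(query, p) < enemy[p] / 2. By the triangle
    inequality the true nearest neighbor n satisfies
    dist(p, n) <= 2 * dist(query, p) < enemy[p], so n cannot belong to a
    different class than p, even though n itself is never identified.
    """
    best_i, best_d = -1, math.inf
    for i, p in enumerate(points):
        d = math.dist(query, p)
        if d < enemy[i] / 2.0:
            return labels[i], i + 1
        if d < best_d:
            best_i, best_d = i, d
    # No early exit: fall back to the exact nearest neighbor found by the full scan.
    return labels[best_i], len(points)


if __name__ == "__main__":
    pts, lbs = two_spirals(200)
    enemy = enemy_distances(pts, lbs)
    label, inspected = label_only((0.3, 0.1), pts, lbs, enemy)
    print(label, inspected, "of", len(pts), "samples inspected")
```

The early exit never changes the predicted class relative to the brute-force scan; how much work it saves depends on the data and on the order in which samples are visited, and the precomputation in `enemy_distances` is quadratic in this naive form.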