Based on the hierarchical structure of the International Patent Classification (IPC) and on the category descriptions that the IPC itself provides, this paper constructs category feature vectors at the different levels of the hierarchy. These vectors are revised through training on existing patents, and the classical KNN algorithm is then applied at each level to classify patents automatically. Experimental results show that the method performs well at the section, class, and subclass levels, and that classification performance improves after revision training.
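The pipeline summarized above can be illustrated with a minimal sketch. It is not the paper's implementation. The term-frequency weighting, the cosine similarity, the additive revision rule with its `weight` parameter, the voting scheme, and the toy descriptions and patents are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
from collections import Counter

# IPC code prefix lengths: section "H", class "H01", subclass "H01L".
LEVEL_LEN = {"section": 1, "class": 3, "subclass": 4}

def vectorize(text):
    """Term-frequency vector over lowercased whitespace tokens (assumed weighting)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_vectors(descriptions):
    """One category vector per IPC entry, built from its own description."""
    return {code: vectorize(text) for code, text in descriptions.items()}

def revise(vectors, patents, weight=0.5):
    """Assumed revision rule: fold the terms of each labelled patent into the
    vector of its subclass and of every ancestor entry that exists."""
    revised = {code: Counter(vec) for code, vec in vectors.items()}
    for code, text in patents:
        for n in LEVEL_LEN.values():
            target = revised.get(code[:n])
            if target is not None:
                for term, count in vectorize(text).items():
                    target[term] += weight * count
    return revised

def classify(vectors, text, level, k=3):
    """KNN at one level: the k most similar category vectors at or below
    `level` cast similarity-weighted votes for their prefix at `level`."""
    n = LEVEL_LEN[level]
    query = vectorize(text)
    scored = [(cosine(query, vec), code[:n])
              for code, vec in vectors.items() if len(code) >= n]
    votes = Counter()
    for sim, label in sorted(scored, reverse=True)[:k]:
        votes[label] += sim
    return votes.most_common(1)[0][0]

# Toy data: abbreviated, illustrative descriptions and two made-up patents.
descriptions = {
    "A": "human necessities",
    "A01": "agriculture forestry animal husbandry",
    "A01B": "soil working in agriculture or forestry",
    "H": "electricity",
    "H01": "electric elements",
    "H01L": "semiconductor devices",
}
patents = [
    ("H01L", "semiconductor wafer with transistor gate"),
    ("A01B", "plough blade for soil tillage"),
]
model = revise(build_vectors(descriptions), patents)
for level in LEVEL_LEN:
    print(level, classify(model, "transistor gate on a wafer", level))
```

In this toy setting the query shares no terms with the raw IPC descriptions, so only the revised vectors produce a nonzero similarity. That mirrors, in miniature, why training on existing patents could be expected to help.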