Drawing on the characteristics of Chinese patent claims, this paper proposes an algorithm for identifying similar Chinese patents based on SAO (Subject-Action-Object) structures. First, dependency parsing and semantic role labeling are applied to the claims to extract SAO structures. Next, the similarity between SAO structures is calculated and used to derive the similarity between sets of claims; multidimensional scaling (MDS) and cluster analysis are then applied to the results to judge the similarity between patents. Finally, the method is applied to patent invalidation, where it achieves good results. Further work is needed to improve the accuracy of SAO structure extraction and the effectiveness of the method in practical applications.
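The step from SAO-level similarity to claims-level similarity can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulas, so everything here is an assumption made for illustration: the character-set Jaccard measure in `term_similarity`, the unweighted mean in `sao_similarity`, the symmetric best-match average in `claims_similarity`, and all function names are hypothetical stand-ins, not the authors' method.

```python
# Illustrative sketch only: the similarity measures below are assumptions,
# not the formulas used in the paper.

# An SAO structure is modeled as a (subject, action, object) triple of strings.
SAO = tuple[str, str, str]


def term_similarity(a: str, b: str) -> float:
    """Jaccard overlap of character sets; a stand-in for a semantic measure."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def sao_similarity(x: SAO, y: SAO) -> float:
    """Unweighted mean of subject, action, and object similarities (assumed)."""
    return sum(term_similarity(p, q) for p, q in zip(x, y)) / 3


def claims_similarity(c1: list[SAO], c2: list[SAO]) -> float:
    """Symmetric average of each SAO's best match in the other claim set."""
    if not c1 or not c2:
        return 0.0
    best1 = [max(sao_similarity(x, y) for y in c2) for x in c1]
    best2 = [max(sao_similarity(x, y) for x in c1) for y in c2]
    return (sum(best1) + sum(best2)) / (len(best1) + len(best2))


def distance_matrix(patents: list[list[SAO]]) -> list[list[float]]:
    """Pairwise dissimilarity (1 - similarity) between patents' SAO sets."""
    n = len(patents)
    return [
        [1.0 - claims_similarity(patents[i], patents[j]) for j in range(n)]
        for i in range(n)
    ]
```

A dissimilarity matrix of this kind is the usual input to the MDS and clustering step; for example, scikit-learn's `sklearn.manifold.MDS` accepts one when constructed with `dissimilarity="precomputed"`. Whether the paper used that or another tool is not stated in the abstract.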