This paper examines source code homology detection (software plagiarism detection) techniques based on text, tokens, and abstract syntax trees (ASTs), and describes the AST-based technique in detail. Drawing on an extensive study of practical applications, it then presents the architecture of a source code homology detection system and its main functional modules: the comparison engine, comparison result analysis, and comparison result output. Finally, the implemented system is tested and analyzed, and the results verify the feasibility of the algorithm.