In recent years, as code reuse techniques have matured and open source projects on the Internet have multiplied, the way software developers work has gradually changed. Developers now rely increasingly on the functionality provided by open source projects when they program. In practice, however, incomplete documentation and complex code structure mean that developers often gain only a partial understanding of a project's functions, which lowers the efficiency of reuse. To address the situation in which open source projects are rich in code but poor in documentation, a code function recognition approach based on LDA (Latent Dirichlet Allocation) and static code analysis is proposed. The approach extends traditional LDA to help developers understand a project's functions more comprehensively and thereby better support code reuse.