When new members join the development and maintenance of a software project, they usually need to spend a great deal of time understanding the structure and functionality of the system. To speed up this process, they are typically advised to focus first on the more important classes in the system. A large body of research has shown that software systems exhibit a clear complex-network topology, so a software system can be abstracted as a software network model, and its more important classes can be identified with network node importance measures, helping new members grasp the core structure and functionality quickly. Many node importance measures exist, but most consider only the degree of neighboring nodes or only the weight of edges. In addition, the h-index, a metric that has been used successfully to quantify the academic achievement of researchers, has rarely been applied to identifying important classes in software networks. Taking the Ant, Jung, and Maven projects as subjects, we build the corresponding weighted software network models and, combining node degree with edge weight, propose three h-index variants (H-NWD, A-NWD, and G-NWD) to measure the importance of classes in a software system. We compare them with five commonly used complex-network centrality measures: degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and PageRank centrality. The experimental results show that the important classes identified by the proposed H-NWD and G-NWD overlap with those identified by the existing measures by more than 80%, indicating that they identify important classes well. When class modification information is taken into account, the important class nodes identified jointly by H-NWD and degree centrality, eigenvector centrality, and PageRank centrality are ranked nearer the top, and the other class nodes that H-NWD identifies are modified more frequently; compared with the existing measures, H-NWD is therefore more accurate in identifying key classes.
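
The abstract does not define H-NWD, A-NWD, or G-NWD, so the following is only an illustrative sketch of the general idea of an h-index style score on a weighted class network, not the measures proposed in the paper. It assumes a hypothetical per-neighbor score equal to the edge weight multiplied by the neighbor's degree; the class names, the edge weights, and the helper names `build_graph`, `neighbor_scores`, and `rank_classes` are likewise made up for illustration.

```python
from collections import defaultdict


def h_index(values):
    """Return the largest h such that at least h of the values are >= h."""
    h = 0
    for i, v in enumerate(sorted(values, reverse=True), start=1):
        if v >= i:
            h = i
        else:
            break
    return h


def build_graph(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0) + w
        adj[v][u] = adj[v].get(u, 0) + w
    return adj


def neighbor_scores(adj, node):
    """Assumed combination (not from the paper): edge weight times neighbor degree."""
    return [w * len(adj[nbr]) for nbr, w in adj[node].items()]


def rank_classes(edges):
    """Score every class by the h-index of its neighbor scores, highest first."""
    adj = build_graph(edges)
    scores = {n: h_index(neighbor_scores(adj, n)) for n in adj}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical class-dependency edges: (class, class, coupling weight).
EDGES = [
    ("Project", "Task", 3),
    ("Project", "Target", 2),
    ("Project", "Path", 2),
    ("Project", "Main", 1),
    ("Task", "Target", 1),
]

if __name__ == "__main__":
    for name, score in rank_classes(EDGES):
        print(name, score)
```

Because a plain h-index takes only small integer values, ties between classes are common in a sketch like this; variants that refine the score are one way to break such ties.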