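Scientific citation networks reflect the dynamic evolution of scientific knowledge and have been widely studied as complex network systems. To address the problems that topics in a citation network are often unclear and that hot topics are hard to track, this paper proposes a method for computing opinion assessments in a citation network and a method for extracting and representing the topics of network communities. First, metadata are extracted using regular expressions and template matching. Next, each author's opinion assessment of the cited references is computed, a citation network carrying opinion-assessment weights is built, and the emergent semantics of the network are described. Then, on the basis of this network structure, the TF-IDF algorithm is improved by combining information entropy with the importance weights of the documents in the network, which yields a keyword probability description for each community topic and hence the community topics themselves. The methods and experiments of this paper offer a useful reference for explaining the evolution of citation networks, discovering community topics, and sharing literature.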
Citation networks exhibit the dynamic evolution of scientific knowledge and have been studied extensively as complex network systems. In a citation network, topics are often unclear and hot topics are difficult to track. To resolve these problems, we propose a method for calculating a public opinion assessment index for citations, together with a method for extracting and expressing the topics of network communities. First, we extract metadata using regular expressions and pattern matching. Second, we calculate an assessment index of the citing authors' opinions of the referenced literature, establish a citation network weighted by these opinion values, and describe the emergent semantics of the network. Finally, on the basis of this network structure, we improve the TF-IDF algorithm by combining information entropy with the importance weights of the documents, and calculate the probability of each topic keyword in each community to obtain that community's topic. Experimental results show that the method is of value for interpreting the evolution of citation networks, discovering community topics, and sharing literature.
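The final step, scoring community keywords with a TF-IDF-style weighting that also uses information entropy and document importance, can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything below is an assumption made for illustration: the function name `community_keyword_probs`, the way the three factors are multiplied together, the smoothed IDF, and the use of normalized entropy as a "focus" factor are hypothetical choices, not the authors' method.

```python
import math
from collections import Counter, defaultdict


def community_keyword_probs(docs, doc_weight, community_of):
    """Return {community: {term: probability}} for a toy corpus.

    docs:         {doc_id: [tokens]}
    doc_weight:   {doc_id: importance weight of the document in the network}
    community_of: {doc_id: community label}
    All names and the scoring formula are illustrative assumptions.
    """
    # Term frequency per community, where each occurrence counts
    # as much as the importance weight of the document it appears in.
    tf = defaultdict(Counter)
    for doc_id, tokens in docs.items():
        for term in tokens:
            tf[community_of[doc_id]][term] += doc_weight[doc_id]

    communities = list(tf)
    n = len(communities)

    # Number of communities in which each term occurs.
    df = Counter()
    for c in communities:
        for term in tf[c]:
            df[term] += 1

    max_entropy = math.log(n) if n > 1 else 1.0
    probs = {}
    for c in communities:
        raw = {}
        for term, freq in tf[c].items():
            total = sum(tf[k][term] for k in communities)
            # Entropy of the term's distribution over communities:
            # low entropy means the term is concentrated in few communities.
            entropy = -sum(
                (tf[k][term] / total) * math.log(tf[k][term] / total)
                for k in communities
                if tf[k][term] > 0
            )
            focus = 1.0 - entropy / max_entropy
            # Smoothed inverse document frequency at the community level.
            idf = math.log((1 + n) / (1 + df[term])) + 1.0
            raw[term] = freq * idf * focus
        norm = sum(raw.values())
        # Normalize the scores into a probability description of the topic.
        probs[c] = {t: (v / norm if norm > 0 else 0.0) for t, v in raw.items()}
    return probs


if __name__ == "__main__":
    docs = {
        "d1": ["network", "community", "topic"],
        "d2": ["network", "citation"],
        "d3": ["network", "entropy"],
    }
    doc_weight = {"d1": 2.0, "d2": 1.0, "d3": 1.0}
    community_of = {"d1": "A", "d2": "A", "d3": "B"}
    for community, dist in community_keyword_probs(docs, doc_weight, community_of).items():
        top = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)
        print(community, top)
```

In this sketch a term that appears in every community, such as "network" in the toy data, receives a low focus factor and therefore a low probability, while terms concentrated in one community dominate that community's topic description. The document weights would in practice come from the opinion-weighted citation network; here they are supplied by hand.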