Traditional co-citation analysis ignores where co-cited references appear in the citing article: any two references cited by the same article are treated equally, regardless of the distance between their citations. However, the strength of a co-citation relationship varies with the positions at which the two references are cited. Based on those positions, this article distinguishes four levels of co-citation (sentence level, paragraph level, section level, and article level) and proposes a co-citation weighting method based on the similarity of citation contexts. Combining citation position with citation content makes the co-citation weights more objective and accurate. The weight of each level is estimated from the target dataset according to the similarity of the co-cited contexts. Using three BMC journals as the target dataset, we obtained weights of 1, 0.77, 0.64, and 0.56 for the sentence, paragraph, section, and article levels, respectively. We then compared weighted co-citation clustering with traditional co-citation clustering. With the weights applied, the documents within a cluster are more closely related and the clusters better reveal the topics expressed by the citing articles, which supports the effectiveness of the weighted co-citation analysis method.
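As a rough illustration of how level weights of this kind might be applied, the following Python sketch accumulates weighted co-citation strengths from citation positions. It uses only the four weights reported above. Everything else is an assumption made for the example and is not taken from the article: the position encoding as (section, paragraph, sentence) indices, the rule that a pair co-cited several times in one article counts once at its closest level, the summation across citing articles, and the function names `closest_level` and `weighted_cocitation`.

```python
from collections import defaultdict
from itertools import combinations

# Level weights reported in the abstract for the three BMC journals.
LEVEL_WEIGHTS = {"sentence": 1.0, "paragraph": 0.77, "section": 0.64, "article": 0.56}


def closest_level(pos_a, pos_b):
    """Return the tightest level shared by two citation positions.

    Assumed encoding: each position is a (section, paragraph, sentence)
    tuple, with paragraphs numbered within their section and sentences
    numbered within their paragraph.
    """
    if pos_a == pos_b:
        return "sentence"
    if pos_a[:2] == pos_b[:2]:
        return "paragraph"
    if pos_a[0] == pos_b[0]:
        return "section"
    return "article"


def weighted_cocitation(articles):
    """Sum level weights over citing articles for every pair of references.

    `articles` is an iterable of citation lists; each citation is a
    (ref_id, position) pair. Within one article a pair contributes only
    the weight of its closest co-occurrence (an assumption of this sketch).
    """
    totals = defaultdict(float)
    for citations in articles:
        best = {}
        for (ref_a, pos_a), (ref_b, pos_b) in combinations(citations, 2):
            if ref_a == ref_b:
                continue  # the same reference cited twice is not a co-citation
            pair = tuple(sorted((ref_a, ref_b)))
            weight = LEVEL_WEIGHTS[closest_level(pos_a, pos_b)]
            best[pair] = max(best.get(pair, 0.0), weight)
        for pair, weight in best.items():
            totals[pair] += weight
    return dict(totals)


# Hypothetical toy data: two citing articles with four references in total.
example_articles = [
    [("A", (0, 0, 0)), ("B", (0, 0, 0)), ("C", (0, 1, 2)), ("D", (2, 0, 0))],
    [("A", (1, 0, 0)), ("B", (1, 0, 1))],
]
strengths = weighted_cocitation(example_articles)
print(round(strengths[("A", "B")], 2))  # sentence level (1.0) + paragraph level (0.77)
```

In a traditional co-citation count, the pair A and B would simply score 2 and the pair A and D would score 1. Under the weighted scheme the gap widens (about 1.77 versus 0.56), because A and D are only co-cited at the article level. The resulting strengths could then be passed to any clustering routine in place of raw co-citation counts.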