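In domain knowledge analysis based on the keywords of the literature, keyword sets selected by popularity indicators such as word-frequency thresholds overlook the knowledge points that are distinctive to a field, and therefore fail to reveal the field's research characteristics effectively. This paper places a research field within its background discipline and examines, from a global perspective, how well keywords represent the field's research characteristics. By comparing the probability that a keyword occurs inside and outside the field, we propose a formula for computing domain relevancy, and we combine the domain relevancy and popularity indicators to select keywords. Taking "digital library" as an example, we constructed a background corpus for library and information science and a domain corpus for digital libraries, and extracted equal numbers of keywords with the combined method and the word-frequency method. A qualitative comparison shows that the keyword set obtained by the combined method reveals the field's research characteristics in greater depth. To overcome the subjectivity of bibliometric analysis results, we designed a blind selection experiment whose quantitative results demonstrate the effectiveness of the new method.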
Keywords of scientific papers are conventionally used as the object of study when analyzing the knowledge structure of research fields. Traditional indexes, however, place more weight on hotspots and may therefore fail to characterize the unique features of a research field. In this paper, we place the research field in the context of its background discipline in order to investigate how well keywords represent a specific research field. A new keyword selection method is proposed based on combining the popularity index and the domain relevancy index of keywords. Taking Digital Library research in China as a case, we constructed a background corpus of Library and Information Science as well as a domain corpus. Two keyword collections are selected from the domain corpus, one by the traditional method based on word frequency and the other by our new method. A comparative analysis shows that the new keyword collection reveals the research characteristics of Digital Library research in China better than the traditional one. A blind selection experiment is also designed for quantitative analysis, and its result shows that our method is more effective.
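The idea of combining a popularity index with a domain relevancy index can be illustrated with a minimal sketch. The abstract does not state the actual formula, so everything below is an assumption made for illustration: domain relevancy is taken as the log ratio of a keyword's document-level probability inside the field to its smoothed probability outside the field, the combined score is the product of popularity and relevancy, and the names `keyword_scores`, `top_keywords`, and `smoothing` are hypothetical.

```python
import math
from collections import Counter


def keyword_scores(domain_docs, outside_docs, smoothing=1.0):
    """Score each domain keyword by popularity and by an assumed relevancy measure.

    Each document is a list of keywords. The log-ratio relevancy and the
    product used for the combined score are illustrative assumptions,
    not the formula from the paper.
    """
    # Count the number of documents in which each keyword appears.
    domain_counts = Counter(kw for doc in domain_docs for kw in set(doc))
    outside_counts = Counter(kw for doc in outside_docs for kw in set(doc))
    n_domain = len(domain_docs)
    n_outside = len(outside_docs)

    scores = {}
    for kw, count in domain_counts.items():
        # Popularity: share of domain documents that carry the keyword.
        p_domain = count / n_domain
        # Additive smoothing keeps the outside probability above zero
        # for keywords that never occur outside the field.
        p_outside = (outside_counts[kw] + smoothing) / (n_outside + 2 * smoothing)
        # Assumed relevancy: positive when the keyword is more probable
        # inside the field than outside it, negative otherwise.
        relevancy = math.log(p_domain / p_outside)
        scores[kw] = {
            "popularity": p_domain,
            "relevancy": relevancy,
            "combined": p_domain * relevancy,
        }
    return scores


def top_keywords(scores, k, key="combined"):
    """Return the k best keywords by the chosen score, ties broken alphabetically."""
    ranked = sorted(scores, key=lambda kw: (-scores[kw][key], kw))
    return ranked[:k]
```

Calling `top_keywords(scores, k, key="popularity")` and `top_keywords(scores, k)` on the same corpus yields two keyword sets of equal size, one chosen by frequency alone and one by the combined score, which mirrors the comparison described in the abstract. Under these assumptions, a generic term that is common both inside and outside the field tends to drop in the combined ranking.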