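To refine and optimize co-word analysis, this paper analyzes and discusses three key problems in the term collection stage of the co-word analysis process: term source selection, term standardization, and high-frequency word selection. On term source selection, it reviews research on the classification and selection of term sources and divides the term sources of co-word analysis into two types: terms taken from metadata descriptions and terms obtained by automatic indexing of full text. A comparison of applied co-word studies using different term sources shows that subject words are better suited to co-word analysis than keywords, whereas comparisons of title words and keywords show no obvious difference. The main problems in term source selection and approaches to solving them are also discussed. On term standardization, the paper argues for its necessity, analyzes the current state of research, and divides standardization methods into two types: those based on controlled dictionaries and those based on manual processing. It then discusses four kinds of term standardization problems (overly general use of terms, overly broad term expression, abstraction versus specificity of terms, and scarce or missing terms) and proposes corresponding solutions. On high-frequency word selection, the paper divides domestic and international research on selecting high-frequency words into four categories and analyzes the state of each. This study can offer researchers engaged in co-word analysis theoretical and practical guidance for the term collection stage, with the aim of improving the reliability and effectiveness of co-word analysis.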
To improve and optimize co-word analysis, this paper discusses three problems that arise in the data collection stage of the co-word analysis process: term source selection, vocabulary standardization, and high-frequency word selection. On term source selection, related studies on the classification and selection of term sources are analyzed, and the term sources of co-word analysis are classified into metadata and automatic indexing of full text. A comparison of applied studies using different term sources shows that subject words are more suitable than keywords for co-word analysis, while the comparison between title words and keywords shows no significant difference. The problems of term source selection and their solutions are also discussed. On vocabulary standardization, the necessity of standardization is described, and an analysis of the current state of research divides its methods into dictionary-based and manual approaches. Four types of problems (ambiguity or synonymy, general terms, broad or narrow terms, and scarce or missing terms) and their solutions are investigated. This study can provide co-word analysis researchers with a theoretical and operational reference for the term collection stage and can help enhance the reliability and effectiveness of co-word analysis.