This paper maps the knowledge structure of international text mining research. It identifies the main research topic groups and traces how research hotspots have evolved over time, in order to give a macro-level view of how the field has developed. The sample consists of 2,447 papers on text mining and related topics published between 2000 and 2015 and indexed in the SCI and SSCI databases. A keyword co-occurrence matrix is generated with SATI, and the VOSviewer clustering technique is then used to build a similarity matrix and a two-dimensional map, from which six topic groups are identified and the evolution of each group is analyzed. The results show that the research topics of international text mining are diverse and interdisciplinary, with wide application in information retrieval, biomedicine, and economics and management. Among algorithms and techniques, information extraction, natural language processing, and machine learning account for a large share of the work. In addition, text mining research is becoming increasingly fine-grained, and topics such as opinion mining and sentiment analysis are receiving growing attention.
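The co-word step described above can be illustrated with a minimal sketch. It does not reproduce SATI or VOSviewer; it only shows, in plain Python, how a keyword co-occurrence count can be turned into a similarity score. The normalization used here (co-occurrence divided by the product of the two keywords' occurrence counts, often called association strength) is an assumption for illustration, and the three keyword lists are hypothetical toy records, not data from the 2,447-paper sample.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count keyword occurrences and pairwise co-occurrences across records."""
    occ = Counter()    # number of records in which each keyword appears
    pairs = Counter()  # number of records in which each keyword pair appears
    for keywords in records:
        unique = sorted(set(keywords))  # ignore duplicates within one record
        occ.update(unique)
        pairs.update(combinations(unique, 2))  # pairs are alphabetically ordered
    return occ, pairs


def association_strength(occ, pairs):
    """Normalize co-occurrence counts by the product of occurrence counts.

    Assumed normalization for this sketch: s_ij = c_ij / (c_i * c_j).
    """
    return {(a, b): c / (occ[a] * occ[b]) for (a, b), c in pairs.items()}


# Hypothetical toy records: one keyword list per paper.
records = [
    ["text mining", "information extraction", "machine learning"],
    ["text mining", "sentiment analysis", "opinion mining"],
    ["text mining", "machine learning", "sentiment analysis"],
]

occ, pairs = cooccurrence(records)
sim = association_strength(occ, pairs)

for pair, score in sorted(sim.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

In a sketch like this, a frequent keyword such as "text mining" co-occurs with almost everything, so dividing by the occurrence counts keeps it from dominating the similarity scores that a clustering or mapping step would then consume.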