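Most existing studies of economic relations rely on empirical or purely econometric methods. This paper instead targets the characteristics of unstructured text: using information extraction and text mining to uncover the regional economic relations that interest users is a research topic of considerable practical value. Building on a discussion of an entity-relation-based text mining mechanism, the paper analyzes the regional economic relations among 31 provinces, municipalities, and autonomous regions. Text mining of economic relations takes two forms. The first is attribute-based: information extraction collects the attributes of each entity, and clustering is used to analyze the relations among economic entities. The second is based on mutual references: a classification dictionary of economic entity relations is constructed, an entity-relation annotation algorithm is proposed, information extraction is used to obtain the references among entities, and a directed relation graph is then built, from which the relations among regional economies are mined. The study shows that text mining can be used both to analyze and evaluate the economic development of individual regions and to discover the intrinsic relations among specific regional economies.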
Text mining plays an important role in knowledge acquisition, and applying information extraction and text mining to mine relations among entities from unstructured texts on the Internet is a valuable research problem. This paper presents a text mining approach for relations between named entities, comprising two mining schemes. The first is based on the attributes of entities: it applies information extraction to collect their attributes and then adopts a clustering algorithm to analyze the relations between named entities. The second is based on the references between entities: it constructs a relation dictionary, presents an algorithm for annotating relations, builds a directed graph from the references between entities, and derives several interesting information patterns from that graph. The results show that the approach is effective for mining relationships between named entities in a specific domain.
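
As an illustration only, the reference-based scheme (relation dictionary, relation annotation, directed graph of references) might be sketched as follows. This is not the paper's algorithm: the cue phrases in `RELATION_DICT`, the entity list, the substring-matching rule, and the function names `annotate_relations`, `build_relation_graph`, and `in_degree` are all hypothetical, chosen to make the pipeline concrete under simple assumptions.

```python
# Hypothetical relation dictionary: cue phrase -> relation category.
RELATION_DICT = {
    "invests in": "investment",
    "cooperates with": "cooperation",
    "exports to": "trade",
}

# Hypothetical entity list (a real system would use named-entity recognition).
ENTITIES = ["Guangdong", "Hunan", "Shanghai", "Jiangsu"]


def annotate_relations(sentence, entities, relation_dict):
    """Return (source, relation, target) triples found in one sentence.

    Assumption: a triple is emitted when a cue phrase appears between two
    adjacent entity mentions, and the entity mentioned first is treated as
    the referring (source) entity. Only the first mention of each entity
    in the sentence is considered.
    """
    mentions = sorted((sentence.find(e), e) for e in entities if e in sentence)
    triples = []
    for (pos_a, a), (pos_b, b) in zip(mentions, mentions[1:]):
        between = sentence[pos_a + len(a):pos_b]
        for cue, label in relation_dict.items():
            if cue in between:
                triples.append((a, label, b))
    return triples


def build_relation_graph(sentences, entities, relation_dict):
    """Build a weighted directed graph: graph[source][target] = reference count."""
    graph = {}
    for sentence in sentences:
        for src, _label, dst in annotate_relations(sentence, entities, relation_dict):
            targets = graph.setdefault(src, {})
            targets[dst] = targets.get(dst, 0) + 1
    return graph


def in_degree(graph):
    """Total weight of incoming references per entity (one simple graph pattern)."""
    counts = {}
    for targets in graph.values():
        for dst, weight in targets.items():
            counts[dst] = counts.get(dst, 0) + weight
    return counts


if __name__ == "__main__":
    corpus = [
        "Guangdong invests in Hunan.",
        "Shanghai invests in Hunan.",
        "Hunan exports to Guangdong.",
    ]
    graph = build_relation_graph(corpus, ENTITIES, RELATION_DICT)
    print(graph)
    print(in_degree(graph))
```

In this sketch, an entity with a high weighted in-degree is one that many other entities refer to, which is one example of the kind of pattern a directed reference graph can expose. The attribute-based scheme would instead feed extracted attribute vectors to a clustering algorithm.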