This paper studies how a collaborative classifier, which combines conditional random fields (CRFs) with a support vector machine (SVM), can extract domain concept instances, attributes, and attribute values and predict the correspondence among the three. First, concept instances, attributes, and attribute values are treated as three entity types, so that their extraction becomes a named entity recognition problem, which is modeled with CRFs. On this basis, correspondence relations between entities are defined and predicted: a vector representing a concept instance, an attribute, and an attribute value is labeled 1 if a relation holds among the three and 0 otherwise, and an SVM is trained to predict these relations. Six groups of experiments were carried out on concept instances, attributes, and attribute values of Yunnan tourist attractions. In the open test, the collaborative classifier achieves a precision of 84.4%, a recall of 82.7%, and an F-score of 83.6%, an improvement of 20 percentage points in F-score over the word co-occurrence method.
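The relation-labeling step between the two classifiers can be illustrated with a minimal sketch. It assumes that the entities have already been recognized (for example by a CRF tagger) and shows how candidate (instance, attribute, value) triples might be turned into vectors labeled 1 or 0 before being passed to an SVM. The `Entity` type, the distance-based features, and the toy sentence are hypothetical and are not taken from the paper, which does not specify its feature set here.

```python
from itertools import product
from typing import NamedTuple


class Entity(NamedTuple):
    text: str
    kind: str      # "instance", "attribute", or "value"
    position: int  # token index of the entity's first token


def candidate_triples(entities):
    """Enumerate every (instance, attribute, value) combination in a sentence."""
    by_kind = {"instance": [], "attribute": [], "value": []}
    for e in entities:
        by_kind[e.kind].append(e)
    return list(product(by_kind["instance"], by_kind["attribute"], by_kind["value"]))


def features(triple):
    """Hypothetical feature vector: pairwise token distances and attribute/value order."""
    inst, attr, val = triple
    return [
        abs(inst.position - attr.position),
        abs(attr.position - val.position),
        abs(inst.position - val.position),
        1 if attr.position < val.position else 0,
    ]


def label_triples(entities, gold):
    """Return (vectors, labels); a label is 1 if the triple is a gold relation, else 0."""
    triples = candidate_triples(entities)
    vectors = [features(t) for t in triples]
    labels = [1 if (t[0].text, t[1].text, t[2].text) in gold else 0 for t in triples]
    return vectors, labels


# Invented toy sentence; names and values are placeholders, not data from the paper.
entities = [
    Entity("Lake X", "instance", 0),
    Entity("area", "attribute", 3),
    Entity("12 km2", "value", 5),
    Entity("depth", "attribute", 9),
    Entity("30 m", "value", 11),
]
gold = {("Lake X", "area", "12 km2"), ("Lake X", "depth", "30 m")}

vectors, labels = label_triples(entities, gold)
```

The resulting `vectors` and `labels` have the shape a binary classifier expects as training input, so any SVM implementation could be fitted on them. A real system would likely use richer lexical and contextual features than the token distances assumed here.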