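experiments are conducted. [Result/conclusion] The experiments show that recognition accuracy reaches as high as 95.38%, but recall is low; the size of the training corpus has a large effect on performance, and semantic generalization methods of different degrees have complex effects on accuracy and recall. How to select semantic features, and how to perform semantic annotation and semantic disambiguation, are new problems that need to be solved.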
[Purpose/significance] Recognizing theories mentioned in academic journal papers is a precondition for content analysis, so automating theory recognition can improve the efficiency of content analysis. [Method/process] This paper treats theory recognition as a named entity recognition task, reviews existing named entity recognition methods, and proposes a theory recognition model based on semantic generalization. Using part of speech, HowNet semantics, and other external knowledge as features, a series of experiments with a CRF model is conducted on 1,822 academic journal papers. [Result/conclusion] Recognition accuracy reaches as high as 95.38%, but recall is low; the size of the training corpus has a large influence on performance. Semantic resources can improve performance, but they reduce recall. How to select semantic features, and how to perform semantic annotation and semantic disambiguation, remain to be solved.
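As a rough illustration of what semantic generalization could look like at the feature level, the sketch below builds per-token feature dictionaries of the kind a CRF sequence labeler consumes. It is not the paper's implementation: the lexicon `TOY_SEMANTIC_CLASS`, the function names, the feature names, and the English example sentence are all hypothetical, and the tiny dictionary merely stands in for a resource such as HowNet.

```python
from typing import Dict, List, Tuple

# Toy stand-in for a HowNet-style lexicon: word -> coarse semantic class.
# These entries are illustrative assumptions, not taken from HowNet.
TOY_SEMANTIC_CLASS: Dict[str, str] = {
    "theory": "knowledge",
    "model": "knowledge",
    "proposes": "act",
}

Token = Tuple[str, str]  # (word, part-of-speech tag)


def token_features(sent: List[Token], i: int, generalize: bool = True) -> Dict[str, str]:
    """Build the feature dictionary for the token at position i."""
    word, pos = sent[i]
    feats = {"pos": pos}

    # Semantic generalization: when a semantic class is known, use it
    # in place of the surface word so that rare words share a feature.
    sem = TOY_SEMANTIC_CLASS.get(word.lower()) if generalize else None
    feats["lex"] = f"SEM={sem}" if sem else f"W={word.lower()}"

    # Context features: neighbouring part-of-speech tags, or boundary marks.
    if i > 0:
        feats["prev_pos"] = sent[i - 1][1]
    else:
        feats["BOS"] = "1"
    if i < len(sent) - 1:
        feats["next_pos"] = sent[i + 1][1]
    else:
        feats["EOS"] = "1"
    return feats


def sentence_features(sent: List[Token], generalize: bool = True) -> List[Dict[str, str]]:
    """Build feature dictionaries for every token in a sentence."""
    return [token_features(sent, i, generalize) for i in range(len(sent))]


sent = [("Information", "NN"), ("foraging", "NN"), ("theory", "NN"), ("explains", "VBZ")]
feats = sentence_features(sent)
```

Calling `sentence_features(sent, generalize=False)` keeps the surface words instead. Comparing the two settings is one simple way to probe the trade-off reported above, where semantic resources help performance while lowering recall.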