This paper proposes a rule-based method for recognizing text chunks in examination papers, which effectively addresses the problem of initializing large volumes of question data in an item bank. Recognition rules for text chunks are defined and an automaton-based recognition model is constructed, giving a theoretical description of how examination paper text is recognized. Experiments show that the model performs well. Building on this work, a prototype system was implemented, and a concrete application example verified the feasibility and effectiveness of the method.
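As an illustration of the general idea only, rule-based chunk recognition driven by an automaton can be sketched as follows. This is a minimal sketch and not the model from the paper: the chunk labels, the regular expressions in `RULES`, the transition table `TRANSITIONS`, and the English sample paper are all hypothetical assumptions chosen for readability.

```python
import re

# Hypothetical recognition rules: each pairs a chunk label with a pattern
# for the line that opens such a chunk. These are assumptions, not the
# rules defined in the paper.
RULES = [
    ("section", re.compile(r"^(Part|Section)\s+\w+")),
    ("question", re.compile(r"^\d+[.)]\s+")),
    ("option", re.compile(r"^[A-D][.)]\s+")),
    ("answer", re.compile(r"^Answer\s*:")),
]

# Hypothetical automaton: current state -> chunk labels allowed to follow it.
TRANSITIONS = {
    "start": {"section", "question"},
    "section": {"section", "question"},
    "question": {"option", "answer", "question", "section"},
    "option": {"option", "answer", "question", "section"},
    "answer": {"question", "section"},
}


def classify(line):
    """Return the label of the first rule whose pattern matches, else None."""
    for label, pattern in RULES:
        if pattern.match(line):
            return label
    return None


def chunk(lines):
    """Split paper text into (label, text) chunks by running the automaton."""
    state = "start"
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines carry no chunk information
        label = classify(line)
        if label is not None and label in TRANSITIONS[state]:
            # A rule fired and the transition is legal: open a new chunk.
            chunks.append((label, line))
            state = label
        elif chunks:
            # No legal transition: treat the line as a continuation
            # of the chunk that is currently open.
            prev_label, text = chunks[-1]
            chunks[-1] = (prev_label, text + " " + line)
        else:
            raise ValueError(f"cannot start a paper with line: {line!r}")
    return chunks


if __name__ == "__main__":
    sample = [
        "Section A",
        "1. What is 2 + 2?",
        "A. 3",
        "B. 4",
        "Answer: B",
        "2. Define a chunk",
        "in one sentence.",
    ]
    for label, text in chunk(sample):
        print(label, "|", text)
```

Keeping the rules and the transition table as data separates what a chunk looks like from the order in which chunks may appear, so either can be adjusted for a different paper layout without touching the recognition loop.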