Automated essay scoring for college English writing instruction requires a scoring method that is general, that is, not tied to any specific essay prompt. For content evaluation, document clustering groups essays by the similarity of their content, forming a clustering tree that is dense at its core and sparse at its periphery. The few essays on the periphery differ markedly in content from the rest; these possibly off-topic essays can be returned to teachers for manual judgment, so that essay content is evaluated fairly accurately at little labor cost. Experiments show that, with a reasonable content-similarity threshold, the method effectively identifies off-topic essays.
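
The idea of flagging essays that sit on the periphery of the clustering tree can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the text features, the linkage criterion, or the threshold, so the sketch assumes bag-of-words cosine similarity, single-linkage clustering, and an arbitrary threshold of 0.3. The function names `bag_of_words`, `cosine`, and `flag_off_topic` and the sample essays are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts (an assumed stand-in for the real features)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def flag_off_topic(essays, threshold):
    """Return sorted indices of essays that fall outside the largest cluster.

    Cutting a single-linkage clustering tree at `threshold` gives the same
    groups as the connected components of a graph that links every pair of
    essays whose similarity is at least `threshold`, so union-find suffices.
    """
    if not essays:
        return []
    vectors = [bag_of_words(e) for e in essays]
    parent = list(range(len(essays)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(essays)):
        for j in range(i + 1, len(essays)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(essays)):
        clusters.setdefault(find(i), []).append(i)
    core = set(max(clusters.values(), key=len))
    return sorted(i for i in range(len(essays)) if i not in core)


# Hypothetical sample data: three on-topic essays and one unrelated text.
essays = [
    "My college life is busy and my college classes are interesting.",
    "College life is busy because college classes and clubs fill my day.",
    "I enjoy my busy college life and the interesting classes.",
    "The recipe needs two eggs, flour, sugar and a hot oven.",
]
print(flag_off_topic(essays, threshold=0.3))  # [3]
```

Only the flagged indices would be passed to a teacher for manual review. Raising the threshold flags more essays and increases the manual workload; lowering it risks letting off-topic essays merge into the core cluster.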