This paper proposes an automatic extractive summarization algorithm for Chinese documents based on sentence selection. The algorithm combines the document structure, discourse structure, and sentence features of a single document, and filters sentences according to feature priority. A genetic algorithm, an evolutionary method with good adaptability, tunes the feature weights to find an optimal combination of sentence features. A scoring function then scores and ranks the sentences automatically, and the highest-scoring sentences are extracted as the summary and presented in their original order in the document. Experiments on Chinese documents downloaded from the internet, evaluated with the standard ROUGE-N method, show that the algorithm achieves good results on Chinese documents.
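The scoring, selection, and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the feature names (`position`, `title_overlap`), the weight values, and the function names are all hypothetical. A weighted sum is assumed as the scoring function, and the weights are fixed by hand here, whereas the paper tunes them with a genetic algorithm. The ROUGE-N function computes recall only and assumes the text has already been tokenized (for Chinese, character-level tokens are one simple choice).

```python
from collections import Counter
from typing import Dict, List, Sequence


def score_sentences(features: Sequence[Dict[str, float]],
                    weights: Dict[str, float]) -> List[float]:
    """Score each sentence as a weighted sum of its feature values.

    Assumption: a linear scoring function; the paper's exact form is not given.
    """
    return [sum(weights.get(name, 0.0) * value for name, value in f.items())
            for f in features]


def select_summary(sentences: Sequence[str],
                   scores: Sequence[float],
                   k: int) -> List[str]:
    """Pick the k highest-scoring sentences, then restore document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:k])  # indices back in original sequence
    return [sentences[i] for i in chosen]


def rouge_n_recall(candidate: Sequence[str],
                   reference: Sequence[str],
                   n: int = 1) -> float:
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count."""
    def ngrams(tokens: Sequence[str]) -> Counter:
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    sentences = ["s0", "s1", "s2", "s3"]
    # Hypothetical feature values and hand-picked weights, for illustration only.
    features = [
        {"position": 0.5, "title_overlap": 1.0},
        {"position": 0.2, "title_overlap": 0.1},
        {"position": 1.0, "title_overlap": 0.5},
        {"position": 0.9, "title_overlap": 0.0},
    ]
    weights = {"position": 0.6, "title_overlap": 0.4}
    scores = score_sentences(features, weights)
    print(select_summary(sentences, scores, k=2))
    print(rouge_n_recall(list("abcd"), list("abce"), n=1))
```

Selecting by score and then sorting the chosen indices keeps the summary sentences in their original sequence, which matches the abstract's description of the extraction step.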