This paper proposes a new automatic document summarization method based on non-negative matrix factorization (NMF) and relevance feedback (RF). NMF is used to represent the original document as a linear combination of semantic feature vectors. Similarity computation then identifies the semantic feature vectors most relevant to the user's query, and the sentences with the largest projection coefficients on those vectors are extracted as the summary. Throughout this process, relevance feedback is applied repeatedly to expand and refine the user's query. Experiments show that the resulting summaries highlight the document's main topics while reflecting the user's needs and interests, and effectively improve the efficiency of information retrieval.
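The pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the NMF update rule, the term weighting, or the feedback scheme, so the sketch assumes raw term counts, the standard multiplicative-update NMF, cosine similarity, and a simple pseudo-relevance feedback step that adds the top-ranked sentence's terms to the query. All function names and parameters (`summarize`, `r`, `k`, `rounds`, `beta`) are hypothetical.

```python
import math
import random

def tokenize(text):
    # Lowercase whitespace tokens with surrounding punctuation stripped.
    return [w.strip(".,;:!?").lower() for w in text.split() if w.strip(".,;:!?")]

def term_sentence_matrix(sentences):
    # A[i][j] = count of term i in sentence j (assumed weighting: raw counts).
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}
    A = [[0.0] * len(sentences) for _ in vocab]
    for j, s in enumerate(sentences):
        for w in tokenize(s):
            A[index[w]][j] += 1.0
    return A, index

def transpose(X):
    return [list(row) for row in zip(*X)]

def matmul(X, Y):
    Yt = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def nmf(A, r, iters=200, seed=0, eps=1e-9):
    # Factor A (m x n) into W (m x r) and H (r x n), both non-negative,
    # using multiplicative updates (an assumed choice of algorithm).
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[f][j] * num[f][j] / (den[f][j] + eps) for j in range(n)]
             for f in range(r)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][f] * num[i][f] / (den[i][f] + eps) for f in range(r)]
             for i in range(m)]
    return W, H

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def summarize(sentences, query, r=2, k=2, rounds=2, beta=0.5):
    A, index = term_sentence_matrix(sentences)
    W, H = nmf(A, r)
    features = transpose(W)  # one semantic feature vector per row
    q = [0.0] * len(A)
    for w in tokenize(query):
        if w in index:
            q[index[w]] += 1.0
    for round_no in range(rounds + 1):
        # Pick the semantic feature most similar to the current query.
        best = max(range(r), key=lambda f: cosine(q, features[f]))
        # Rank sentences by their coefficient on that feature.
        ranked = sorted(range(len(sentences)),
                        key=lambda j: H[best][j], reverse=True)
        if round_no < rounds:
            # Assumed feedback step: expand the query with the top sentence.
            top = ranked[0]
            q = [qi + beta * A[i][top] for i, qi in enumerate(q)]
    # Return the k selected sentences in their original document order.
    return [sentences[j] for j in sorted(ranked[:k])]

if __name__ == "__main__":
    doc = [
        "Stock prices rose as the market rallied.",
        "Investors bought shares after the market report.",
        "The team won the final match last night.",
        "Fans celebrated the match victory in the stadium.",
    ]
    for sentence in summarize(doc, "stock market", r=2, k=2):
        print(sentence)
```

In this sketch the columns of `W` play the role of the semantic feature vectors and the rows of `H` hold each sentence's coefficient on them, so selecting sentences by `H[best][j]` corresponds to choosing those with the largest projection on the query-relevant feature. A real system would likely use interactive user feedback and a weighted term representation rather than the simplifications assumed here.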