近年来,科技论文发表数量与日俱增,科研人员需要阅读文献的数量也随之迅速增长.如何快速而有效地阅读一篇科技论文,逐渐成为一个重要的研究课题.另一方面,在阅读科技论文时,理解与其相关的重要参考文献可帮助读者更好地理解文章的内容.然而,如何从众多的参考文献中快速找到最重要、最相关的几篇,如何避免在阅读过程中迷失在文档的多维空间,仍是值得研究的问题.为了解决上述问题,提出了一个基于文本摘要和引用关系的可视辅助文献阅读系统.该系统利用一种基于阅读目的的文本摘要技术提取出论文中重要的句子,并采用多尺度的可视化方式进行展示;使用LDA(latent dirichlet allocation)话题模型抽取参考文献的核心话题;记录用户的阅读行为,用于提示其阅读上下文,以保证用户关注点不发生迷失.同时,在一个具体的案例场景中详细介绍了系统的使用方法,并进行了用户研究以验证系统的可用性.
With growing volume of publications in recent years, researchers have to read much more literatures. Therefore, how to read a scientific article in an efficient way becomes an importance issue. When reading an article, it's necessary to read its references in order to get a better understanding. However, how to differentiate between the relevant and non-relevant references, and how to stay in topic in a large document collection are still challenging tasks. This paper presents GUDOR(GUided DOcument Reader), a visualization guided reader based on citation and summarization. It(1) extracts the important sentences from a scientific article with an objective-based summarization technique, and visualizes the extraction results by a multi-resolution method;(2) identifies the main topics of the references with a LDA(Latent Dirichlet Allocation) model;(3) tracks user's reading behavior to keep him or her focusing on the reading objective. In addition, the paper describes the functions and operations of the system in a usage scenario and validates its applicability by a user study.