Topic tracking is a subtask of Topic Detection and Tracking (TDT) and an event-based technique for organizing information. Given a set of training stories on a topic, the task is to find all subsequent stories that discuss that topic. Traditional topic tracking methods based on content computing can also be applied to Web topic tracking, but they do not exploit the features of Web pages. Targeting the characteristics of Web pages, this paper proposes a Web topic tracking method that combines hyperlink analysis with content computing. Experiments show that the method is effective and improves the quality of topic tracking.