A focused search engine is a query tool built to retrieve information on a particular discipline or topic. After weighing the strengths and weaknesses of current focused search engines, this paper proposes the IPageRank-IND algorithm, which combines hyperlink-guidance technology driven by textual content heuristics with the PageRank algorithm based on the Web link graph, in order to improve the accuracy of link relevance judgment and the coverage of topic resource search. Web pages are then judged for content relevance and automatically classified using the vector space model (VSM), which improves retrieval efficiency. Finally, a search engine is built for experiments, and a comparison of its results with those of several other algorithms shows a clear advantage for IPageRank-IND.
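The abstract does not give the IPageRank-IND formula, so the following is only a minimal sketch of the general idea it describes: scoring a candidate link by mixing a content signal (VSM cosine similarity between the anchor text and the topic) with a structure signal (plain PageRank of the target page). The linear mix, the `alpha` weight, the raw term-frequency vectors, and all function names are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """VSM relevance: cosine between raw term-frequency vectors (assumed weighting)."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def pagerank(graph, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    graph maps each page to its list of outlinks; every link target
    must itself be a key of graph.
    """
    pages = list(graph)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in graph.items():
            # A page without outlinks spreads its rank over all pages.
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def link_priority(anchor_text, topic, target_rank, max_rank, alpha=0.5):
    """Hypothetical combined score: alpha * content + (1 - alpha) * structure."""
    content = cosine_similarity(anchor_text, topic)
    structure = target_rank / max_rank if max_rank else 0.0
    return alpha * content + (1.0 - alpha) * structure


if __name__ == "__main__":
    # Toy link graph and topic, invented for this example.
    graph = {"a": ["hub"], "b": ["hub"], "hub": ["a"]}
    ranks = pagerank(graph)
    top = max(ranks.values())
    topic = "focused search engine crawler"
    for page, anchor in [("hub", "focused crawler for search"), ("b", "holiday photos")]:
        print(page, round(link_priority(anchor, topic, ranks[page], top), 3))
```

A crawler built along these lines would keep its frontier ordered by `link_priority`, so that links that look on-topic and point to well-linked pages are fetched first; how the paper actually weights the two signals would have to be taken from its full text.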