This paper systematically studies how two kinds of relationship evidence, namely the distance and the order relationships between query terms and candidates within a document, affect the precision of expert finding algorithms. First, an order kernel function is proposed within the probabilistic language modeling framework to model sequential relationship evidence. Two probabilistic frameworks are then proposed to model the different kinds of relationship evidence in a unified way, and comparative experiments on standard TREC datasets explore the feasibility of combining the two kinds of evidence for expert finding. Experimental results show that distance evidence and sequential evidence achieve comparable precision gains over the baseline, and that an appropriate combination of the two performs better than either one used alone.