To address inaccurate user identifiers in Web logs caused by users deleting their cookies, a user identifier correction algorithm based on the Support Vector Machine (SVM) is proposed. First, a classifier is trained to judge whether two sessions belong to the same user. Next, the similarity between users with different identifiers is calculated. Finally, the logs are divided into groups to find all users who have deleted their cookies, and their identifiers are corrected. Experimental results verify the effectiveness of the proposed algorithm.
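The three steps summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: a hand-written rule (`same_user`) stands in for the trained SVM, the similarity between two identifiers is assumed to be the fraction of cross-identifier session pairs that the classifier accepts, and the grouping is assumed to be a union-find merge above a hypothetical `threshold`. The session fields `ip` and `agent` are also illustrative assumptions.

```python
from itertools import combinations


def same_user(s1, s2):
    """Stand-in for the trained SVM (assumption): a fixed rule on two session features."""
    return s1["ip"] == s2["ip"] and s1["agent"] == s2["agent"]


def identifier_similarity(sessions_a, sessions_b, classifier=same_user):
    """Assumed similarity: share of cross-identifier session pairs judged 'same user'."""
    pairs = [(a, b) for a in sessions_a for b in sessions_b]
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if classifier(a, b)) / len(pairs)


def correct_identifiers(sessions_by_id, threshold=0.5, classifier=same_user):
    """Group identifiers with union-find and map each one to its group's representative."""
    parent = {uid: uid for uid in sessions_by_id}

    def find(u):
        # Follow parent links to the root, halving the path along the way.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in combinations(sorted(sessions_by_id), 2):
        sim = identifier_similarity(sessions_by_id[a], sessions_by_id[b], classifier)
        if sim >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the lexicographically smaller identifier as the representative.
                parent[max(ra, rb)] = min(ra, rb)
    return {uid: find(uid) for uid in sessions_by_id}


# Hypothetical log: identifiers c1 and c2 come from one user who deleted a cookie.
logs = {
    "c1": [{"ip": "10.0.0.1", "agent": "Firefox"}],
    "c2": [{"ip": "10.0.0.1", "agent": "Firefox"}],
    "c3": [{"ip": "10.0.0.9", "agent": "Chrome"}],
}
corrected = correct_identifiers(logs)
```

In this toy log, `c2` is mapped onto `c1` while `c3` keeps its own identifier. In practice, a real pairwise classifier would replace `same_user` by passing it as the `classifier` argument.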