This paper analyzes the traditional data preprocessing steps and methods for Web usage mining. Addressing the two basic problems that preprocessing must solve, namely the accurate identification and tracking of users and the combination of ontology-based Web information with the underlying server log data, it proposes data preprocessing steps and methods suited to mining user interests in Web communities, making up for the deficiencies of traditional Web usage mining preprocessing in this respect.