Web log mining analyzes user behavior by mining the log records kept on website servers, so that site structure can be optimized and user satisfaction improved. As the internet continues to grow, the volume of web server log data is increasing rapidly, and the efficiency of data analysis urgently needs to improve. Building on an analysis of traditional approaches to web log data, this paper proposes, from the perspective of dimensionality reduction, a clustering method based on constructing semantic logs, and validates the effectiveness of the method by computing the Davies-Bouldin value.
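To make the validation criterion concrete, the following is a minimal sketch of the standard Davies-Bouldin index: for each cluster, take the worst-case ratio of summed within-cluster scatter to the distance between centroids, then average over clusters. Lower values indicate more compact, better separated clusters. The function name `davies_bouldin` and the toy two-dimensional points are illustrative assumptions; they are not the semantic log features or the data used in the paper.

```python
import math
from collections import defaultdict


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a clustering (lower is better).

    points: sequence of equal-length numeric tuples (hypothetical features).
    labels: cluster label for each point, in the same order.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    centroids = {}
    scatter = {}
    for label, members in clusters.items():
        dim = len(members[0])
        centroid = tuple(
            sum(m[d] for m in members) / len(members) for d in range(dim)
        )
        centroids[label] = centroid
        # Within-cluster scatter: mean Euclidean distance to the centroid.
        scatter[label] = sum(math.dist(m, centroid) for m in members) / len(members)

    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster.
        # Assumes distinct centroids; coincident ones would divide by zero.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Toy data (an assumption for illustration only): two well separated groups.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = davies_bouldin(toy_points, [0, 0, 1, 1])  # groups match the geometry
bad = davies_bouldin(toy_points, [0, 1, 0, 1])   # groups cut across it
print(good, bad)
```

On this toy data the labeling that follows the geometry yields a much smaller index than the one that cuts across it, which is the sense in which the index can be used to compare clustering results.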