With the arrival of the big data and "Internet+" era, user-generated content in interactive online food communities is growing explosively. It is becoming increasingly difficult for users to find the content they are looking for in this mass of dietary data, and the data itself has been neither widely used nor deeply mined. Meanwhile, traditional research on eating behavior relies mostly on questionnaires and similar instruments, which consume considerable manpower, material, and financial resources. To address these problems, this paper proposes a user eating behavior model based on Latent Dirichlet Allocation (LDA). The approach applies the ideas of the LDA model to analyze user-generated content from an interactive online food community, determines the number of topics according to perplexity, and builds the user eating behavior model, from which the similarity of eating behavior between users can be computed. This provides a modeling basis for recommending friends and food to community users, and it offers a new approach to eating behavior research. User-generated content collected with a web crawler from an interactive online food community serves as the dataset, and experiments verify the feasibility and effectiveness of the method.
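The pipeline outlined in the abstract (fit LDA on each user's content, choose the number of topics by perplexity, then compare users by their topic distributions) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The toy documents, the hyperparameters `alpha` and `beta`, the collapsed Gibbs sampler, and the Jensen-Shannon-based similarity are all assumptions, since the abstract names none of them. Perplexity is computed here on the training data for brevity; in practice a held-out set is usually preferred, because training perplexity tends to keep falling as topics are added.

```python
import math
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is one token list per user."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    wid = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics
    ndk = [[0] * K for _ in docs]        # topic counts per document
    nkw = [[0] * V for _ in range(K)]    # word counts per topic
    nk = [0] * K                         # total tokens per topic
    z = []                               # topic assignment of every token
    for d, doc in enumerate(docs):
        z.append([rng.randrange(K) for _ in doc])
        for w, k in zip(doc, z[d]):
            ndk[d][k] += 1
            nkw[k][wid[w]] += 1
            nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = wid[w], z[d][i]
                # Remove the token, resample its topic, then add it back.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][v] + beta)
                           / (nk[t] + V * beta) for t in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    # Smoothed estimates: theta = user-topic, phi = topic-word distributions.
    theta = [[(ndk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
             for d, doc in enumerate(docs)]
    phi = [[(nkw[k][v] + beta) / (nk[k] + V * beta) for v in range(V)]
           for k in range(K)]
    return theta, phi, wid


def perplexity(docs, theta, phi, wid):
    """exp(-average per-token log-likelihood); lower is better."""
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for w in doc:
            p = sum(theta[d][k] * phi[k][wid[w]] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


def js_similarity(p, q):
    """1 minus the base-2 Jensen-Shannon divergence, so the value lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 1.0 - 0.5 * (kl(p, m) + kl(q, m))


# Hypothetical toy corpus: one bag of words per user.
docs = [
    "hotpot spicy beef hotpot chili spicy".split(),
    "spicy chili hotpot beef noodles".split(),
    "cake cream dessert sugar cake".split(),
    "dessert sugar cream cookie cake".split(),
]

# Choose the topic count with the lowest perplexity among a few candidates.
scores = {}
for k in (2, 3, 4):
    theta_k, phi_k, wid_k = fit_lda(docs, k)
    scores[k] = perplexity(docs, theta_k, phi_k, wid_k)
best_k = min(scores, key=scores.get)

# Refit with the chosen topic count and compare every pair of users.
theta, phi, wid = fit_lda(docs, best_k)
sim = [[js_similarity(a, b) for b in theta] for a in theta]
```

Each row of `sim` could then be sorted to rank candidate friends for a user, which is one plausible reading of the recommendation basis mentioned in the abstract.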