文章以新浪微博中用户标签作为研究对象,从微博中收集用户基本信息与用户标签信息,依据用户标签分类体系对用户标签进行人工分类;然后分析标签类型、标签类型分布熵、用户平均标签个数、用户平均标签长度等标签标注行为指标在不同学科领域中的差异,以及从高频和不同标签个数分组两个角度分析上述行为指标在不同学科领域的差异。研究表明,标签类型、平均标签个数在不同学科领域中有显著性差异;不同学科领域高频标签中,标签类型存在较大差异;在不同标签个数分组下,用户标签类型在不同学科领域下无明显差异,用户的平均标签长度随着个数的增多呈递减趋势。
This paper studies user tags on Sina Weibo. User profiles and tag information are collected from Weibo, and the tags are manually classified according to a user tag classification scheme. The paper then analyzes how tagging-behavior indicators (tag type, the distribution entropy of tag types, the average number of tags per user, and the average tag length per user) differ across academic disciplines, and further examines these differences from two perspectives: high-frequency tags and groups of users with different numbers of tags. The study finds that tag type and the average number of tags differ significantly across disciplines, and that the types of high-frequency tags vary considerably between disciplines. When users are grouped by number of tags, tag types show no obvious differences across disciplines, and the average tag length decreases as the number of tags increases.
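The indicators named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so this sketch assumes that the distribution entropy is the Shannon entropy (base 2) of the tag-type frequencies and that tag length is measured in characters. The function names, the type labels, and the sample data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def type_entropy(tag_types):
    """Shannon entropy (in bits) of a non-empty list of tag-type labels.

    Assumption: the paper's "distribution entropy of tag types" is
    taken here to be the standard Shannon entropy.
    """
    counts = Counter(tag_types)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def average_tag_count(user_tags):
    """Mean number of tags per user; user_tags maps user id -> list of tags."""
    return sum(len(tags) for tags in user_tags.values()) / len(user_tags)


def average_tag_length(user_tags):
    """Mean tag length in characters over all tags of all users (assumed unit)."""
    all_tags = [tag for tags in user_tags.values() for tag in tags]
    return sum(len(tag) for tag in all_tags) / len(all_tags)


# Hypothetical sample: two users in one discipline, with manually assigned types.
sample_tags = {"u1": ["music", "travel"], "u2": ["data mining"]}
sample_types = ["interest", "interest", "profession", "profession"]

print(type_entropy(sample_types))
print(average_tag_count(sample_tags))
print(average_tag_length(sample_tags))
```

Computing these values separately for each discipline, or for each group of users with the same number of tags, would give the per-group figures that the comparisons in the abstract rely on.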