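Correctly identifying sock-puppet relations among online accounts helps maintain a safe and harmonious online environment and curbs illegal behavior and false information on the Internet. Authorship identification based on text mining has long received wide attention, but few studies address the identification of author relations in social network text. This paper proposes a sock-puppet identification method for social network accounts. Based on the style of online language and the relations between accounts, the method extracts two groups of features, online text features and the frequency of reply relations between accounts, to form the feature set, and builds the vector space of training samples from combinations of accounts in order to identify sock-puppet relations. The proposed method was validated experimentally on forum data and reached an accuracy of 80%; the results show that the method identifies sock puppets with high accuracy.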
Real-name registration is difficult to enforce in social networks, and this is a worldwide issue. Some users operate multiple IDs (usually called “sock puppets”) to publish disruptive views for illegitimate purposes, such as starting or spreading rumors. It is therefore important to find a way to identify these users. In this paper, we propose extracting features from text data and social relation data, and training a novel vector space model based on combinations of different IDs to detect the sock-puppet relation. In an experiment on forum data, we achieved a classification precision of 93%. The result verifies the effectiveness of the proposed method.
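The idea of building one sample per combination of IDs can be illustrated with a minimal sketch. It is not the authors' implementation: the character-bigram profile, the cosine similarity, and the names `char_ngram_profile`, `cosine`, and `pair_vectors` are all assumptions chosen for illustration, since the abstract only states that text features and reply-relation frequencies are combined per account pair.

```python
import math
from collections import Counter
from itertools import combinations


def char_ngram_profile(texts, n=2):
    """Count character n-grams over all posts of one account (assumed style feature)."""
    counts = Counter()
    for text in texts:
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pair_vectors(posts, replies):
    """Build one feature vector per unordered account pair.

    posts:   {account: [post text, ...]}
    replies: {(replier, target): number of replies}
    Each vector is [style similarity, reply frequency in both directions].
    """
    profiles = {acc: char_ngram_profile(texts) for acc, texts in posts.items()}
    vectors = {}
    for a, b in combinations(sorted(posts), 2):
        style_sim = cosine(profiles[a], profiles[b])
        reply_freq = replies.get((a, b), 0) + replies.get((b, a), 0)
        vectors[(a, b)] = [style_sim, reply_freq]
    return vectors
```

Vectors of this kind, labeled as sock-puppet or not for known pairs, could then be passed to any standard classifier; the abstract does not say which classifier was used, so none is assumed here.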