k-anonymity is an important approach to protecting data privacy. When it is applied to protect users' privacy in social networks, existing methods anonymize only the network structure, so an attacker who also knows the text information in the network can easily identify users and recover other private information. To address this problem, this paper proposes a k-anonymity approach that covers both structure and text. Building on traditional node-degree anonymization, the approach partitions the text attached to edges into different domains, constructs a global hierarchy tree for each domain, applies a set-enumeration tree across all domains to reduce the information loss incurred when text labels are generalized, and introduces three pruning strategies based on the properties of the set-enumeration tree. Experiments show that the proposed approach anonymizes both the structure and the text of a social network at low execution cost.
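
The two building blocks named in the abstract, node-degree anonymity and generalization of text labels along a per-domain hierarchy tree, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the example hierarchy, the function names, and the information-loss measure (the fraction of a domain's leaves covered by the generalized node) are hypothetical. The set-enumeration tree search and the three pruning strategies are not shown.

```python
from collections import Counter


def is_k_degree_anonymous(degrees, k):
    """Return True if every degree value is shared by at least k nodes."""
    return all(count >= k for count in Counter(degrees).values())


# Hypothetical hierarchy tree for one text domain, stored as child -> parent.
# The root ("any") has no parent entry.
PARENT = {
    "football": "sports", "tennis": "sports",
    "jazz": "music", "rock": "music",
    "sports": "any", "music": "any",
}
LEAVES = ["football", "tennis", "jazz", "rock"]


def path_to_root(label):
    """List the nodes from label up to the root, label first."""
    path = [label]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def generalize(labels):
    """Return the lowest node in the hierarchy that covers all labels."""
    common = set(path_to_root(labels[0]))
    for label in labels[1:]:
        common &= set(path_to_root(label))
    # Walking upward from the first label, the first shared node is the
    # lowest common ancestor.
    for node in path_to_root(labels[0]):
        if node in common:
            return node
    raise ValueError("labels do not share a root")


def information_loss(node):
    """Assumed loss measure: 0 for a leaf, 1 for the root of the domain."""
    covered = sum(1 for leaf in LEAVES if node in path_to_root(leaf))
    return (covered - 1) / (len(LEAVES) - 1)
```

Under these assumptions, `generalize(["football", "tennis"])` yields `"sports"`, which covers two of the four leaves, while generalizing `"football"` together with `"jazz"` has to climb to the root and loses all information in that domain. A search over combinations of domains, such as the set-enumeration tree the abstract mentions, would try to keep this loss small while still satisfying k-anonymity.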