In privacy-preserving data publishing, existing methods typically remove the identity attributes first and then anonymize the quasi-identifier attributes. This paper analyzes the case in which a single individual corresponds to multiple records and proposes an identity-reserved anonymity method, which retains the identity attributes and improves information utility while preserving individual privacy. Two implementations are developed, one based on generalization and one based on lossy join. Experimental results show that the method improves information utility and is practical.