In the k-anonymity privacy protection model, the value of k simultaneously affects both the degree of privacy protection and the data quality of the k-anonymous table, so choosing k to balance privacy protection against data quality is an important problem. Building on an analysis and proof of how k relates to privacy protection and to data quality, the range of k that satisfies the privacy protection requirement is analyzed using the privacy disclosure probability formulas for k-anonymous tables under different conditions, and the range of k that satisfies the data quality requirement is analyzed using the data quality formula for k-anonymous tables. Finally, based on the relationship between the values of k that satisfy the privacy protection requirement and those that satisfy the data quality requirement, an algorithm for the optimized selection of k is presented.
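The idea of intersecting the two admissible ranges of k can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose formulas are not reproduced in this abstract. It assumes the standard worst-case bound that a record in a k-anonymous table is re-identified with probability at most 1/k, and it takes a caller-supplied, hypothetical quality-loss function `loss(k)` that is assumed not to decrease as k grows. All function and parameter names (`min_k_for_privacy`, `max_k_for_quality`, `choose_k`, `p_max`, `loss_max`, `k_upper`) are illustrative.

```python
from typing import Callable, Optional


def min_k_for_privacy(p_max: float) -> int:
    """Smallest k with 1/k <= p_max (assumed bound, not the paper's formula)."""
    if not 0 < p_max <= 1:
        raise ValueError("p_max must be in (0, 1]")
    k = 1
    while 1 / k > p_max:
        k += 1
    return k


def max_k_for_quality(loss: Callable[[int], float],
                      loss_max: float, k_upper: int) -> Optional[int]:
    """Largest k in 1..k_upper with loss(k) <= loss_max, or None if no k qualifies.

    Assumes loss is non-decreasing in k, so the scan can stop at the first failure.
    """
    best = None
    for k in range(1, k_upper + 1):
        if loss(k) > loss_max:
            break
        best = k
    return best


def choose_k(p_max: float, loss: Callable[[int], float],
             loss_max: float, k_upper: int) -> Optional[int]:
    """Pick a k from the intersection of the privacy and quality ranges.

    Returns the smallest k that meets the privacy requirement, since under the
    monotonicity assumption it loses the least data quality. Returns None when
    the two ranges do not overlap.
    """
    k_lo = min_k_for_privacy(p_max)
    k_hi = max_k_for_quality(loss, loss_max, k_upper)
    if k_hi is None or k_lo > k_hi:
        return None
    return k_lo


def toy_loss(k: int) -> float:
    """Hypothetical loss curve for demonstration only; grows from 0 toward 1."""
    return (k - 1) / k


if __name__ == "__main__":
    print(choose_k(p_max=0.2, loss=toy_loss, loss_max=0.85, k_upper=100))
    print(choose_k(p_max=0.1, loss=toy_loss, loss_max=0.85, k_upper=100))
```

Returning the lower end of the overlap is a design choice of this sketch. When the ranges do not overlap, one of the two requirements has to be relaxed before any k can be chosen.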