Most existing l-diversity anonymization algorithms treat all sensitive attribute values identically, ignoring both their degree of sensitivity and their actual distribution, which leaves them vulnerable to similarity attacks and skewness attacks. In addition, they build equivalence classes by full-domain generalization, which causes high information loss. This paper proposes a clustering-based personalized (l,c)-anonymity algorithm. It improves the security of published data by defining a maximal ratio threshold and a sensitivity for each sensitive attribute value, and it reduces information loss by using clustering to form the equivalence classes. Theoretical analysis and experimental results indicate that the method is effective and feasible.
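The abstract does not state the formal (l,c) condition, so the following is only an illustrative sketch under an assumed reading: an equivalence class passes if it contains at least l distinct sensitive values and no single value accounts for more than a fraction c of its records. The function name `satisfies_lc`, its parameters, and the sample data are hypothetical, and the per-value sensitivity weighting described in the paper is not modeled here.

```python
from collections import Counter


def satisfies_lc(sensitive_values, l, c):
    """Assumed (l,c) check for one equivalence class (not the paper's definition).

    sensitive_values: list of sensitive attribute values in the class
    l: minimum number of distinct sensitive values (diversity)
    c: maximal ratio threshold for the most frequent value, in (0, 1]
    """
    if not sensitive_values:
        return False
    counts = Counter(sensitive_values)
    # Share of the class taken by its most frequent sensitive value.
    max_ratio = max(counts.values()) / len(sensitive_values)
    return len(counts) >= l and max_ratio <= c


# Hypothetical classes: one dominated by a single value, one evenly spread.
skewed = ["flu", "flu", "flu", "HIV"]
balanced = ["flu", "HIV", "cancer", "asthma"]

print(satisfies_lc(skewed, l=2, c=0.5))
print(satisfies_lc(balanced, l=3, c=0.5))
```

Under this reading, the ratio threshold is what addresses skewness: the `skewed` class has two distinct values and would pass a plain 2-diversity test, but its dominant value covers three quarters of the records and so fails any threshold below 0.75.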