In recent years, the rapid growth of data across many fields has driven the development of data mining. At the same time, storing and mining user data creates a risk of privacy leakage, so user privacy must be protected during the mining process, and privacy-preserving data mining algorithms have become an increasingly important research area. This article introduces three main classes of privacy-preserving data mining algorithms: perturbation algorithms, k-anonymity algorithms, and association rule hiding algorithms. Perturbation algorithms include randomized perturbation and multiplicative perturbation. The two main techniques for k-anonymity are generalization and suppression. Commonly used association rule hiding algorithms include heuristic, boundary-based, and exact algorithms. The article reviews recent research progress on these algorithms and summarizes research trends in privacy-preserving data mining.
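As a minimal illustration of the two k-anonymity techniques named above, the following Python sketch generalizes two quasi-identifiers and then suppresses any record whose group remains smaller than k. The record layout (age, postal code, diagnosis), the function names, and the sample values are assumptions made for illustration only; they are not taken from any algorithm surveyed in the article.

```python
from collections import Counter

def generalize_age(age, width=10):
    """Generalization: replace an exact age with a coarser range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def generalize_zip(zip_code, keep=3):
    """Generalization: keep only the leading digits of a postal code."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def k_anonymize(records, k):
    """Generalize the quasi-identifiers, then suppress groups smaller than k."""
    generalized = [
        (generalize_age(age), generalize_zip(zip_code), diagnosis)
        for age, zip_code, diagnosis in records
    ]
    # Count how many records share each (age range, postal prefix) pair.
    group_sizes = Counter((age, zip_code) for age, zip_code, _ in generalized)
    # Suppression: drop records whose quasi-identifier group is still too small.
    return [r for r in generalized if group_sizes[(r[0], r[1])] >= k]

# Hypothetical sample data: (age, postal code, sensitive attribute).
records = [
    (34, "47677", "flu"),
    (36, "47602", "cold"),
    (38, "47678", "flu"),
    (52, "47905", "asthma"),
]
released = k_anonymize(records, k=2)
```

With k = 2, the three records in their thirties fall into one group after generalization and are released, while the single record in the 50-59 group is suppressed because it would otherwise remain uniquely identifiable.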