Privacy-preserving data mining is an important research problem in data mining. Its goal is to perform accurate mining without access to the original plaintext data, so that the rules and knowledge discovered are the same as, or similar to, those obtained by mining the plaintext. To strengthen privacy protection and improve mining accuracy, this paper addresses privacy preservation for clustering in a distributed environment. Combining fully homomorphic encryption and decryption algorithms, it proposes and implements FHE-DBIRCH, a distributed privacy-preserving model based on fully homomorphic encryption. In the model, data sets are encrypted and decrypted with a fully homomorphic encryption algorithm for transmission, which protects the privacy of the original data. Theoretical analysis and experimental results show that the FHE-DBIRCH model provides strong data privacy while maintaining clustering accuracy.