Privacy preservation is one of the most important and challenging problems in data mining; its ultimate goal is to mine accurate knowledge while protecting private data. Statistical regression is a common data mining tool, yet little work has examined how statistical analysis can be performed when a data set is distributed among several data owners. For confidentiality or other proprietary reasons, data owners are often reluctant to share their raw data with other parties, even though they wish to carry out statistical analysis cooperatively. This paper addresses the tradeoff between obtaining accurate statistical results and protecting the privacy of the original data. We propose a homomorphic public-key encryption protocol based on ring homomorphism and the hardness of the discrete logarithm problem, and use it to construct a privacy-preserving regression model that obtains accurate statistical results through the homomorphic property of the protocol. Theoretical analysis and experimental results show that the protocol and model are semantically secure and effective.
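To make the idea of regression over homomorphically encrypted data concrete, the following is a minimal sketch and not the protocol proposed in the paper, whose details the abstract does not give. It assumes a different, well-known discrete-logarithm-based scheme (exponential ElGamal, which is additively homomorphic), toy and insecure group parameters, small non-negative integer data, and a single key holder who decrypts only the aggregated sums. All function names (`keygen`, `encrypt`, `add`, `decrypt`, `local_stats`, `private_regression`) are hypothetical.

```python
import secrets
from fractions import Fraction

# Toy parameters (assumption, NOT secure): P = 2*Q + 1 is a safe prime and
# G = 4 generates the subgroup of prime order Q. Plaintexts must be < Q.
P, Q, G = 2039, 1019, 4

def keygen():
    """Return (secret key, public key) for exponential ElGamal."""
    sk = secrets.randbelow(Q - 1) + 1          # sk in [1, Q-1]
    return sk, pow(G, sk, P)

def encrypt(pk, m):
    """Encrypt a small non-negative integer m as (g^r, g^m * pk^r)."""
    assert 0 <= m < Q, "plaintext out of range for the toy group"
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(G, m, P) * pow(pk, r, P)) % P

def add(c, d):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c[0] * d[0]) % P, (c[1] * d[1]) % P

def decrypt(sk, c):
    """Recover g^m, then find m by brute-force discrete log (small m only)."""
    gm = (c[1] * pow(c[0], Q - sk, P)) % P     # c[0]^(Q-sk) equals c[0]^(-sk)
    for m in range(Q):
        if pow(G, m, P) == gm:
            return m
    raise ValueError("plaintext not found")

def local_stats(points):
    """Sufficient statistics for simple linear regression on one party's data."""
    return [len(points),
            sum(x for x, _ in points),
            sum(y for _, y in points),
            sum(x * y for x, y in points),
            sum(x * x for x, _ in points)]

def private_regression(parties):
    """Fit y = slope*x + intercept from encrypted per-party statistics."""
    sk, pk = keygen()                           # key holder (assumed role)
    # Each party encrypts its own statistics; raw points are never shared.
    encrypted = [[encrypt(pk, s) for s in local_stats(p)] for p in parties]
    # Ciphertexts are combined component-wise without decrypting them.
    totals = encrypted[0]
    for enc in encrypted[1:]:
        totals = [add(a, b) for a, b in zip(totals, enc)]
    # Only the aggregated sums are decrypted.
    n, sx, sy, sxy, sxx = (decrypt(sk, c) for c in totals)
    slope = Fraction(n * sxy - sx * sy, n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

if __name__ == "__main__":
    party_a = [(1, 2), (2, 3)]
    party_b = [(3, 5), (4, 4)]
    print(private_regression([party_a, party_b]))
```

In this sketch the fitted coefficients equal those obtained by pooling the raw data, because least-squares estimates depend only on the summed statistics. A realistic deployment would need a cryptographically sized group, an encoding for real-valued or negative data, and a decision about who holds the decryption key, none of which is specified here.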