Cloud computing offers a broader platform on which multiple data owners can publish integrated data and mine it collaboratively. Under the data-as-a-service (DaaS) model, integrated data is deployed on the platform of a service provider that is not fully trusted, so data privacy protection has become a major challenge limiting the adoption and spread of the model. To prevent privacy leakage during data integration, a two-level privacy protection mechanism for DaaS applications is proposed. The mechanism is independent of any specific application: it partitions data attributes into different data chunks and uses fake (obfuscating) data to ensure that data is evenly distributed within each chunk, thereby protecting privacy in data integration. Analysis demonstrates that the mechanism is sound, and experiments show that it incurs low computational overhead.
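The two steps named in the abstract, attribute partitioning and balancing with fake data, can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names `split_attributes` and `pad_with_fakes`, the sample records, the attribute grouping, and the balancing rule (pad every value combination up to the count of the most frequent one) are all assumptions made for illustration.

```python
import random
from collections import Counter


def split_attributes(records, groups):
    """Vertically partition records: one chunk per attribute group (hypothetical helper)."""
    chunks = [[] for _ in groups]
    for rec in records:
        for chunk, attrs in zip(chunks, groups):
            chunk.append(tuple(rec[a] for a in attrs))
    return chunks


def pad_with_fakes(chunk, rng):
    """Add fake rows until every distinct value combination occurs equally often.

    Assumed balancing rule, not taken from the paper. Each returned row is
    (values, is_fake); in a real deployment the is_fake flag would have to
    stay with the data owner rather than be stored at the provider.
    """
    counts = Counter(chunk)
    target = max(counts.values())
    padded = [(values, False) for values in chunk]
    for values, n in counts.items():
        padded.extend((values, True) for _ in range(target - n))
    rng.shuffle(padded)  # hide which rows were appended last
    return padded


# Illustrative data only; none of these values come from the paper.
records = [
    {"name": "Ann", "zip": "10001", "disease": "flu"},
    {"name": "Bob", "zip": "10001", "disease": "flu"},
    {"name": "Cid", "zip": "10002", "disease": "asthma"},
]
groups = [("name", "zip"), ("disease",)]

chunks = split_attributes(records, groups)
balanced = pad_with_fakes(chunks[1], random.Random(0))
print(Counter(values for values, _ in balanced))
```

In this sketch the sensitive attribute sits in a different chunk from the identifying attributes, and after padding each disease value appears the same number of times, so value frequencies within that chunk reveal nothing about the real distribution. How chunks are re-linked by authorized parties is outside the scope of the sketch.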