为解决现有基于网格结构的差分隐私二维空间数据划分发布方法可能引起局部划分过细导致查询精度低的问题,提出了基于kd-树的差分隐私二维空间数据划分发布方法—kd-PPDP算法(differentially privacy partitio-ning publication algorithm based on kd-tree)。算法采用了kd-树算法思想,通过启发式地识别网格化后数据分布情况并合并相邻近似网格单元来防止局部划分过细问题,从而减少所添加的噪声,提高查询精度。通过实验对比分析了kd-PPDP算法与现有基于网格结构的划分发布方法的查询误差以及时间效率,结果表明了该算法的有效性和可行性。
The existing differentially privacy two-dimensional dataset partitioning publication approaches might result in over-partitioning of certain local regions thus lower the query accuracy. To solve this problem, a new differentially pri- vacy partitioning publication algorithm based on kd-tree for two-dimensional dataset (kd-PPDP) was proposed. To re- duce the noise generated by the Laplace mechanism and improve the accuracy of query, the new approach which was in- spired by the core thought of kd-tree took the gridding data distribution into account and merged adjacent grid units with similar information heuristically to prevent local over-partitioning. The new approach was compared with the existing differentially privacy partitioning publication approachs based on grid structure in terms of query error and time complex- ity. Experimental results showed that the approach was feasible and effective.