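Grouping-based differentially private histogram publication has received wide attention from researchers. The balance between the approximation error caused by group means and the Laplace error caused by noise directly constrains the accuracy of the published histogram. Because existing grouping-based histogram publication methods struggle to balance approximation error and Laplace error effectively, an accurate histogram publication method that satisfies differential privacy, DiffHR (differentially private histogram release), is proposed. Building on the analysis that sorting the sequence of histogram bucket counts helps improve publication accuracy, an effective sorting method is proposed that uses the Metropolis-Hastings technique from Markov chain Monte Carlo (MCMC) together with the exponential mechanism; it gradually approaches the correct order by repeatedly swapping two randomly selected buckets. On the histogram sorted by sampling, an adaptive greedy clustering method based on a lazy grouping lower bound is proposed; its time complexity is O(n), and it effectively balances approximation error and Laplace error. Experimental results for DiffHR, GS, and AHP on real data show that DiffHR achieves higher publication accuracy than comparable algorithms.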
Grouping-based differentially private histogram release has attracted considerable research attention in recent years. The trade-off between the approximation error caused by replacing bucket counts with the group mean and the Laplace error caused by Laplace noise directly constrains the accuracy of the released histogram. Most existing grouping-based methods cannot effectively balance these two errors. This paper proposes an accurate differentially private histogram release method called DiffHR (differentially private histogram release). Because sorting the sequence of bucket counts helps improve release accuracy, DiffHR combines the Metropolis-Hastings method from MCMC (Markov chain Monte Carlo) with the exponential mechanism to obtain an effective sorting method, which repeatedly samples and swaps two randomly selected buckets to gradually approximate the correct order. To balance Laplace error and approximation error, DiffHR then partitions the sorted histogram with a utility-driven adaptive greedy clustering method whose time complexity is O(n). DiffHR is compared with existing methods such as GS and AHP on real datasets. The experimental results show that DiffHR achieves higher release accuracy than its competitors.
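The two ingredients named above can be illustrated with a minimal Python sketch. It is not the DiffHR algorithm itself, whose details the abstract does not give. The utility function (negative total variation of the reordered counts), its sensitivity of 2 (assuming neighboring histograms differ by 1 in a single bucket), and the names `utility`, `mh_sort`, and `group_errors` are all assumptions made for illustration. The error formula further assumes that Laplace noise of scale 1/ε is added to a group's sum, which is then divided by the group size.

```python
import math
import random


def utility(counts, order):
    """Hypothetical utility: negative total variation of the reordered counts.

    A monotone (sorted) order maximizes it. This choice is an assumption,
    not the utility used by DiffHR.
    """
    return -sum(abs(counts[order[i]] - counts[order[i + 1]])
                for i in range(len(order) - 1))


def mh_sort(counts, epsilon, steps, sensitivity=2.0, seed=0):
    """Metropolis-Hastings walk over permutations of bucket indices.

    The target distribution is proportional to
    exp(epsilon * utility / (2 * sensitivity)), the form used by the
    exponential mechanism. The proposal swaps two random buckets.
    """
    rng = random.Random(seed)
    order = list(range(len(counts)))
    u = utility(counts, order)
    for _ in range(steps):
        # Symmetric proposal: pick two distinct positions and swap them.
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        u_new = utility(counts, order)
        log_accept = epsilon * (u_new - u) / (2.0 * sensitivity)
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            u = u_new  # accept the swap
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo the swap
    return order


def group_errors(group, epsilon):
    """Return (approximation error, expected Laplace error) for one group.

    Approximation error: squared deviation of each count from the group mean.
    Laplace error: total noise variance over the group, 2 / (epsilon^2 * k),
    assuming Lap(1/epsilon) noise on the group sum (sensitivity 1).
    """
    k = len(group)
    mean = sum(group) / k
    approx = sum((x - mean) ** 2 for x in group)
    laplace = 2.0 / (epsilon ** 2 * k)
    return approx, laplace


if __name__ == "__main__":
    counts = [9, 1, 7, 2, 8, 3]
    order = mh_sort(counts, epsilon=1.0, steps=2000)
    print([counts[i] for i in order])
    print(group_errors([2, 4], epsilon=1.0))
```

Under these assumptions, larger groups shrink the Laplace term (it scales as 1/k) but can inflate the approximation term when the grouped counts differ widely, which is why sorting before grouping can help: similar counts become adjacent.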