Protecting patient privacy when publishing medical data is one of the key problems to be solved in practical applications. A medical institution wants the data it publishes to protect personal privacy while distorting the data as little as possible, so that the released data remain highly usable. This paper proposes a new privacy-preserving algorithm for multiple sensitive attributes in medical data publishing, the AHPK-Anonymous algorithm. Building on existing K-anonymity algorithms, it takes into account the utility of the different quasi-identifier attributes for the sensitive attributes: it uses the analytic hierarchy process (AHP) to compute a utility weight for each quasi-identifier attribute with respect to the sensitive attributes, and then generalizes the quasi-identifier attributes according to these weights. Theoretical analysis and experimental results on real datasets show that the AHPK-Anonymous algorithm protects individual privacy well and effectively preserves the usability of the data after publication.
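The two ingredients named in the abstract, AHP-derived weights and weight-guided generalization under a K-anonymity constraint, can be illustrated with a minimal Python sketch. It is not the paper's implementation. The pairwise comparison matrix, the row geometric-mean approximation of the AHP priority vector, the helpers `generalize_age` and `generalize_zip`, and the rule of generalizing the lowest-weight quasi-identifier first are all assumptions made for illustration; the abstract does not specify these details.

```python
from collections import Counter
from math import prod


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method.

    matrix[i][j] is an (assumed) judgment of how much more useful
    quasi-identifier i is than quasi-identifier j for the sensitive attribute.
    """
    n = len(matrix)
    geo_means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def is_k_anonymous(records, quasi_ids, k):
    """True if every quasi-identifier value combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(value):
    """Hypothetical one-step generalization: exact age -> 10-year band."""
    low = (value // 10) * 10
    return f"{low}-{low + 9}"


def generalize_zip(value):
    """Hypothetical one-step generalization: keep the first three digits."""
    return value[:3] + "**"


def anonymize(records, generalizers, weights, k):
    """Generalize quasi-identifiers in ascending weight order (assumed rule).

    Stops as soon as the table is k-anonymous. With only one generalization
    step per attribute, the result is not guaranteed to be k-anonymous.
    """
    quasi_ids = list(generalizers)
    out = [dict(r) for r in records]  # work on copies, leave the input intact
    for q in sorted(quasi_ids, key=lambda name: weights[name]):
        if is_k_anonymous(out, quasi_ids, k):
            break
        for r in out:
            r[q] = generalizers[q](r[q])
    return out


# Illustrative data only; "disease" plays the role of a sensitive attribute.
records = [
    {"age": 34, "zip": "10001", "disease": "flu"},
    {"age": 36, "zip": "10002", "disease": "asthma"},
    {"age": 52, "zip": "10003", "disease": "diabetes"},
    {"age": 57, "zip": "10004", "disease": "flu"},
]
# Assumed judgment: age is 3 times as useful as zip for predicting disease.
w_age, w_zip = ahp_weights([[1, 3], [1 / 3, 1]])
weights = {"age": w_age, "zip": w_zip}
generalizers = {"age": generalize_age, "zip": generalize_zip}

published = anonymize(records, generalizers, weights, k=2)
print(weights)
print(published)
```

Generalizing the low-weight attribute first is one plausible reading of "generalize according to the weights": attributes that contribute most to the sensitive attributes are distorted last, which is how the weights could help preserve usability. A full AHP treatment would also check the consistency ratio of the comparison matrix, which this sketch omits.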