Encrypting data is one way to protect user privacy, and the need is especially pressing for data processed in open systems, but it raises the problem of how to search over ciphertext. To address several performance shortcomings of the existing SSE-1 ciphertext search scheme, this paper adopts a different encryption strategy and designs Crypt-Lucene, an encrypted inverted index built on the Lucene inverted index. Taking the characteristics of cloud computing into account, it also proposes a scheme for building Crypt-Lucene indexes in parallel with MapReduce. The performance of the scheme is analyzed theoretically, and experiments demonstrate its effectiveness. The experimental results show that, compared with SSE-1, Crypt-Lucene reduces index construction time by about 60% while also achieving good space performance. For large document collections, building 8 Crypt-Lucene indexes in parallel with MapReduce on a four-node Hadoop cluster reduces construction time by 83.4%.
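To make the idea of searching an encrypted inverted index concrete, the following is a minimal sketch in Python. It is not the Crypt-Lucene construction, whose encryption strategy the abstract does not specify. Everything in it is an assumption made for illustration: the function names (`token`, `build_index`, `search`), the use of HMAC-SHA256 to derive search tokens, and the HMAC-based XOR keystream that stands in for a real cipher. The keystream is a toy and should not be used to protect real data.

```python
import hashlib
import hmac
from collections import defaultdict


def token(key: bytes, term: str) -> str:
    """Hypothetical search token: HMAC-SHA256 of the term, so the index never stores the term itself."""
    return hmac.new(key, term.encode("utf-8"), hashlib.sha256).hexdigest()


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (an assumption, not a vetted cipher): HMAC over nonce plus a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor_mask(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice with the same key and nonce restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))


def build_index(docs: dict, k_tok: bytes, k_enc: bytes) -> dict:
    """Build a plaintext inverted index, then replace terms by tokens and mask each posting list."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    index = {}
    for term, ids in postings.items():
        t = token(k_tok, term)
        # Document IDs are assumed to contain no commas.
        plain = ",".join(sorted(ids)).encode("utf-8")
        index[t] = xor_mask(k_enc, t.encode("ascii"), plain)
    return index


def search(index: dict, k_tok: bytes, k_enc: bytes, term: str) -> list:
    """Look up the token for a term and unmask its posting list; unknown terms return an empty list."""
    t = token(k_tok, term.lower())
    blob = index.get(t)
    if blob is None:
        return []
    return xor_mask(k_enc, t.encode("ascii"), blob).decode("utf-8").split(",")
```

In this sketch the party holding the index sees only tokens and masked posting lists, while the key holder computes the token for a query term and unmasks the result. Because each term's posting list is built independently, separate index shards could in principle be built by separate workers, which is the kind of parallelism a MapReduce-based construction can exploit.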