分析了现有多维布鲁姆过滤器查询算法的工作原理和特点,针对大数据处理特点提出了一种基于双射函数的高精度多维计数布鲁姆过滤器(AMD-CBF)查询算法.AMD-CBF中元素表示和查找分两步进行,第1步将元素各属性哈希映射到各自对应的高精度计数布鲁姆过滤器(A-CBF)中;第2步将元素的所有属性通过双射函数转换为一个值来表示元素整体信息,然后将这个值哈希映射到联合计数布鲁姆过滤器中(C-CBF),完成元素整体的表示和查询确认.理论分析和仿真实验结果表明,AMD-CBF能够支持多维集合元素的高效表示和查询及删除,相比同类研究查询假阳性降低明显,查询精度大幅度提高.
Based on the analysis of multi-dimension Bloom filter presented before, an accurate multi-dimension counting Bloom filter(AMD-CBF) query algorithm based on bijective function is proposed for big data processing. When representing or querying an element,AMD-CBF needs two steps. The first step is to hash and map each attribute of the element to their corresponding accurate counting Bloom filter (A-CBF) ; The second step is to transform all attributes of the element into a value by bijective function to represent the overall information of the element,then the value is hashed and mapped into a combined counting Bloom filter (C-CBF) for completing the representation and query confirmation of the elements overall. Both theoretical analysis and experiment show that the AMD-CBF can support concise representation, approximate membership query and deletion of multi-dimension data set and significantly lower false positive rate and improve query accuracy compared to similar research.