This paper introduces a new data structure, ESBF (Extensible and Scalable Bloom Filter), and proposes an ESBF-based algorithm for approximately mining frequent items in data streams. While maintaining high precision, the algorithm achieves better time efficiency than comparable algorithms and, in most cases, better space efficiency. It is also proved that only ln(-M/lnρ)·e/ε·1/ε·M counters are needed to satisfy the user-specified error ε and confidence ρ.
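To give a feel for how the stated counter bound scales, the following is a minimal sketch that evaluates ln(-M/lnρ)·e/ε·1/ε·M numerically. It rests on assumptions not confirmed by the abstract: the expression is read strictly left to right, i.e. as ln(-M/ln ρ) · e · M / ε²; e is taken to be Euler's number; and 0 < ρ < 1 so that the logarithm's argument is positive. The function name `counter_bound` and its parameter names are hypothetical, and the meaning of M is not given in the abstract.

```python
import math


def counter_bound(m: float, rho: float, eps: float) -> float:
    """Evaluate ln(-M/ln(rho)) * e/eps * 1/eps * M, read left to right.

    Assumptions (not confirmed by the source): e is Euler's number,
    and the product is parsed as ln(-M/ln(rho)) * e * M / eps**2.
    """
    if m <= 0:
        raise ValueError("M must be positive")
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    if eps <= 0:
        raise ValueError("eps must be positive")
    # ln(rho) is negative for 0 < rho < 1, so -M/ln(rho) is positive.
    log_term = math.log(-m / math.log(rho))
    return log_term * math.e * m / (eps * eps)


if __name__ == "__main__":
    # Illustrative inputs only; these values do not come from the paper.
    for eps in (0.1, 0.05, 0.01):
        print(eps, counter_bound(1000, 0.9, eps))
```

Under this reading the bound grows as 1/ε², so halving the permitted error quadruples the number of counters.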