作者: Fan Deng , Davood Rafiei
关键词:
摘要: Traditional duplicate elimination techniques are not applicable to many data stream applications. In general, precisely eliminating duplicates in an unbounded is feasible streaming scenarios. Therefore, we target at approximately environments given a limited space. Based on well-known bitmap sketch, introduce structure, Stable Bloom Filter, and novel simple algorithm. The basic idea as follows: since there no way store the whole history of stream, SBF continuously evicts stale information so that has room for those more recent elements. After finding some properties analytically, show tight upper bound false positive rates guaranteed. our empirical study, compare alternative methods. results method superior terms both accuracy time effciency when fixed small space acceptable rate given.