Abstract
With the rapid accumulation of data in various types, modern database systems are facing the problem of managing multidimensional data. The main challenge is to design a highly efficient storage mechanism which can support fast item lookup with exact membership queries or partial information membership queries. This paper presents a novel data structure called Cartesian-join of Bloom Filters. The method maintains a matrix that stores the Cartesian product of attribute bloom filters, each of which represents one dimension of the dataset. Experiments show that the proposed approach can not only achieve the same false positive rate as the traditional bloom filter with the same size, but also have an advantageous feature of by-attribute membership query. The data structure uses only ten bits to store a four-dimensional item and the average false rate for a query is one percent. The algorithm is robust even if it goes through high-correlated queries. Copyright © 2014 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Original language | English |
---|---|
Title of host publication | Proceedings of 2014 IEEE International Congress on Big Data, BigData Congress 2014 |
Editors | Peter CHEN, Hemant JAIN |
Place of Publication | Danvers, MA |
Publisher | IEEE |
Pages | 288-295 |
ISBN (Electronic) | 9781479950577 |
DOIs | |
Publication status | Published - Sept 2014 |