Authors: Gheorghi Guzun, Guadalupe Canahuate, David Chiu, Jason Sawin
DOI: 10.1109/ICDE.2014.6816675
Keywords:
Abstract: Bitmap indices are widely used for large read-only repositories in data warehouses and scientific databases. Their binary representation allows for the use of bitwise operations and specialized run-length compression techniques. Due to a trade-off between compression and query efficiency, bitmap compression schemes are aligned using a fixed encoding length size (typically the word length) to avoid explicit decompression during query time. In general, smaller encoding lengths provide better compression, but require more decoding during query execution. However, when the difference in size is considerable, it is possible for smaller encodings to also provide better execution time. We posit that a tailored encoding length for each bit vector will provide better performance than a one-size-fits-all approach. We present a framework that optimizes compression and query efficiency by allowing bitmaps to be compressed using variable encoding lengths while still maintaining alignment to avoid explicit decompression. Efficient algorithms are introduced to process queries over bitmaps compressed using different encoding lengths. An input parameter controls the aggressiveness of the compression, providing the user with the ability to tune the tradeoff between space and query time. Our empirical study shows that this approach achieves significant improvements in terms of both query time and compression ratio for synthetic and real data sets. Compared to 32-bit WAH, VAL-WAH produces bitmaps up to 1.8× smaller and achieves query times that are 30% faster.
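The claim that smaller encoding lengths tend to compress better can be illustrated with a minimal sketch. This is not the paper's VAL-WAH implementation: it is a simplified WAH-style encoder, and the function names (`wah_encode`, `wah_decode`, `encoded_size_bits`) and the tuple-based word representation are assumptions made for illustration only. It assumes each word carries one flag bit plus `seg_len` payload bits, and that a fill word spends one payload bit on the fill value and the rest on a run counter. It does not show how queries operate across bitmaps encoded with different lengths.

```python
def wah_encode(bits, seg_len):
    """Encode a list of 0/1 ints into simplified WAH-style words.

    seg_len is the number of payload bits per word (word length minus
    the flag bit), e.g. 31 for 32-bit words. Words are returned as
    ("fill", bit, run_count) or ("lit", tuple_of_bits) for readability
    instead of being packed into integers.
    """
    # Pad with zeros so the input splits evenly into seg_len-bit groups.
    padded = list(bits) + [0] * (-len(bits) % seg_len)
    # A fill word uses one payload bit for the fill value, so the run
    # counter has seg_len - 1 bits.
    max_run = 2 ** (seg_len - 1) - 1
    words = []
    for i in range(0, len(padded), seg_len):
        group = padded[i:i + seg_len]
        if all(b == group[0] for b in group):
            bit = group[0]
            last = words[-1] if words else None
            # Extend the previous fill word when it has the same fill bit
            # and its counter has not overflowed.
            if last and last[0] == "fill" and last[1] == bit and last[2] < max_run:
                words[-1] = ("fill", bit, last[2] + 1)
            else:
                words.append(("fill", bit, 1))
        else:
            # Mixed group: store it verbatim as a literal word.
            words.append(("lit", tuple(group)))
    return words


def wah_decode(words, seg_len):
    """Expand words back into a (zero-padded) list of bits."""
    out = []
    for word in words:
        if word[0] == "fill":
            out.extend([word[1]] * (word[2] * seg_len))
        else:
            out.extend(word[1])
    return out


def encoded_size_bits(words, seg_len):
    """Total size in bits: each word is seg_len payload bits plus a flag."""
    return len(words) * (seg_len + 1)


# A sparse 62-bit vector with a single set bit at position 40.
bits = [0] * 40 + [1] + [0] * 21
for seg_len in (31, 7):
    words = wah_encode(bits, seg_len)
    print(seg_len + 1, "bit words:", len(words), "words,",
          encoded_size_bits(words, seg_len), "bits")
```

On this toy input the 32-bit encoding needs 2 words (64 bits), while the 8-bit encoding needs 3 words (24 bits): the shorter length isolates the single set bit in a smaller literal and turns more of the surrounding zeros into fills. The cost is that shorter words mean more words to decode per query on long runs, which is the trade-off the abstract describes.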