Optimizing Queries on Compressed Bitmaps

Authors: Sihem Amer-Yahia, Theodore Johnson

Sihem Amer-Yahia, AT&T Labs-Research, sihem@research.att.com
Theodore Johnson, johnsont@research.att.com

Abstract: Bitmap indices are used by DBMSs to accelerate decision support queries. A significant advantage of bitmap indices is that complex logical selection operations can be performed very quickly, by performing bit-wise AND, OR, and NOT operators. Although bitmap indices can be space inefficient for high cardinality attributes, the space use of compressed bitmaps compares well to other indexing methods. Oracle and Sybase IQ are two commercial products that make extensive use of bitmap indices. Our recent research showed that there are several fast algorithms for evaluating Boolean operators on compressed bitmaps. Depending on the nature of the operand bitmaps (their format, density, and clusteredness) and the operation (AND, NOT, ...), these algorithms can have execution times that are orders of magnitude different. Choosing an algorithm for performing a Boolean operation has global effects in the query expression, requiring global optimization. We present a linear time dynamic programming search strategy based on a cost model to optimize query expression evaluation plans. We also present rewriting heuristics that rewrite the query expression to an equivalent one to encourage better algorithm assignments. Our performance results show that the optimizer requires a negligible amount of time to execute, and that optimized queries execute up to three times faster than unoptimized queries on real data.

1 Introduction

A bitmap index is a bit string in which each bit is mapped to a record ID (RID) of a relation. A bit in the bitmap index is set (to 1) if the corresponding RID has property P (e.g., the RID represents a customer that lives in New York), and is reset (to 0) otherwise. In typical usage, the predicate P is true for a record if it has the value a for attribute A. One such predicate is associated with each unique value of the attribute A. The predicates can be more complex, for example bitslice indices [OQ97] or precomputed complex predicates [HEP99].

One advantage of bitmap indices is that complex selection predicates can be computed very quickly, by performing bit-wise AND, OR, and NOT operations on the bitmap indices. Furthermore, the indexable selection predicates can involve many attributes. Let's consider some examples, using a database with schema

Customer(Name, Lives_in, Works_in, Car, Number_of_children, Has_cable, Has_cellular)

Suppose that we want to select all customers who live in New England. Then the selection condition is

Lives_in = "ME" OR Lives_in = "VT" OR Lives_in = "NH" OR Lives_in = "MA" OR Lives_in = "CT" OR Lives_in = "RI" OR Lives_in = "NY".

Since a bitmap index is created for attribute Lives_in, the condition translates into [...] mapping [...] attribute [...] all its possible values.
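As a rough illustration of the bit-wise evaluation described in the introduction, the following minimal sketch builds one bitmap per attribute value and answers a selection condition with OR, AND, and NOT. It is not the paper's implementation: it assumes uncompressed bitmaps stored as Python integers, and the helper names (`build_bitmap_index`, `rids`) and the six-record sample data are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Return {value: bitmap}; bit i of a bitmap is 1 iff record i has that value."""
    index = defaultdict(int)
    for rid, value in enumerate(values):
        index[value] |= 1 << rid
    return dict(index)


def rids(bitmap):
    """List the record IDs (bit positions) that are set in a bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical columns for six customers (RIDs 0..5).
lives_in = ["NY", "ME", "CA", "VT", "NY", "TX"]
has_cable = [True, False, True, True, False, True]

lives_idx = build_bitmap_index(lives_in)
cable_idx = build_bitmap_index(has_cable)

# Lives_in = "ME" OR Lives_in = "VT" OR Lives_in = "NY"
region = lives_idx.get("ME", 0) | lives_idx.get("VT", 0) | lives_idx.get("NY", 0)

# (... region ...) AND Has_cable
region_with_cable = region & cable_idx.get(True, 0)

# NOT region: Python integers are unbounded, so mask to the number of records.
all_records = (1 << len(lives_in)) - 1
outside_region = all_records & ~region

print(rids(region))             # [0, 1, 3, 4]
print(rids(region_with_cable))  # [0, 3]
print(rids(outside_region))     # [2, 5]
```

Each Boolean operator costs one pass over the operand bitmaps regardless of how many attribute values the condition names, which is why such predicates are cheap to evaluate on uncompressed bitmaps. With compressed bitmaps the cost depends on the format and on the algorithm chosen for each operator, which is the optimization problem the paper addresses.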

References (21)
Gennady Antoshenkov. Byte aligned data compression. (1993).
Theodore Johnson. Performance Measurements of Compressed Bitmap Indices. Very Large Data Bases, pp. 278-289 (1999).
G. Antoshenkov. Byte-aligned bitmap compression. Data Compression Conference, pp. 476-476 (1995). DOI: 10.1109/DCC.1995.515586
A. Shoshani, L.M. Bernardo, H. Nordberg, D. Rotem, A. Sim. Multidimensional indexing and query coordination for tertiary storage management. Statistical and Scientific Database Management, pp. 214-225 (1999). DOI: 10.1109/SSDM.1999.787637
M. Schaller. Reclustering of high energy physics data. Statistical and Scientific Database Management, pp. 194-203 (1999). DOI: 10.1109/SSDM.1999.787635
Y. E. Ioannidis, Younkyung Kang. Randomized algorithms for optimizing large join queries. International Conference on Management of Data, vol. 19, pp. 312-321 (1990). DOI: 10.1145/93597.98740
Chee-Yong Chan, Yannis E. Ioannidis. Bitmap index design and evaluation. Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data - SIGMOD '98, vol. 27, pp. 355-366 (1998). DOI: 10.1145/276304.276336
Toshihide Ibaraki, Tiko Kameda. On the optimal nesting order for computing N-relational joins. ACM Transactions on Database Systems, vol. 9, pp. 482-502 (1984). DOI: 10.1145/1270.1498
Yannis E. Ioannidis, Eugene Wong. Query optimization by simulated annealing. International Conference on Management of Data, vol. 16, pp. 9-22 (1987). DOI: 10.1145/38713.38722
Patrick O'Neil, Dallan Quass. Improved query performance with variant indexes. International Conference on Management of Data, vol. 26, pp. 38-49 (1997). DOI: 10.1145/253260.253268