作者: Daniel Lemire , Owen Kaser
DOI: 10.1016/J.INS.2011.02.002
关键词:
摘要: Column-oriented indexes-such as projection or bitmap indexes-are compressed by run-length encoding to reduce storage and increase speed. Sorting the tables improves compression. On realistic data sets, permuting columns in right order before sorting can number of runs a factor two more. Unfortunately, determining best column is NP-hard. For many cases, we prove that table minimized if sort increasing cardinality. Experimentally, based on Hilbert space-filling curves poor at minimizing runs.