Authors: Jane Jovanovski, Maja Siljanoska, Goran Velinov
DOI: 10.1007/978-3-642-37213-1_50
Keywords: Data compression, Column (database), Similarity (geometry), Genetic algorithm, Sorting, Heuristic (computer science), Table (database), Mathematics, Algorithm, Data warehouse
Abstract: Column-oriented database systems, usually referred to as column stores, organize data in a column-wise manner. Column-wise data can be compressed efficiently, improving the performance of large read-mostly repositories such as data warehouses. Many compression algorithms exploit similarity among column values, where repeats of the same value form columnar runs. In this paper we present a genetic algorithm for determining an optimal sorting order that minimizes the number of runs in a column store table and therefore maximizes RLE-based compression. Experiments show that the algorithm performs consistently well on synthetic instances and on realistic datasets, achieving higher run-reduction efficiency than an existing heuristic for the given problem.
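The objective described in the abstract can be illustrated with a minimal sketch. It assumes that a "sorting order" is a permutation of column indices used as a lexicographic sort key for the rows, and that the cost is the total number of runs summed over all columns; the abstract does not give the paper's exact definitions. The names `count_runs`, `best_order_exhaustive`, and `example_rows` are hypothetical, and the exhaustive search is only a stand-in for the genetic algorithm, whose details are not stated here.

```python
from itertools import permutations

def count_runs(rows, order):
    """Sort rows lexicographically by the columns in `order`,
    then return the total number of runs across all columns."""
    sorted_rows = sorted(rows, key=lambda r: tuple(r[c] for c in order))
    if not sorted_rows:
        return 0
    n_cols = len(sorted_rows[0])
    runs = n_cols  # every non-empty column starts with one run
    for prev, cur in zip(sorted_rows, sorted_rows[1:]):
        # a new run begins in each column whose value changes
        runs += sum(1 for c in range(n_cols) if prev[c] != cur[c])
    return runs

def best_order_exhaustive(rows):
    """Try every column permutation (feasible only for a few columns)."""
    n_cols = len(rows[0])
    return min(permutations(range(n_cols)), key=lambda o: count_runs(rows, o))

# Hypothetical two-column table: column 0 has 3 distinct values, column 1 has 2.
example_rows = [("a", 1), ("b", 1), ("a", 2), ("b", 2), ("c", 1), ("c", 2)]

print(count_runs(example_rows, (0, 1)))     # 9 runs
print(count_runs(example_rows, (1, 0)))     # 8 runs
print(best_order_exhaustive(example_rows))  # (1, 0)
```

With c columns there are c! candidate orders, so exhaustive search quickly becomes impractical as the table widens; this is the kind of search space for which a heuristic or genetic search is a natural fit.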