Authors: Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy
Keywords:
Abstract: In large data warehousing environments, it is often advantageous to provide fast, approximate answers to complex aggregate queries based on statistical summaries of the full data. In this paper, we demonstrate the difficulty of providing good approximate answers for join queries using only statistics (in particular, samples) from the base relations. We propose join synopses as an effective solution to this problem and show how precomputing just one join synopsis for each relation suffices to significantly improve the quality of approximate answers for arbitrary queries with foreign key joins. We present optimal strategies for allocating the available space among the various join synopses when the query workload is known, and identify heuristics for the common case when the workload is not known. We also present efficient algorithms for incrementally maintaining join synopses in the presence of updates to the base relations. Our extensive set of experiments on the TPC-D benchmark database shows the effectiveness of join synopses and the other techniques proposed in this paper.
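
The idea of a join synopsis can be illustrated with a minimal Python sketch. It assumes a hypothetical two-table schema (`orders` referencing `customers` through the foreign key `cust_id`) and plain uniform sampling; the function names, the data, and the scaling estimator are illustrative assumptions, not the paper's algorithms. Each sampled row of the referencing table is joined with the row it references when the synopsis is built, so a later aggregate query over the join can be answered from the synopsis alone.

```python
import random


def join_synopsis(fact_rows, dim_by_key, fk, sample_size, seed=0):
    """Uniformly sample fact rows and join each with the row it references."""
    rng = random.Random(seed)
    sample = rng.sample(fact_rows, sample_size)
    # Every foreign key has exactly one match, so the join keeps all sampled rows.
    return [{**row, **dim_by_key[row[fk]]} for row in sample]


def estimate_sum(synopsis, total_rows, value, predicate):
    """Estimate SUM(value) over the full join by scaling the sample sum."""
    scale = total_rows / len(synopsis)
    return scale * sum(r[value] for r in synopsis if predicate(r))


# Hypothetical data: 1000 orders referencing 50 customers.
customers = {
    c: {"cust_id": c, "region": "EU" if c % 2 == 0 else "US"} for c in range(50)
}
orders = [{"order_id": i, "cust_id": i % 50, "amount": 10.0} for i in range(1000)]


def is_eu(row):
    return row["region"] == "EU"


synopsis = join_synopsis(orders, customers, "cust_id", sample_size=100)
approx = estimate_sum(synopsis, len(orders), "amount", is_eu)
exact = sum(o["amount"] for o in orders if is_eu(customers[o["cust_id"]]))
print(f"approximate: {approx:.1f}, exact: {exact:.1f}")
```

By contrast, joining independent samples of `orders` and `customers` would drop every sampled order whose customer was not also sampled, which is the difficulty with base-relation samples that the abstract refers to.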