Authors: Yongjoo Park, Srikanth Kandula, Surajit Chaudhuri, Barzan Mozafari
DOI:
Keywords:
Abstract: 4. DISCUSSION SUMMARY
Industrial participants mostly brought up the chicken-and-egg problem: without approximation support being available in a data platform, it is anybody's guess what barriers may stall user adoption; on the other hand, without a clear case for value, it is somewhat unreasonable to expect data platforms to invest the substantial engineering resources needed to make approximation support available. The participants also held an open discussion on users' willingness to accept an approximate answer. Quick adoption is likely in (1) scenarios where errors are universally accepted as vanishingly small (e.g., using the HyperLogLog sketch to estimate COUNT DISTINCT, where prior published work expresses confidence that errors will be small in many scenarios) and (2) scenarios where users can reason about, and be comfortable with, the error model offered by the approximation method.
Academic participants brought up particular use cases, such as data cleaning and distributed model training, where approximation can play a key role (e.g., the keynotes from Joe and Chris). There was also substantial reflection on the key technical advances that may warrant re-examining the space of approximate analytics (e.g., the questions raised by Florin Rusu).