Authors: Wen Xiong, Zhibin Yu, Zhendong Bei, Juanjuan Zhao, Fan Zhang
DOI: 10.1109/BIGDATA.2013.6691707
Keywords:
Abstract: Recently, big data has evolved into a buzzword from academia to industry all over the world. Benchmarks are important tools for evaluating an IT system. However, benchmarking big data systems is much more challenging than ever before. First, big data systems are still in their infant stage and consequently they are not well understood. Second, big data systems are more complicated compared to previous systems such as a single node computing platform. While some researchers have started to design benchmarks for big data systems, they do not consider the redundancy between their benchmarks. Moreover, they use artificial input data sets rather than real world data. It is therefore unclear whether these benchmarks can be used to precisely evaluate the performance of big data systems. In this paper, we first analyze the redundancy among benchmarks from ICTBench, HiBench, and typical workloads from real world applications: spatio-temporal data analysis for the Shenzhen transportation system. Subsequently, we present an initial idea of a big data benchmark suite for spatio-temporal data. There are three findings in this work: (1) Redundancy exists in these pioneering benchmark suites and some of them can be removed safely. (2) The workload behavior of trajectory data analysis applications is dramatically affected by their input data sets. (3) The benchmarks created for academic research cannot represent the cases of real world applications.