DiskAccel: Accelerating Disk-Based Experiments by Representative Sampling

Authors: Mojtaba Tarihi, Hossein Asadi, Hamid Sarbazi-Azad

DOI: 10.1145/2745844.2745856

Keywords:

Abstract: Disk traces are typically used to analyze real-life workloads and for replay-based evaluations. This approach benefits from capturing important details such as varying behavior patterns, bursty activity, and diurnal patterns of system activity, which are often missing from workload synthesis tools. However, accurate capture of such details requires recording traces containing long durations of system activity, which are difficult to use for replay-based evaluation. One way of solving the problem of long storage trace duration is the use of disk simulators. While publicly available disk simulators can greatly accelerate experiments, they have not kept up with technological innovations in the field. The variety, complexity, and opaque nature of storage hardware make it very difficult to implement accurate simulators. The alternative, replaying the whole traces on real hardware, suffers from either long run-time or required manual reduction of experimental time, potentially at the cost of reduced accuracy. On the other hand, burstiness, auto-correlation, and complex spatio-temporal properties of storage workloads make the known methods of sampling workload traces less effective. In this paper, we present a methodology called DiskAccel to efficiently select key intervals of a trace as representatives and to replay them to estimate the response time of the whole workload. Our methodology extracts a variety of spatial and temporal features from each interval and uses efficient data mining techniques to select the representative intervals. To verify the proposed methodology, we have implemented a tool capable of running whole traces or selective intervals on real hardware, warming up hardware state in an accelerated manner, and emulating request causality while minimizing request inter-arrival time error. Based on our experiments, DiskAccel manages to speed up disk replay by more than two orders of magnitude, while keeping the average estimation error at 7.6%.
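The general idea of representative sampling described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the trace format `(timestamp, lba, size)`, the three toy features, plain k-means as the "data mining technique", and weighting each representative by its cluster's share of intervals. All function names are hypothetical.

```python
from collections import defaultdict

def split_intervals(trace, length):
    """Group (timestamp, lba, size) requests into fixed-length time intervals."""
    buckets = defaultdict(list)
    for ts, lba, size in trace:
        buckets[int(ts // length)].append((ts, lba, size))
    return [buckets[k] for k in sorted(buckets)]

def features(interval):
    """Toy feature vector (assumed): request count, mean size, mean LBA jump."""
    n = len(interval)
    mean_size = sum(r[2] for r in interval) / n
    jumps = [abs(b[1] - a[1]) for a, b in zip(interval, interval[1:])]
    mean_jump = sum(jumps) / len(jumps) if jumps else 0.0
    return [float(n), mean_size, mean_jump]

def normalize(vectors):
    """Scale each feature to [0, 1] so no single feature dominates distances."""
    cols = list(zip(*vectors))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - l) / s for v, l, s in zip(vec, lo, span)] for vec in vectors]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: dist2(p, centers[c]))
            for p in points]

def kmeans(points, k, iters=20):
    """Lloyd's k-means with a deterministic init (evenly spaced points)."""
    step = max(1, len(points) // k)
    centers = [list(points[i * step]) for i in range(k)]
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centers), centers

def pick_representatives(points, labels, centers):
    """Per non-empty cluster: (index of interval nearest the centroid, weight)."""
    reps = []
    for c in range(len(centers)):
        idx = [i for i, l in enumerate(labels) if l == c]
        if idx:
            best = min(idx, key=lambda i: dist2(points[i], centers[c]))
            reps.append((best, len(idx) / len(points)))
    return reps

def estimate(reps, measured):
    """Weighted mean of response times measured on the representatives only."""
    return sum(w * measured[i] for i, w in reps)
```

In use, only the intervals returned by `pick_representatives` would be replayed on real hardware, and their measured response times passed to `estimate` as a mapping from interval index to response time. Weighting by request count rather than interval count may be more appropriate for bursty workloads; the sketch uses interval count only for simplicity.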

References (51)
Carlos Maltzahn, Kathy J. Richardson, Dirk Grunwald. Reducing the disk I/O of web proxy server caches. USENIX Annual Technical Conference, pp. 17-17 (1999).
Gregory R. Ganger, John S. Bucy, Steven W. Schlosser, Jiri Schindler. The DiskSim Simulation Environment Version 4.0 Reference Manual (CMU-PDL-08-101) (2008).
Sameh Elnikety, Dushyanth Narayanan, Austin Donnelly, Antony Rowstron, Eno Thereska. Everest: scaling down peak loads through I/O off-loading. Operating Systems Design and Implementation, pp. 15-28 (2008). DOI: 10.5555/1855741.1855743
Jianyong Zhang, A. Sivasubramaniam, H. Franke, N. Gautam, Yanyong Zhang, S. Nagar. Synthesizing Representative I/O Workloads for TPC-H. High-Performance Computer Architecture, vol. 10, pp. 142-151 (2004). DOI: 10.1109/HPCA.2004.10019
Ram Swaminathan, Mustafa Uysal, Eric Anderson, Mahesh Kallahalla. Buttress: A Toolkit for Flexible and High Fidelity I/O Benchmarking. File and Storage Technologies, pp. 45-58 (2004).
Bruce Jacob, Spencer Ng, David Wang. Memory Systems: Cache, DRAM, Disk (2007).
Gregory R. Ganger, Raja R. Sambasivan, David O'Hallaron, James Hendricks, Matthew Wachs, Julio Lopez, Michael P. Mesnier. //TRACE: parallel trace replay with approximate causal events. File and Storage Technologies, pp. 24-24 (2007).
Adam Wierman, Bianca Schroeder, Mor Harchol-Balter. Open versus closed: a cautionary tale. Networked Systems Design and Implementation, pp. 18-18 (2006). DOI: 10.1184/R1/6608078.V1
Swaroop Kavalanekar, Dushyanth Narayanan, Sriram Sankar, Eno Thereska, Kushagra Vaid, Bruce Worthington. Measuring Database Performance in Online Services: A Trace-Based Approach. Lecture Notes in Computer Science, vol. 5895, pp. 132-145 (2009). DOI: 10.1007/978-3-642-10424-4_10