Flying Memcache: Lessons Learned from Different Acceleration Strategies

Authors: Dimitris Deyannis, Lazaros Koromilas, Giorgos Vasiliadis, Elias Athanasopoulos, Sotiris Ioannidis

DOI: 10.1109/SBAC-PAD.2014.17

Keywords: Cache, Acceleration, Operating system, Computer science, Commodity, Implementation

Abstract: Distributed key-value and always-in-memory stores are employed by large and demanding services, such as Facebook and Amazon. It is apparent that generic implementations of such caches cannot meet the needs of every application; therefore, further research on optimizing or speeding up cache operations is required. In this paper, we present an incremental optimization strategy for accelerating the most popular key-value store, namely memcached. First, we accelerate the computational unit by utilizing commodity GPUs, which offer a significant performance increase on the CPU-bound part of memcached, but only a moderate increase under intensive I/O. We then proceed to improve I/O performance by replacing TCP with a fast UDP implementation in user-space. Putting it all together, with GPUs instead of CPUs for computation and UDP instead of TCP for communication, we are able to experimentally achieve 20 Gbps line-rate, which significantly outperforms the original implementation of memcached.

References (21)
Michalis Polychronakis, Evangelos P. Markatos, Sotiris Ioannidis, Giorgos Vasiliadis, Spiros Antonatos. Gnort: High Performance Network Intrusion Detection Using Graphics Processors. Recent Advances in Intrusion Detection, pp. 116-134 (2008). DOI: 10.1007/978-3-540-87403-4_7
Patrick Stuedi, Bernard Metzler, Animesh Trivedi. Wimpy nodes with 10GbE: leveraging one-sided operations in soft-RDMA to boost memcached. USENIX Annual Technical Conference, pp. 31-31 (2012).
Peter Vajgel, Jason Sobel, Sanjeev Kumar, Doug Beaver, Harry C. Li. Finding a needle in Haystack: Facebook's photo storage. Operating Systems Design and Implementation, pp. 47-60 (2010). DOI: 10.5555/1924943.1924947
Zsolt István, Kimon Karras, Michaela Blott, Ling Liu, Kees A. Vissers, Jeremia Bär. Achieving 10Gbps Line-rate Key-value Stores with FPGAs. USENIX Conference on Hot Topics in Cloud Computing (2013).
Muhammad Jamshed, KyoungSoo Park, Shinae Woo, Dongsu Han, Sunghwan Ihm, Haewon Jeong, Eun Young Jeong. mTCP: a highly scalable user-level TCP stack for multicore systems. Networked Systems Design and Implementation, pp. 489-502 (2014). DOI: 10.5555/2616448.2616493
Giorgos Vasiliadis, Michalis Polychronakis, Spiros Antonatos, Evangelos P. Markatos, Sotiris Ioannidis. Regular Expression Matching on Graphics Hardware for Intrusion Detection. Recent Advances in Intrusion Detection, pp. 265-283 (2009). DOI: 10.1007/978-3-642-04342-0_14
Brad Fitzpatrick. Distributed caching with memcached. Linux Journal, vol. 2004, pp. 5- (2004).
Hans Fugal, Rajesh Nishtala, Mike Paleczny, Daniel Peek, Tony Tung, Harry C. Li, Marc Kwiatkowski, Paul Saab, Herman Lee, Ryan McElroy, David Stafford, Steven Grimm, Venkateshwaran Venkataramani. Scaling Memcache at Facebook. Networked Systems Design and Implementation, pp. 385-398 (2013).
Robert Ricci, Weibin Sun. Fast and flexible: parallel packet processing with GPUs and Click. Architectures for Networking and Communications Systems, pp. 25-36 (2013). DOI: 10.5555/2537857.2537861
Tayler H. Hetherington, Timothy G. Rogers, Lisa Hsu, Mike O'Connor, Tor M. Aamodt. Characterizing and evaluating a key-value store application on heterogeneous CPU-GPU systems. International Symposium on Performance Analysis of Systems and Software, pp. 88-98 (2012). DOI: 10.1109/ISPASS.2012.6189209