Fast and Accurate Graph Stream Summarization

作者: Xiangyang Gou , Lei Zou , Chenxingyu Zhao , Tong Yang

DOI: 10.1109/ICDE.2019.00103

关键词:

摘要: A graph stream is a continuous sequence of data items, in which each item indicates an edge, including its two endpoints and edge weight. It forms dynamic that changes with every item. Graph streams play important roles cyber security, social networks, cloud troubleshooting systems more. Due to the vast volume high update speed streams, traditional structures for storage such as adjacency matrix list are no longer sufficient. However, prior art summarization, like CM sketches, gSketches, TCM gMatrix, either supports limited kinds queries or suffers from poor accuracy query results. In this paper, we propose novel Stream Sketch (GSS short) summarize has linear space cost O(|E|) (E set graph) constant time (O(1)) most over controllable errors. Both theoretical analysis experiment results confirm superiority our solution regard time/space complexity results’ precision compared state-of-the-art.

参考文章(26)
Joseph E Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, Carlos Guestrin, None, PowerGraph: distributed graph-parallel computation on natural graphs operating systems design and implementation. pp. 17- 30 ,(2012) , 10.5555/2387880.2387883
Reynold S. Xin, Ion Stoica, Joseph E. Gonzalez, Daniel Crankshaw, Ankur Dave, Michael J. Franklin, GraphX: graph processing in a distributed dataflow framework operating systems design and implementation. pp. 599- 613 ,(2014) , 10.5555/2685048.2685096
Vladimir Braverman, Rafail Ostrovsky, Dan Vilenchik, How Hard Is Counting Triangles in the Streaming Model? Automata, Languages, and Programming. pp. 244- 254 ,(2013) , 10.1007/978-3-642-39206-1_21
Graham Cormode, S. Muthukrishnan, An improved data stream summary: The count-min sketch and its applications latin american symposium on theoretical informatics. pp. 29- 38 ,(2004) , 10.1007/978-3-540-24698-5_7
Peixiang Zhao, Charu C. Aggarwal, Min Wang, gSketch Proceedings of the VLDB Endowment. ,vol. 5, pp. 193- 204 ,(2011) , 10.14778/2078331.2078335
Daniel A. Spielman, Nikhil Srivastava, Graph Sparsification by Effective Resistances SIAM Journal on Computing. ,vol. 40, pp. 1913- 1926 ,(2011) , 10.1137/080734029
Sudipto Guha, Andrew McGregor, Graph synopses, sketches, and streams Proceedings of the VLDB Endowment. ,vol. 5, pp. 2030- 2031 ,(2012) , 10.14778/2367502.2367570
Cristian Estan, George Varghese, New directions in traffic measurement and accounting ACM Transactions on Computer Systems. ,vol. 21, pp. 270- 313 ,(2003) , 10.1145/859716.859719
Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, Joseph M. Hellerstein, Distributed GraphLab Proceedings of the VLDB Endowment. ,vol. 5, pp. 716- 727 ,(2012) , 10.14778/2212351.2212354