Statistical Estimation for Single Linkage Hierarchical Clustering

Author: Dant Wang

Abstract: Clustering is a key component of most detectors of cyber-attacks, and increasingly, for both theoretical and practical reasons, methods that produce hierarchical clusterings (dendrograms) are being deployed in this context. In particular, Single Linkage Hierarchical Clustering (SLHC) is attracting considerable interest. Existing clustering algorithms take no account of uncertainties in the data. In this paper, we derive a statistical model for the estimation of dendrograms, taking into account uncertainty (through noise or corruption) in the distances among data points. We focus on just the hierarchy of partitions afforded by a dendrogram, rather than the heights in the latter. The concept of estimating the "dendrogram structure" under SLHC is introduced, and an approximate maximum likelihood estimator (MLE) for the dendrogram structure is described. The proposed method is demonstrated in a simple Monte Carlo simulation, which demonstrates that the MLE method performs better in obtaining the correct structure.

I. INTRODUCTION

Fast detection is a critical aspect of cyber-security, and many implemented and potential methods are available for this purpose. The underpinning technology of attack detection is clustering algorithms; these play an important role where learning and analysis have to be performed in an unsupervised fashion. Specifically,
