Network Anomaly Detection Based on Semi-supervised Clustering

Authors: Tian Shengfeng, Wei Xiaotao, Huang Houkuan

Keywords: Cluster analysis, Intrusion detection system, Detection rate, Computer science, Pattern recognition, False positive rate, Semi-supervised clustering, Process (computing), Grid, Artificial intelligence, Anomaly detection

Abstract: A semi-supervised clustering algorithm based on the traditional k-means algorithm is proposed for network anomaly detection. We improve the original k-means mainly in three aspects. First, the number of clusters is decided automatically by merging and splitting clusters. Second, a small portion of labeled samples is employed to supervise the clustering process. Third, we modify the algorithm to handle symbolic attribute values directly. Experimental results on the KDD 99 intrusion detection datasets show that our algorithm achieves a high detection rate while maintaining a low false positive rate. Key-Words: Network anomaly detection, Semi-supervised clustering, Grid-based K-means
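
The abstract gives no implementation details, so the following is only a minimal sketch of two of the ideas it names: using a few labeled samples to guide k-means, and handling symbolic attribute values directly. It assumes seeding (labeled samples initialize one center per class) and a simple mixed distance (squared difference on numeric fields, 0/1 mismatch on symbolic fields, with the mode as the symbolic "mean"). These choices, the function names, and the toy records are illustrative assumptions, not the authors' method, and the merging and splitting of clusters is not covered.

```python
from collections import Counter


def mixed_distance(x, center):
    """Squared difference on numeric fields plus 0/1 mismatch on symbolic fields."""
    total = 0.0
    for a, b in zip(x, center):
        if isinstance(a, str):
            total += 0.0 if a == b else 1.0
        else:
            total += (a - b) ** 2
    return total


def make_center(members):
    """Mean for numeric attributes, most frequent value for symbolic ones."""
    center = []
    for column in zip(*members):
        if isinstance(column[0], str):
            center.append(Counter(column).most_common(1)[0][0])
        else:
            center.append(sum(column) / len(column))
    return tuple(center)


def nearest(x, centers):
    """Index of the center closest to x under the mixed distance."""
    return min(range(len(centers)), key=lambda i: mixed_distance(x, centers[i]))


def seeded_kmeans(samples, seeds, iterations=10):
    """Run k-means with one initial center per label, built from labeled seeds.

    seeds maps a label to a list of labeled samples (an assumed input format).
    """
    labels = sorted(seeds)
    centers = [make_center(seeds[label]) for label in labels]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in samples:
            clusters[nearest(x, centers)].append(x)
        # Keep the previous center when a cluster receives no samples.
        updated = [make_center(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return labels, centers


def classify(x, labels, centers):
    """Assign x the label of its nearest center."""
    return labels[nearest(x, centers)]


# Toy records: (normalized numeric feature, protocol). Hypothetical data.
seeds = {
    "normal": [(0.1, "tcp"), (0.2, "tcp")],
    "attack": [(0.9, "icmp"), (0.8, "icmp")],
}
unlabeled = [(0.15, "tcp"), (0.05, "tcp"), (0.85, "icmp"), (0.95, "icmp")]

labels, centers = seeded_kmeans(unlabeled, seeds)
print(classify((0.12, "tcp"), labels, centers))
print(classify((0.9, "icmp"), labels, centers))
```

Because the symbolic field contributes at most 1 to the distance, numeric features would need to be scaled to a comparable range for the two kinds of attributes to carry similar weight; the toy data above assumes values already normalized to [0, 1].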
