On Runtime and Classification Performance of the Discretize-Optimize (DISCO) Classification Approach

Authors: Johan Garcia, Topi Korhonen

DOI: 10.1145/3308897.3308965

Abstract: Using machine learning in high-speed networks for tasks such as flow classification typically requires either very resource-efficient approaches, large amounts of computational resources, or specialized hardware. Here we provide a sketch of the discretize-optimize (DISCO) approach, which can construct an extremely efficient classifier for low-dimensional problems by combining feature selection, discretization, novel bin placement, and lookup. As the feature selection and discretization parameters are crucial, appropriate combinatorial optimization is an important aspect of the approach. A performance evaluation is performed for a YouTube classification task using a cellular traffic data set. The initial results show that DISCO can move the Pareto boundary of the classification performance versus runtime trade-off by up to an order of magnitude compared to optimized random forest and decision tree classifiers.
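
To make the discretization-plus-lookup idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes two hypothetical flow features, hand-picked bin edges, and a majority-vote table; the feature names, edge values, and the helper names `bin_index`, `build_lookup`, and `classify` are all illustrative assumptions. The combinatorial optimization over feature subsets and bin placements that the abstract describes as crucial is not shown.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def bin_index(value, edges):
    """Map a value to a bin index given sorted interior bin edges."""
    return bisect_right(edges, value)


def build_lookup(samples, labels, edges_per_feature):
    """Build a table from bin-index tuples to the majority training label."""
    votes = defaultdict(Counter)
    for x, y in zip(samples, labels):
        cell = tuple(bin_index(v, e) for v, e in zip(x, edges_per_feature))
        votes[cell][y] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}


def classify(x, table, edges_per_feature, default_label):
    """Classify by discretizing each feature and doing a single table lookup."""
    cell = tuple(bin_index(v, e) for v, e in zip(x, edges_per_feature))
    return table.get(cell, default_label)


# Hypothetical features: (mean packet size in bytes, flow duration in seconds).
# The bin edges are assumed values; in a DISCO-style approach they would be
# chosen by an optimization step rather than by hand.
edges = [[500.0, 1000.0], [1.0]]
samples = [(1200, 5.0), (1300, 4.0), (200, 0.5), (300, 0.2), (700, 2.0)]
labels = ["video", "video", "other", "other", "video"]

table = build_lookup(samples, labels, edges)
pred_seen_video = classify((1100, 3.0), table, edges, default_label="other")
pred_seen_other = classify((100, 0.1), table, edges, default_label="other")
pred_unseen = classify((700, 0.5), table, edges, default_label="other")
```

At classification time the cost is one binary search per selected feature plus one dictionary lookup, which illustrates why such a classifier can be cheap for low-dimensional problems. Cells never seen in training fall back to `default_label`, a design choice assumed here for simplicity.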

References (8)
Raouf Boutaba, Mohammad A. Salahuddin, Noura Limam, Sara Ayoubi, Nashid Shahriar, Felipe Estrada-Solano, Oscar M. Caicedo. A comprehensive survey on machine learning for networking: evolution, applications and research opportunities. Journal of Internet Services and Applications, vol. 9, pp. 1-99 (2018). DOI: 10.1186/S13174-018-0087-2
Johan Garcia, Topi Korhonen, Ricky Andersson, Filip Vastlund. Towards Video Flow Classification at a Million Encrypted Flows Per Second. 2018 IEEE 32nd International Conference on Advanced Information Networking and Applications (AINA), pp. 358-365 (2018). DOI: 10.1109/AINA.2018.00061
Johan Garcia, Topi Korhonen. Efficient Distribution-Derived Features for High-Speed Encrypted Flow Classification. ACM Special Interest Group on Data Communication, pp. 21-27 (2018). DOI: 10.1145/3229543.3229548
Milan Čermák, Martin Drašar, Pavel Čeleda, Petr Velan. A survey of methods for encrypted traffic classification and analysis. Networks (2015). DOI: 10.5555/2885167.2885173
Tristan Groléat, Sandrine Vaton, Matthieu Arzel. High-speed flow-based classification on FPGA. International Journal of Network Management, vol. 24, pp. 253-271 (2014). DOI: 10.1002/NEM.1863
Salvador Garcia, J. Luengo, José Antonio Sáez, Victoria López, F. Herrera. A Survey of Discretization Techniques: Taxonomy and Empirical Analysis in Supervised Learning. IEEE Transactions on Knowledge and Data Engineering, vol. 25, pp. 734-750 (2013). DOI: 10.1109/TKDE.2012.35
Laurent Bernaille, Renata Teixeira, Ismael Akodkenou, Augustin Soule, Kave Salamatian. Traffic classification on the fly. ACM SIGCOMM Computer Communication Review, vol. 36, pp. 23-26 (2006). DOI: 10.1145/1129582.1129589
Huan Liu, Farhad Hussain, Chew Lim Tan, Manoranjan Dash. Discretization: An Enabling Technique. Data Mining and Knowledge Discovery, vol. 6, pp. 393-423 (2002). DOI: 10.1023/A:1016304305535