Authors: Anaël Beaugnon, Pierre Chifflier, Francis Bach
DOI: 10.1007/978-3-319-66332-6_6
Keywords: Open source, Annotation, Intrusion detection system, Artificial intelligence, Labelling, Workload, NetFlow, Machine learning, Active learning (machine learning), Scalability, Computer science
Abstract: Acquiring a representative labelled dataset is a hurdle that has to be overcome to learn a supervised detection model. Labelling a dataset is particularly expensive in computer security, as expert knowledge is required to perform the annotations. In this paper, we introduce ILAB, a novel interactive labelling strategy that helps experts label large datasets for intrusion detection with a reduced workload. First, we compare ILAB with two state-of-the-art labelling strategies on public labelled datasets and demonstrate that it is both an effective and a scalable solution. Second, we show that ILAB is workable with a real-world annotation project carried out on a large unlabelled NetFlow dataset originating from a production environment. We provide an open source implementation (https://github.com/ANSSI-FR/SecuML/) to allow security experts to label their own datasets and researchers to compare labelling strategies.