Authors: Nasser S. Alamri, William H. Allen
DOI: 10.1109/SECON.2015.7132993
Keywords: Data mining, Digital forensics, Artificial intelligence, Feature extraction, Machine learning, Work (electrical), Computer science, File format, Identification (information)
Abstract: Research in file-type identification has employed a number of different approaches to classify unknown files according to their actual file type. However, due to the lack of implementation details in much of the published research and the use of private datasets for many of those projects, it is often not possible to compare new techniques with prior work. In this paper, we present a comparison of five common approaches, along with the parameters used to perform the comparisons. All approaches were evaluated on the same dataset, which was drawn from public or widely-available sources. Our results show that each approach can produce good results, with classification rates of 88% to 97%, but achieving these results requires “tuning” the inputs to the classifiers.