The file fragment classification problem: a combined neural network and linear programming discriminant model approach

Author: Erich Feodor Wilgenbus

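Abstract: The increased use of digital media to store legal, as well as illegal data, has created the need for specialized tools that can monitor, control and even recover this data. An important task in computer forensics and security is to identify the true file type to which a file or file fragment belongs. File type identification is traditionally done by means of metadata, such as file extensions and file header and footer signatures. As a result, traditional metadata-based file object type identification techniques work well in cases where the required metadata is available and unaltered. However, traditional approaches are not reliable when the integrity of the metadata is not guaranteed or the metadata is unavailable. As an alternative, any pattern in the content of a file object can be used to determine the associated file type. This is called content-based file object type identification. Supervised learning techniques can be used to infer a file object type classifier by exploiting some unique pattern that underlies a file type's common file structure. This study builds on existing literature regarding the use of supervised learning techniques for content-based file object type identification, and explores the combined use of multilayer perceptron neural network classifiers and linear programming-based discriminant classifiers as a solution to the multiple class file fragment type identification problem. The purpose of this study was to investigate and compare the use of a single multilayer perceptron neural network classifier, a single linear programming-based discriminant classifier and a combined ensemble of these classifiers in the field of file type identification. The ability of each individual classifier and of the ensemble of these classifiers to accurately predict the file type to which a file fragment belongs was tested empirically. The study found that both a multilayer perceptron neural network and a linear programming-based discriminant classifier (used in a round robin) seemed to perform well in solving the multiple class file fragment type identification problem. The results of combining multilayer perceptron neural network classifiers and linear programming-based discriminant classifiers in an ensemble were not better than those of the single optimized classifiers.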
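
The round-robin idea mentioned in the abstract, where one two-class classifier is trained per pair of file types and each casts a vote, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the byte-frequency histogram is just one possible content-based feature, the nearest-centroid rule is a simple stand-in for the multilayer perceptron and linear programming-based discriminant classifiers studied in the thesis, and the names `byte_histogram`, `train_round_robin` and `predict` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def byte_histogram(fragment):
    """Relative frequency of each of the 256 byte values in a fragment."""
    counts = Counter(fragment)
    total = len(fragment) or 1  # avoid division by zero on empty input
    return [counts.get(value, 0) / total for value in range(256)]


def centroid(vectors):
    """Component-wise mean of equally sized feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_round_robin(samples):
    """Build one two-class rule per unordered pair of file types.

    `samples` maps a type label to a list of example fragments (bytes).
    Each pairwise rule here is only a pair of class centroids (stand-in
    for a trained two-class classifier).
    """
    centroids = {
        label: centroid([byte_histogram(f) for f in fragments])
        for label, fragments in samples.items()
    }
    return {
        (a, b): (centroids[a], centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    }


def predict(pairwise_rules, fragment):
    """Let every pairwise rule vote; the type with the most votes wins."""
    features = byte_histogram(fragment)
    votes = Counter()
    for (a, b), (centroid_a, centroid_b) in pairwise_rules.items():
        closer_to_a = (squared_distance(features, centroid_a)
                       <= squared_distance(features, centroid_b))
        votes[a if closer_to_a else b] += 1
    return votes.most_common(1)[0][0]


# Toy training data (made up for this sketch, not taken from the thesis).
training = {
    "text": [b"the quick brown fox " * 20, b"lorem ipsum dolor sit amet " * 20],
    "zeros": [bytes(512), bytes(400) + b"\x01" * 12],
    "binary": [bytes(range(256)) * 2, bytes(reversed(range(256))) * 2],
}
model = train_round_robin(training)
print(predict(model, b"a plain english sentence " * 10))
```

With k file types the round robin needs k(k-1)/2 pairwise rules, so the three toy types above yield three rules. Any two-class classifier could take the place of the centroid comparison without changing the voting step.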
