A Machine Learning Based Approach for Deepfake Detection in Social Media Through Key Video Frame Extraction

Authors: Elias Kougianos, Peter Corcoran, Saraju P. Mohanty, Alakananda Mitra

DOI: 10.1007/S42979-021-00495-X

Keywords: Social media, Computer vision, Autoencoder, Computer science, Feature vector, Key (cryptography), Convolutional neural network, Deep learning, Frame (networking), Visual artifact, Artificial intelligence

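Abstract: In the last few years, with the advent of deepfake videos, image forgery has become a serious threat. In a deepfake video, a person's face, emotion or speech are replaced by someone else's face, different emotion or speech, using deep learning technology. These videos are often so sophisticated that traces of manipulation are difficult to detect. They can have a heavy social, political and emotional impact on individuals, as well as on society. Social media are the most common targets, as they are vulnerable platforms, susceptible to blackmailing or defaming a person. There are some existing works for detecting deepfake videos, but very few attempts have been made for videos in social media. The first step to preempt such misleading videos from social media is to detect them. Our paper presents a novel neural network-based method to detect fake videos. We applied a key video frame extraction technique to reduce the computation in detecting deepfake videos. A model, consisting of a convolutional neural network (CNN) and a classifier network, is proposed along with the algorithm. The Xception net has been chosen over two other structures (InceptionV3 and Resnet50) for pairing with our classifier. Our model is a visual artifact-based detection technique. The feature vectors from the CNN module are used as the input of the subsequent classifier network for classifying the video. We used the FaceForensics++ and Deepfake Detection Challenge datasets to reach the best model. Our model detects highly compressed deepfake videos in social media with very high accuracy and lowered computational requirements. We achieved 98.5% accuracy with the FaceForensics++ dataset and 92.33% accuracy with a combined dataset of FaceForensics++ and Deepfake Detection Challenge. Any autoencoder-generated video can be detected by our model. Our method has detected almost all fake videos if they possess more than one key video frame. The accuracy reported here is for the case when the number of key video frames is one. The simplicity of the method will help people check the authenticity of a video. Our work is focused on, but not limited to, addressing the social and economical issues due to fake videos in social media. In this paper, we achieve high accuracy without training the model with an enormous amount of data. The key video frame extraction method reduces the computations significantly, as compared to existing works.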
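The pipeline outlined in the abstract (keep only key video frames, score each with a CNN plus classifier, then label the video) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how key frames are extracted, so a simple frame-difference threshold stands in for that step, and the hypothetical `score_frame` callable stands in for the Xception feature extractor and classifier network. The names `select_key_frames` and `classify_video`, the flattened-list frame representation, and both threshold values are likewise hypothetical.

```python
from typing import Callable, List, Sequence

# Hypothetical representation: a frame is a flat sequence of grayscale values.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames: List[Frame], threshold: float = 10.0) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    Assumption: a difference threshold stands in for the paper's key video
    frame extraction, which the abstract does not specify.
    """
    if not frames:
        return []
    keys = [0]  # the first frame is always kept
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[keys[-1]]) > threshold:
            keys.append(i)
    return keys


def classify_video(
    frames: List[Frame],
    score_frame: Callable[[Frame], float],
    threshold: float = 10.0,
    cutoff: float = 0.5,
) -> str:
    """Label a video from the mean fake-probability of its key frames.

    score_frame is a placeholder for the CNN + classifier stage; it must
    return a value in [0, 1], where higher means more likely fake.
    """
    keys = select_key_frames(frames, threshold)
    if not keys:
        raise ValueError("video has no frames")
    # Only key frames are scored, which is where the computation is saved.
    mean_score = sum(score_frame(frames[i]) for i in keys) / len(keys)
    return "fake" if mean_score > cutoff else "real"
```

Because only the selected frames reach `score_frame`, the cost of the expensive stage scales with the number of key frames rather than with the full frame count, which is the kind of saving the abstract attributes to key video frame extraction.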

References (49)
Karen Simonyan, Andrew Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. Computer Vision and Pattern Recognition (2014).
A. Gironi, M. Fontani, T. Bianchi, A. Piva, M. Barni. A Video Forensic Technique for Detecting Frame Deletion and Insertion. International Conference on Acoustics, Speech, and Signal Processing, pp. 6226-6230 (2014). DOI: 10.1109/ICASSP.2014.6854801
Justus Thies, Michael Zollhöfer, Matthias Nießner, Levi Valgaerts, Marc Stamminger, Christian Theobalt. Real-time expression transfer for facial reenactment. International Conference on Computer Graphics and Interactive Techniques, vol. 34, pp. 183- (2015). DOI: 10.1145/2816795.2818056
Thomas Gloe, Rainer Böhme. The 'Dresden Image Database' for benchmarking digital image forensics. Proceedings of the 2010 ACM Symposium on Applied Computing - SAC '10, pp. 1584-1590 (2010). DOI: 10.1145/1774088.1774427
Hui-Tzu Grace Chou, Nicholas Edge. “They Are Happier and Having Better Lives than I Am”: The Impact of Using Facebook on Perceptions of Others' Lives. Cyberpsychology, Behavior, and Social Networking, vol. 15, pp. 117-121 (2012). DOI: 10.1089/CYBER.2011.0324
Pablo Garrido, Levi Valgaerts, Ole Rehmsen, Thorsten Thormaehlen, Patrick Perez, Christian Theobalt. Automatic Face Reenactment. Computer Vision and Pattern Recognition, pp. 4217-4224 (2014). DOI: 10.1109/CVPR.2014.537
Ilya Sutskever, Geoffrey Hinton, Alex Krizhevsky, Ruslan Salakhutdinov, Nitish Srivastava. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, vol. 15, pp. 1929-1958 (2014).
Christoph Bregler, Michele Covell, Malcolm Slaney. Video Rewrite: driving visual speech with audio. International Conference on Computer Graphics and Interactive Techniques, pp. 353-360 (1997). DOI: 10.1145/258734.258880
I. Amerini, L. Ballan, R. Caldelli, A. Del Bimbo, G. Serra. A SIFT-Based Forensic Method for Copy–Move Attack Detection and Transformation Recovery. IEEE Transactions on Information Forensics and Security, vol. 6, pp. 1099-1110 (2011). DOI: 10.1109/TIFS.2011.2129512
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, Zbigniew Wojna. Rethinking the Inception Architecture for Computer Vision. Computer Vision and Pattern Recognition, pp. 2818-2826 (2016). DOI: 10.1109/CVPR.2016.308