Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization

Authors: Komal Chugh, Parul Gupta, Abhinav Dhall, Ramanathan Subramanian

DOI: 10.1145/3394171.3413700

Keywords: Modalities, Artificial neural network, Cognitive dissonance, Speech recognition, Computer science, Audio visual, Similarity (psychology), Modality (human–computer interaction), Discriminative model

Abstract: … (acquired via EEG and eye-gaze sensing) to assist with the detection of visual inconsistencies in addition to content analysis adopted in this work; eye-gaze and EEG have been found …

References (23)
K. L. Bhanu Moorthy, Moneish Kumar, Ramanathan Subramanian, Vineet Gandhi. GAZED Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings. Human Factors in Computing Systems, pp. 1-11 (2020). DOI: 10.1145/3313831.3376544
Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha. Emotions Don't Lie: An Audio-Visual Deepfake Detection Method using Affective Cues. ACM Multimedia, pp. 2823-2832 (2020). DOI: 10.1145/3394171.3413570
Vineet Gandhi, Moneish Kumar, K L Bhanu Moorthy, Ramanathan Subramanian. GAZED- Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings. arXiv: Computer Vision and Pattern Recognition (2020). DOI: 10.1145/3313831.3376544
Maneesh Bilalpur, Mohan Kankanhalli, Stefan Winkler, Ramanathan Subramanian. EEG-based Evaluation of Cognitive Workload Induced by Acoustic Parameters for Data Sonification. International Conference on Multimodal Interfaces, pp. 315-323 (2018). DOI: 10.1145/3242969.3243016
Conrad Sanderson, Brian C. Lovell. Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference. International Conference on Biometrics, vol. 5558, pp. 199-208 (2009). DOI: 10.1007/978-3-642-01793-3_21
Nelson Morgan, Hervé Bourlard, Hynek Hermansky. Automatic Speech Recognition: an Auditory Perspective. Speech Processing in the Auditory System, pp. 309-338 (2004). DOI: 10.1007/0-387-21575-1_6
Jorge Martinez, Hector Perez, Enrique Escamilla, Masahisa Mabo Suzuki. Speaker recognition using Mel frequency Cepstral Coefficients (MFCC) and Vector quantization (VQ) techniques. International Conference on Electronics, Communications, and Computers, pp. 248-251 (2012). DOI: 10.1109/CONIELECOMP.2012.6189918
Subramanian Ramanathan, Harish Katti, Raymond Huang, Tat-Seng Chua, Mohan Kankanhalli. Automated localization of affective objects and actions in images via caption text-cum-eye gaze analysis. ACM Multimedia, pp. 729-732 (2009). DOI: 10.1145/1631272.1631399
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei. ImageNet: A large-scale hierarchical image database. Computer Vision and Pattern Recognition, pp. 248-255 (2009). DOI: 10.1109/CVPR.2009.5206848