Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization

Authors: Komal Chugh, Parul Gupta, Abhinav Dhall, Ramanathan Subramanian

Keywords: Modalities, Artificial neural network, Cognitive dissonance, Speech recognition, Computer science, Audio visual, Similarity (psychology), Modality (human–computer interaction), Discriminative model

Abstract: … (acquired via EEG and eye-gaze sensing) to assist with the detection of visual inconsistencies in addition to content analysis adopted in this work; eye-gaze and EEG have been found …

References (23)

K. L. Bhanu Moorthy, Moneish Kumar, Ramanathan Subramanian, Vineet Gandhi. GAZED: Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings. Human Factors in Computing Systems, pp. 1-11 (2020). doi:10.1145/3313831.3376544

Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha. Emotions Don't Lie: An Audio-Visual Deepfake Detection Method using Affective Cues. ACM Multimedia, pp. 2823-2832 (2020). doi:10.1145/3394171.3413570

Vineet Gandhi, Moneish Kumar, K. L. Bhanu Moorthy, Ramanathan Subramanian. GAZED: Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings. arXiv: Computer Vision and Pattern Recognition (2020). doi:10.1145/3313831.3376544

Maneesh Bilalpur, Mohan Kankanhalli, Stefan Winkler, Ramanathan Subramanian. EEG-based Evaluation of Cognitive Workload Induced by Acoustic Parameters for Data Sonification. International Conference on Multimodal Interfaces, pp. 315-323 (2018). doi:10.1145/3242969.3243016

Conrad Sanderson, Brian C. Lovell. Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference. International Conference on Biometrics, vol. 5558, pp. 199-208 (2009). doi:10.1007/978-3-642-01793-3_21

Nelson Morgan, Hervé Bourlard, Hynek Hermansky. Automatic Speech Recognition: an Auditory Perspective. Speech Processing in the Auditory System, pp. 309-338 (2004). doi:10.1007/0-387-21575-1_6

Tom Grimes. Mild auditory-visual dissonance in television news may exceed viewer attentional capacity. Human Communication Research, vol. 18, pp. 268-298 (1991). doi:10.1111/J.1468-2958.1991.TB00546.X

Jorge Martinez, Hector Perez, Enrique Escamilla, Masahisa Mabo Suzuki. Speaker recognition using Mel frequency Cepstral Coefficients (MFCC) and Vector quantization (VQ) techniques. International Conference on Electronics, Communications, and Computers, pp. 248-251 (2012). doi:10.1109/CONIELECOMP.2012.6189918

Subramanian Ramanathan, Harish Katti, Raymond Huang, Tat-Seng Chua, Mohan Kankanhalli. Automated localization of affective objects and actions in images via caption text-cum-eye gaze analysis. ACM Multimedia, pp. 729-732 (2009). doi:10.1145/1631272.1631399

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei. ImageNet: A large-scale hierarchical image database. Computer Vision and Pattern Recognition, pp. 248-255 (2009). doi:10.1109/CVPR.2009.5206848
