Detection of face-swapped fake videos using human gaze information
Authors
- Maksim D. Krasilnikov
- Mikhail Yu. Nikitin
- Anton S. Konushin
Keywords:
fake videos
fake video detection
face swapping
gaze estimation
cooperative authenticity verification
computer vision
machine learning
Abstract
Detecting fake face videos is becoming more pressing as face synthesis and face swapping techniques improve rapidly, making traditional forgery cues less robust when transferred to new generation methods and recording conditions. Common approaches that analyze individual frames usually rely on local visual artifacts, which may disappear after compression, re-encoding, and post-processing, and they require a close match between the distributions of training and deployment data. Such methods exploit neither information about human behavior over time nor screen interaction scenarios. In this paper, we propose a cooperative approach in which the forgery cue is the set of calibration parameters of a gaze estimation model, computed from a video track recorded while a subject sequentially fixates on points displayed on the screen. A simple classifier is trained on these parameters, and their fusion with the outputs of standard fake video detectors is also considered. The calibration parameters alone prove sufficient for reliable separation of authentic and synthesized tracks and, in several settings, outperform detectors trained only on visual features, while fusion with the outputs of such detectors provides an additional gain in accuracy. Feature importance analysis indicates a dominant contribution of the parameters associated with the vertical component of gaze direction, which is consistent with the assumption that current generators are more vulnerable when synthesizing realistic eye appearance at extreme upward and downward gaze angles.
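The idea of using calibration parameters as a forgery cue can be illustrated with a minimal sketch. The abstract does not specify the form of the calibration model or the classifier, so everything below is an assumption made for illustration: the calibration is taken to be an affine map from raw (yaw, pitch) gaze estimates to the known directions of the on-screen points, the classifier is a plain logistic regression, and the tracks are toy data in which synthesized tracks have a compressed vertical gaze range. All function names are hypothetical.

```python
import numpy as np


def fit_calibration(pred, target):
    """Least-squares affine calibration (an assumed model form).

    pred, target: arrays of shape (n_points, 2) with (yaw, pitch) in radians.
    Returns 6 parameters: [yaw<-yaw, yaw<-pitch, yaw offset,
                           pitch<-yaw, pitch<-pitch, pitch offset].
    """
    X = np.hstack([pred, np.ones((len(pred), 1))])    # append bias column
    W, *_ = np.linalg.lstsq(X, target, rcond=None)    # W has shape (3, 2)
    return W.T.ravel()


def make_features(rng, targets, pitch_gain_mean, n_tracks):
    """Toy generator: each track yields one calibration-parameter vector."""
    feats = []
    for _ in range(n_tracks):
        # Hypothetical distortion: only the vertical response is scaled.
        gains = np.array([1.0, rng.normal(pitch_gain_mean, 0.05)])
        pred = targets * gains + rng.normal(0.0, 0.01, targets.shape)
        feats.append(fit_calibration(pred, targets))
    return np.array(feats)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def train_logreg(X, y, lr=0.5, steps=2000):
    """Logistic regression trained by batch gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        err = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b


def demo(seed=0):
    rng = np.random.default_rng(seed)
    axis = np.linspace(-0.3, 0.3, 3)                  # assumed 3x3 target grid
    targets = np.array([(yaw, pitch) for pitch in axis for yaw in axis])
    real = make_features(rng, targets, 1.0, 150)      # label 0: authentic
    fake = make_features(rng, targets, 0.6, 150)      # label 1: synthesized
    X = np.vstack([real, fake])
    y = np.concatenate([np.zeros(150), np.ones(150)])
    order = rng.permutation(len(y))
    X, y = X[order], y[order]
    X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]
    # Standardize with training statistics only.
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0) + 1e-8
    w, b = train_logreg((X_train - mu) / sigma, y_train)
    labels = sigmoid(((X_test - mu) / sigma) @ w + b) > 0.5
    return float(np.mean(labels == y_test))


if __name__ == "__main__":
    print("toy test accuracy:", demo())
```

The toy generator only mimics the hypothesis stated in the abstract about the vertical gaze component. The accuracy it prints says nothing about the results reported in the paper. Fusion with a visual detector could be sketched in the same way by appending that detector's score to the feature vector before training.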
Section
Methods and algorithms of computational mathematics and their applications