DOI: https://doi.org/10.26089/NumMet.v26r323

A software system for automated assessment of stereoscopic distortions in VR180 videos

Authors

  • Sergey V. Lavrushkin

Keywords:

stereoscopic distortions
stereoscopic video
VR180
color mismatch
sharpness mismatch
geometric distortions
channel mismatch
deep learning

Abstract

The paper presents a comprehensive software system for automatically assessing stereoscopic distortions in VR180 video. The proposed approach covers the most common types of artifacts: color mismatch, sharpness mismatch, geometric distortions (vertical shift, rotation, scaling), and channel mismatch. A specialized detection algorithm was developed for each type of distortion, based on disparity maps, motion vectors and their confidence maps, and neural-network regression or classification. The proposed methods were tested on several datasets and demonstrated high accuracy in detecting each type of distortion in VR180 video. The system can be integrated into standard post-processing pipelines and automatically generates detailed reports, allowing stereoscopic content creators to quickly identify and eliminate stereoscopic distortions before releasing their products to a wide audience.
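
For a concrete flavor of what such per-frame checks can look like, below is a minimal sketch (in Python with OpenCV) that estimates two of the simpler cues mentioned above for a single side-by-side VR180 frame: a global color-mismatch score and the residual vertical disparity of matched keypoints. Everything in it is an assumption for illustration -- the side-by-side frame layout, the file name vr180_sample.mp4, and the use of sparse ORB matching in place of the system's dense disparity maps, motion-vector confidence maps, and neural-network models.

    # Illustrative sketch only. It approximates two simple stereo-mismatch
    # cues -- global color difference and median vertical keypoint disparity --
    # for one side-by-side VR180 frame. All names and conventions here are
    # assumptions; the paper's system instead relies on disparity maps,
    # motion-vector confidence maps, and neural-network models.
    import cv2
    import numpy as np

    def split_views(frame):
        # Assumes the side-by-side layout common for VR180: left view in the
        # left half of the frame, right view in the right half.
        w = frame.shape[1] // 2
        return frame[:, :w], frame[:, w:]

    def color_mismatch(left, right):
        # Mean absolute difference of per-channel means in CIELAB, a crude
        # global proxy for color mismatch between the views.
        lab_l = cv2.cvtColor(left, cv2.COLOR_BGR2LAB).astype(np.float32)
        lab_r = cv2.cvtColor(right, cv2.COLOR_BGR2LAB).astype(np.float32)
        return float(np.abs(lab_l.mean(axis=(0, 1)) - lab_r.mean(axis=(0, 1))).mean())

    def vertical_disparity(left, right):
        # Median vertical offset of cross-checked ORB matches, a crude proxy
        # for the vertical-shift component of geometric distortion.
        orb = cv2.ORB_create(nfeatures=1000)
        kp_l, des_l = orb.detectAndCompute(cv2.cvtColor(left, cv2.COLOR_BGR2GRAY), None)
        kp_r, des_r = orb.detectAndCompute(cv2.cvtColor(right, cv2.COLOR_BGR2GRAY), None)
        if des_l is None or des_r is None:
            return 0.0
        matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_l, des_r)
        if not matches:
            return 0.0
        dy = [kp_l[m.queryIdx].pt[1] - kp_r[m.trainIdx].pt[1] for m in matches]
        return float(np.median(dy))

    if __name__ == "__main__":
        cap = cv2.VideoCapture("vr180_sample.mp4")  # hypothetical input file
        ok, frame = cap.read()
        if ok:
            left, right = split_views(frame)
            print("color mismatch (CIELAB units):", color_mismatch(left, right))
            print("vertical disparity (pixels):", vertical_disparity(left, right))
        cap.release()

In a full system, per-frame scores like these would be aggregated over scenes and compared against calibrated thresholds before being written into the report; the paper replaces such hand-crafted cues with the neural-network detectors described above.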



Published

2025-09-20

Section

Methods and algorithms of computational mathematics and their applications
