A software system for automated assessment of stereoscopic distortions in VR180 videos
Authors
- Sergey V. Lavrushkin
Keywords
stereoscopic distortions
stereoscopic video
VR180
color mismatch
sharpness mismatch
geometric distortions
channel mismatch
deep learning
Abstract
The paper presents a comprehensive software system for the automated assessment of stereoscopic distortions in VR180 video. The proposed approach covers the most common artifact types: color mismatch, sharpness mismatch, geometric distortions (vertical shift, rotation, and scaling), and channel mismatch. For each distortion type, a specialized algorithm was developed based on disparity maps, motion vectors with their confidence maps, and neural-network regression or classification. The proposed methods were tested on several datasets and detect each of these distortion types in VR180 video with high accuracy. The system can be integrated into standard post-processing pipelines and automatically generates detailed reports, allowing stereoscopic content creators to quickly identify and eliminate distortions before releasing their work to a wide audience.
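As a purely illustrative sketch (not the authors' implementation), the Python fragment below shows one common way to estimate the geometric-mismatch parameters named in the abstract (vertical shift, rotation, and scale) by fitting a similarity transform to sparse feature matches with RANSAC (Fischler and Bolles; see the references). The ORB features, OpenCV's estimateAffinePartial2D, and all thresholds are assumptions standing in for the paper's disparity- and confidence-map-based pipeline.

import math

import cv2
import numpy as np

def estimate_geometric_mismatch(left_bgr, right_bgr, max_features=2000):
    """Estimate vertical shift (px), relative rotation (deg), and relative
    scale between the views of a stereo pair from sparse ORB matches."""
    gray_l = cv2.cvtColor(left_bgr, cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(right_bgr, cv2.COLOR_BGR2GRAY)

    # Detect keypoints in both views and match them with cross-checking.
    orb = cv2.ORB_create(nfeatures=max_features)
    kp_l, des_l = orb.detectAndCompute(gray_l, None)
    kp_r, des_r = orb.detectAndCompute(gray_r, None)
    if des_l is None or des_r is None:
        return None  # not enough texture to match
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_l, des_r)
    if len(matches) < 10:
        return None

    pts_l = np.float32([kp_l[m.queryIdx].pt for m in matches])
    pts_r = np.float32([kp_r[m.trainIdx].pt for m in matches])

    # Fit a similarity transform (uniform scale + rotation + translation)
    # with RANSAC; the horizontal translation absorbs legitimate disparity,
    # so only the vertical component signals misalignment.
    M, inlier_mask = cv2.estimateAffinePartial2D(
        pts_l, pts_r, method=cv2.RANSAC, ransacReprojThreshold=2.0)
    if M is None:
        return None

    # M = [[s*cos(t), -s*sin(t), tx], [s*sin(t), s*cos(t), ty]]
    a, b = M[0, 0], M[1, 0]
    return {
        "vertical_shift_px": float(M[1, 2]),
        "rotation_deg": math.degrees(math.atan2(b, a)),
        "scale": math.hypot(a, b),
        "inlier_ratio": float(inlier_mask.mean()),
    }

In a complete system, such per-frame measurements would be aggregated over time and compared against perceptibility thresholds before being written into the final report.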
Section
Methods and algorithms of computational mathematics and their applications
References
- A. Antsiferova and D. Vatolin, “The Influence of 3D Video Artifacts on Discomfort of 302 Viewers,” in Proc. 2017 Int. Conf. on 3D Immersion (IC3D), Brussels, Belgium, December 11-12, 2017 (IEEE Press, New York, 2017), pp. 1-8.
doi 10.1109/IC3D.2017.8251897
- G. I. Rozhkova and N. N. Vasilyeva, “Comparative Perceptual Difficulties Associated with Viewing Films in 2D and 3D Format,” World of Technique of Cinema 4 (2), 12-18 (2010) [in Russian].
- G. I. Rozhkova and S. V. Alekseenko, “Visual Discomfort in Conditions of Stereoscopic Image Perception as a Consequence of Unusual Distribution of Loads on Different Mechanisms of Visual System,” World of Technique of Cinema 5 (3), 12-21 (2011) [in Russian].
- N. N. Vasilyeva, G. I. Rozhkova, and S. N. Rozhkov, “On the Good and Harm from the Modern Technologies of Creating Cinematographic Stereo Images for the People with Different States of Visual Functions,” World of Technique of Cinema 5 (1), 7-15 (2011) [in Russian].
- S. N. Rozhkov and G. I. Rozhkova, “Distortions of Spatial Images in Stereo Movies: Illusions of Object Diminution, Enlargement and Flattening,” World of Technique of Cinema 7 (3), 13-20 (2013) [in Russian].
- D. M. Hoffman, A. R. Girshick, K. Akeley, and M. S. Banks, “Vergence-Accommodation Conflicts Hinder Visual Performance and Cause Visual Fatigue,” J. Vis. 8 (3), Article Number 33 (2008).
doi 10.1167/8.3.33
- D. Khaustova, J. Fournier, E. Wyckens, and O. Le Meur, “An Objective Method for 3D Quality Prediction Using Visual Annoyance and Acceptability Level,” in Stereoscopic Displays and Applications XXVI, Vol. 9391 (2015).
doi 10.1117/12.2076949
- A. Bokov, S. Lavrushkin, M. Erofeev, et al., “Toward Fully Automatic Channel-Mismatch Detection and Discomfort Prediction for S3D Video,” in Proc. Int. Conf. on 3D Imaging (IC3D), Liege, Belgium, December 13-14, 2016 (IEEE Press, New York, 2017), pp. 1-7.
doi 10.1109/IC3D.2016.7823462
- S. Lavrushkin, V. Lyudvichenko, and D. Vatolin, “Local Method of Color-Difference Correction between Stereoscopic-Video Views,” in Proc. 2018 3DTV Conf. on the True Vision -- Capture, Transmission and Display of 3D Video (3DTV-CON), Helsinki, Finland, June 3-5, 2018 (IEEE Press, New York, 2018), pp. 1-4.
doi 10.1109/3DTV.2018.8478453
- S. Lavrushkin and D. Vatolin, “Channel-Mismatch Detection Algorithm for Stereoscopic Video Using Convolutional Neural Network,” in Proc. 2018 3DTV Conf. on the True Vision -- Capture, Transmission and Display of 3D Video (3DTV-CON), Helsinki, Finland, June 3-5, 2018 (IEEE Press, New York, 2018), pp. 1-4.
doi 10.1109/3DTV.2018.8478542
- S. Lavrushkin, K. Kozhemyakov, and D. Vatolin, “Neural-Network-Based Detection Methods for Color, Sharpness, and Geometry Artifacts in Stereoscopic and VR180 Videos,” in Proc. 2020 Int. Conf. on 3D Immersion (IC3D), Brussels, Belgium, December 15, 2020 (IEEE Press, New York, 2020), pp. 1-8.
doi 10.1109/IC3D51119.2020.9376385
- S. Lavrushkin, I. Molodetskikh, K. Kozhemyakov, and D. Vatolin, “Stereoscopic Quality Assessment of 1,000 VR180 Videos Using 8 Metrics,” J. Electron. Imaging 2021 (2), 350-1-350-7 (2021).
doi 10.2352/ISSN.2470-1173.2021.2.SDA-350
- A. P. Pentland, “A New Sense for Depth of Field,” IEEE Trans. Pattern Anal. Mach. Intell. 9 (4), 523-531 (1987).
doi 10.1109/TPAMI.1987.4767940
- J. H. Elder and S. W. Zucker, “Local Scale Control for Edge Detection and Blur Estimation,” IEEE Trans. Pattern Anal. Mach. Intell. 20 (7), 699-716 (1998).
doi 10.1109/34.689301
- S. Zhuo and T. Sim, “Defocus Map Estimation from a Single Image,” Pattern Recognit. 44 (9), 1852-1858 (2011).
doi 10.1016/j.patcog.2011.03.009
- Y. Cao, S. Fang, and Z. Wang, “Digital Multi-Focusing from a Single Photograph Taken with an Uncalibrated Conventional Camera,” IEEE Trans. Image Process. 22 (9), 3703-3714 (2013).
doi 10.1109/TIP.2013.2270086
- A. Karaali and C. R. Jung, “Adaptive Scale Selection for Multiresolution Defocus Blur Estimation,” in Proc. 2014 IEEE Int. Conf. on Image Processing (ICIP), Paris, France, October 27-30, 2014 (IEEE Press, New York, 2014), pp. 4597-4601.
doi 10.1109/ICIP.2014.7025932
- A. Karaali and C. R. Jung, “Edge-Based Defocus Blur Estimation with Adaptive Scale Selection,” IEEE Trans. Image Process. 27 (3), 1126-1137 (2018).
doi 10.1109/TIP.2017.2771563
- A. Chakrabarti, T. Zickler, and W. T. Freeman, “Analyzing Spatially-Varying Blur,” in Proc. 2010 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, June 13-18, 2010 (IEEE Press, New York, 2010), pp. 2512-2519.
doi 10.1109/CVPR.2010.5539954
- X. Zhu, S. Cohen, S. Schiller, and P. Milanfar, “Estimating Spatially Varying Defocus Blur from a Single Image,” IEEE Trans. Image Process. 22 (12), 4879-4891 (2013).
doi 10.1109/TIP.2013.2279316
- L. D’Andrès, J. Salvador, A. Kochale, and S. Süsstrunk, “Non-Parametric Blur Map Regression for Depth of Field Extension,” IEEE Trans. Image Process. 25 (4), 1660-1673 (2016).
doi 10.1109/TIP.2016.2526907
- S. A. Golestaneh and L. J. Karam, “Spatially-Varying Blur Detection Based on Multiscale Fused and Sorted Transform Coefficients of Gradient Magnitudes,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, HI, USA, July 21-26, 2017 (IEEE Press, New York, 2017), pp. 596-605.
doi 10.1109/CVPR.2017.71
- N. D. Narvekar and L. J. Karam, “A No-Reference Image Blur Metric Based on the Cumulative Probability of Blur Detection (CPBD),” IEEE Trans. Image Process. 20 (9), 2678-2683 (2011).
doi 10.1109/TIP.2011.2131660
- J. Kumar, F. Chen, and D. Doermann, “Sharpness Estimation for Document and Scene Images,” in Proc. 21st Int. Conf. on Pattern Recognition (ICPR), Tsukuba, Japan, November 11-15, 2012, pp. 3292-3295.
- K. Zeng, Y. Wang, J. Mao, et al., “A Local Metric for Defocus Blur Detection Based on CNN Feature Learning,” IEEE Trans. Image Process. 28 (5), 2107-2115 (2019).
doi 10.1109/TIP.2018.2881830
- J. Park, Y.-W. Tai, D. Cho, and I. S. Kweon, “A Unified Approach of Multi-Scale Deep and Hand-Crafted Features for Defocus Estimation,”
https://arxiv.org/abs/1704.08992 . Cited August 28, 2025.
- J. Lee, S. Lee, S. Cho, and S. Lee, “Deep Defocus Map Estimation Using Domain Adaptation,” in Proc. IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, June 15-20, 2019 (IEEE Press, New York, 2019), pp. 12214-12222.
doi 10.1109/CVPR.2019.01250
- C. Tang, X. Liu, X. Zhu, et al., “R²MRF: Defocus Blur Detection via Recurrently Refining Multi-Scale Residual Features,” in Proc. AAAI Conf. on Artificial Intelligence, Vol. 34 (07), pp. 12063-12070 (2020).
doi 10.1609/aaai.v34i07.6884
- X. Cun and C.-M. Pun, “Defocus Blur Detection via Depth Distillation,” in Proc. European Conf. on Computer Vision (ECCV), pp. 747-763 (2020).
doi 10.1007/978-3-030-58601-0_44
https://arxiv.org/abs/2007.08113 . Cited August 30, 2025.
- A. Karaali, N. Harte, and C. R. Jung, “Deep Multi-Scale Feature Learning for Defocus Blur Estimation,” IEEE Trans. Image Process. 31, 1097-1106 (2022).
doi 10.1109/TIP.2021.3139243
- H. Li, W. Qian, J. Cao, et al., “Multi-Interactive Enhanced for Defocus Blur Estimation,” IEEE Trans. Comput. Imaging 10, 640-652 (2024).
doi 10.1109/TCI.2024.3354427
- Z. Zhao, H. Yang, P. Liu, et al., “Defocus Blur Detection via Adaptive Cross-Level Feature Fusion and Refinement,” Vis. Comput. 40 (11), 8141-8153 (2024).
doi 10.1007/s00371-023-03229-7
- S. Winkler, “Efficient Measurement of Stereoscopic 3D Video Content Issues,” in Proc. SPIE 9016, Image Quality and System Performance XI, p. 90160Q (2014).
doi 10.1117/12.2042211
- Q. Dong, T. Zhou, Z. Guo, and J. Xiao, “A Stereo Camera Distortion Detecting Method for 3DTV Video Quality Assessment,” in Proc. 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conf. (APSIPA ASC), Kaohsiung, Taiwan, October 29-November 1, 2013 (IEEE Press, New York, 2014), pp. 1-4.
doi 10.1109/APSIPA.2013.6694209
- F. Devernay, S. Pujades, and Vijay Ch.A.V, “Focus Mismatch Detection in Stereoscopic Content,” in Proc. SPIE 8288, Stereoscopic Displays and Applications XXIII, p. 82880E (2012).
doi 10.1117/12.906209
- M. Liu and K. Müller, “Automatic Analysis of Sharpness Mismatch between Stereoscopic Views for Stereo 3D Videos,” in Proc. 2014 Int. Conf. on 3D Imaging (IC3D), Liege, Belgium, December 9-10, 2014 (IEEE Press, New York, 2015), pp. 1-6.
doi 10.1109/IC3D.2014.7032572
- D. Vatolin and A. Bokov, “Sharpness Mismatch and 6 Other Stereoscopic Artifacts Measured on 10 Chinese S3D Movies,” J. Electron. Imaging 2017 (5), 137-144 (2017).
doi 10.2352/ISSN.2470-1173.2017.5.SDA-340
- D. Vatolin, A. Bokov, M. Erofeev, and V. Napadovsky, “Trends in S3D-Movie Quality Evaluated on 105 Films Using 10 Metrics,” J. Electron. Imaging 2016 (5), 1-10 (2016).
doi 10.2352/ISSN.2470-1173.2016.5.SDA-439
- C. Doutre, M. T. Pourazad, A. Tourapis, et al., “Correcting Unsynchronized Zoom in 3D Video,” in Proc. 2010 IEEE Int. Symposium on Circuits and Systems (ISCAS), Paris, France, May 30-June 2, 2010 (IEEE Press, New York, 2010), pp. 3244-3247.
doi 10.1109/ISCAS.2010.5537923
- D. G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” Int. J. Comput. Vis. 60 (2), 91-110 (2004).
doi 10.1023/B:VISI.0000029664.99615.94
- I. E. Pekkucuksen, A. U. Batur, and B. Zhang, “A Real-Time Misalignment Correction Algorithm for Stereoscopic 3D Cameras,” in Proc. SPIE 8288, Stereoscopic Displays and Applications XXIII, p. 82880J (2012).
doi 10.1117/12.906902
- A. Voronov, A. Borisov, and D. Vatolin, “System for Automatic Detection of Distorted Scenes in Stereo Video,” in Proc. Sixth Int. Workshop on Video Processing and Quality Metrics (VPQM), 2012.
- M. A. Fischler and R. C. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” Commun. ACM 24 (6), 381-395 (1981).
doi 10.1145/358669.358692
- E. Brachmann, A. Krull, S. Nowozin, et al., “DSAC -- Differentiable RANSAC for Camera Localization,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, July 21-26, 2017 (IEEE Press, New York, 2017), pp. 2492-2500.
doi 10.1109/CVPR.2017.267
- E. Brachmann and C. Rother, “Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses,” in Proc. IEEE/CVF Int. Conf. on Computer Vision (ICCV), Seoul, Korea (South), October 27-November 2, 2019 (IEEE Press, New York, 2019), pp. 4321-4330.
doi 10.1109/ICCV.2019.00442
- F. Kluger and B. Rosenhahn, “PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus,” in Proc. AAAI Conf. on Artificial Intelligence, Vol. 38 (3), pp. 2804-2812 (2024).
doi 10.1609/aaai.v38i3.28060
- J. Chen, Y. Gu, and L. Luo, “Learning to Find Good Correspondences Based on Global and Local Attention Mechanism,” in Proc. 2021 China Automation Congress (CAC), Beijing, China, October 22-24, 2021 (IEEE Press, New York, 2022), pp. 2174-2178.
doi 10.1109/CAC53003.2021.9728340
- W. Sun, W. Jiang, E. Trulls, et al., “ACNe: Attentive Context Normalization for Robust Permutation-Equivariant Learning,” in Proc. IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, June 13-19, 2020 (IEEE Press, New York, 2020), pp. 11283-11292.
doi 10.1109/CVPR42600.2020.01130
- I. Rocco, R. Arandjelović, and J. Sivic, “Convolutional Neural Network Architecture for Geometric Matching,” IEEE Trans. Pattern Anal. Mach. Intell. 41 (11), 2553-2567 (2019).
doi 10.1109/TPAMI.2018.2865351
- I. Rocco, R. Arandjelović, and J. Sivic, “End-to-End Weakly-Supervised Semantic Alignment,” in Proc. 2018 IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, June 18-23, 2018 (IEEE Press, New York, 2018), pp. 6917-6925.
doi 10.1109/CVPR.2018.00723
- J. Lee, C. Jung, C. Kim, and A. Said, “Content-Based Pseudoscopic View Detection,” J. Sign. Process. Syst. 68 (2), 261-271 (2012).
doi 10.1007/s11265-011-0608-8
- G. Palou and P. Salembier, “Monocular Depth Ordering Using T-Junctions and Convexity Occlusion Cues,” IEEE Trans. Image Process. 22 (5), 1926-1939 (2013).
doi 10.1109/TIP.2013.2240002
- H. Lee, C. Jung, and C. Kim, “Depth Map Estimation Based on Geometric Scene Categorization,” in Proc. 19th Korea-Japan Joint Workshop on Frontiers of Computer Vision (FCV), Incheon, Korea (South), January 30-February 1, 2013 (IEEE Press, New York, 2013), pp. 170-173.
doi 10.1109/FCV.2013.6485482
- D. Hoiem, A. N. Stein, A. A. Efros, and M. Hebert, “Recovering Occlusion Boundaries from a Single Image,” in Proc. 2007 IEEE 11th Int. Conf. on Computer Vision, Rio de Janeiro, Brazil, October 14-21, 2007 (IEEE Press, New York, 2007), pp. 1-8.
doi 10.1109/ICCV.2007.4408985
- H. Fu, M. Gong, C. Wang, et al., “Deep Ordinal Regression Network for Monocular Depth Estimation,”
doi 10.1109/CVPR.2018.00214
https://arxiv.org/abs/1806.02446 . Cited August 29, 2025.
- C. Godard, O. Mac Aodha, M. Firman, and G. Brostow, “Digging into Self-Supervised Monocular Depth Estimation,” in Proc. IEEE/CVF Int. Conf. on Computer Vision (ICCV), Seoul, Korea (South), October 27-November 2, 2019 (IEEE Press, New York, 2019), pp. 3827-3837.
doi 10.1109/ICCV.2019.00393
- R. Ranftl, K. Lasinger, D. Hafner, et al., “Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer,” IEEE Trans. Pattern Anal. Mach. Intell. 44 (3), 1623-1637 (2022).
doi 10.1109/TPAMI.2020.3019967
- W. Yin, J. Zhang, O. Wang, et al., “Learning to Recover 3D Scene Shape from a Single Image,” in Proc. 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, June 20-25, 2021 (IEEE Press, New York, 2021), pp. 204-213.
doi 10.1109/CVPR46437.2021.00027
- L. Yang, B. Kang, Z. Huang, et al., “Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data,” in Proc. IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, June 16-22, 2024 (IEEE Press, New York, 2024), pp. 10371-10381.
doi 10.1109/CVPR52733.2024.00987
- M. Knee, “Getting Machines to Watch 3D for You,” SMPTE Motion Imaging J. 121 (3), 52-58 (2012).
doi 10.5594/j18162
- J. Bouchard, Y. Nazzar, and J. J. Clark, “Half-Occluded Regions and Detection of Pseudoscopy,” in Proc. 2015 Int. Conf. on 3D Vision (3DV), Lyon, France, October 19-22, 2015 (IEEE Press, New York, 2015), pp. 215-223.
doi 10.1109/3DV.2015.32
- A. Shestov, A. Voronov, and D. Vatolin, “Detection of Swapped Views in Stereo Image,” in Proc. 22nd GraphiCon Int. Conf. on Computer Graphics and Vision, pp. 23-27 (2012).
- K. Simonyan, S. Grishin, D. Vatolin, and D. Popov, “Fast Video Super-Resolution via Classification,” in Proc. 2008 15th IEEE Int. Conf. on Image Processing (ICIP), San Diego, CA, USA, October 12-15, 2008 (IEEE Press, New York, 2008), pp. 349-352.
doi 10.1109/ICIP.2008.4711763
- G. Egnal and R. P. Wildes, “Detecting Binocular Half-Occlusions: Empirical Comparisons of Five Approaches,” IEEE Trans. Pattern Anal. Mach. Intell. 24 (8), 1127-1133 (2002).
doi 10.1109/TPAMI.2002.1023808
- D. Fourure, R. Emonet, E. Fromont, et al., “Residual Conv-Deconv Grid Network for Semantic Segmentation,”
https://arxiv.org/abs/1707.07958 . Cited August 29, 2025.
- K. He, X. Zhang, S. Ren, and J. Sun, “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification,” in Proc. 2015 IEEE Int. Conf. on Computer Vision (ICCV), Santiago, Chile, December 7-13, 2015 (IEEE Press, New York, 2016), pp. 1026-1034.
doi 10.1109/ICCV.2015.123
- D. Min, S. Choi, J. Lu, et al., “Fast Global Image Smoothing Based on Weighted Least Squares,” IEEE Trans. Image Process. 23 (12), 5638-5653 (2014).
doi 10.1109/TIP.2014.2366600
- X. Glorot and Y. Bengio, “Understanding the Difficulty of Training Deep Feedforward Neural Networks,” in Proc. Thirteenth Int. Conf. on Artificial Intelligence and Statistics (AISTATS), pp. 249-256 (2010).
- D. P. Kingma and J. L. Ba, “Adam: A Method for Stochastic Optimization,”
https://arxiv.org/abs/1412.6980 . Cited August 29, 2025.
- K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in Proc. 2016 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, June 27-30, 2016 (IEEE Press, New York, 2016), pp. 770-778.
doi 10.1109/CVPR.2016.90
- S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,”
https://arxiv.org/abs/1502.03167 . Cited August 29, 2025.
- D. S. Vatolin and S. V. Lavrushkin, “Investigating and Predicting the Perceptibility of Channel Mismatch in Stereoscopic Video,” Vestn. Mosk. Univ., Ser. 15: Vychisl. Mat. Kibern., No. 4, 40-46 (2016) [Moscow Univ. Comput. Math. Cybern. 40 (4), 185-191 (2016)].
doi 10.3103/S0278641916040075
- D. J. Butler, J. Wulff, G. B. Stanley, and M. J. Black, “A Naturalistic Open Source Movie for Optical Flow Evaluation,” in Lecture Notes in Computer Science (Springer, Berlin, 2012), Vol. 7577, pp. 611-625.
doi 10.1007/978-3-642-33783-3_44