pith. machine review for the scientific record.

arxiv: 2604.23953 · v1 · submitted 2026-04-27 · 💻 cs.CV · cs.AI

Recognition: unknown

Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 04:59 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords blind omnidirectional image quality assessment · blind image quality assessment · equirectangular projection · viewport generation · unified assessment · generalizability · cross-database validation

The pith

Blind omnidirectional image quality assessment reduces to standard blind 2D image quality assessment by skipping viewport generation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that quality prediction for omnidirectional images can proceed directly from their equirectangular format using techniques built for ordinary planar images. This removes the separate step of generating simulated user viewports, which had added computation and limited how well methods transferred to new content. A single model is shown to handle both omnidirectional and planar cases while delivering stronger results on held-out and cross-database tests. If correct, the work collapses a specialized subfield into the broader blind image quality assessment literature and simplifies deployment.
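Read operationally, the claim trades a multi-pass scoring loop for a single forward pass. The sketch below (Python; `biqa_model` and `extract_viewport` are hypothetical callables, not names from the paper) contrasts the two pipelines the pith describes.

```python
# Minimal sketch of the two pipelines, assuming `biqa_model` maps an
# H x W x 3 image to a scalar quality score and `extract_viewport`
# renders a rectilinear view from an ERP image (both hypothetical).
import numpy as np

def score_with_viewports(erp, biqa_model, extract_viewport, n_viewports=8):
    """Conventional two-step BOIQA: render viewports, score each, pool."""
    lons = np.linspace(-np.pi, np.pi, n_viewports, endpoint=False)
    scores = [biqa_model(extract_viewport(erp, lon0=lon, lat0=0.0)) for lon in lons]
    return float(np.mean(scores))   # mean pooling is one common choice

def score_direct(erp, biqa_model):
    """Viewport-unaware path: the ERP image is scored like any 2D image."""
    return float(biqa_model(erp))
```

On this reading, the paper's experimental finding is that `score_direct` matches or beats the viewport loop across databases, at roughly 1/`n_viewports` of the model-evaluation cost.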

Core claim

The authors experimentally demonstrate that blind omnidirectional image quality assessment can be formulated as a blind planar image quality assessment problem, making viewport generation unnecessary. They introduce a viewport-unaware model that accepts equirectangular projections directly, applies to both omnidirectional and planar images, and exhibits improved generalizability over prior competitors, as confirmed through held-out testing, cross-database validation, and gMAD competition.

What carries the argument

Viewport-unaware unified model that takes equirectangular projection input and performs quality prediction without any format transformation or viewport simulation.
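To make concrete what "without any format transformation or viewport simulation" removes, here is a minimal NumPy sketch of the gnomonic (rectilinear) viewport rendering that conventional two-step BOIQA pipelines perform per viewport. The projection formulas are the standard inverse gnomonic mapping; the FOV and resolution defaults are illustrative assumptions, not values from the paper.

```python
# Gnomonic (rectilinear) viewport rendering from an equirectangular image.
# This is the per-viewport step that the viewport-unaware model skips.
# FOV and output size defaults are illustrative assumptions.
import numpy as np

def gnomonic_viewport(erp, lon0=0.0, lat0=0.0, fov=np.pi / 2, size=224):
    H, W = erp.shape[:2]
    half = np.tan(fov / 2.0)                 # half-extent of the image plane
    u = np.linspace(-half, half, size)
    x, y = np.meshgrid(u, -u)                # camera-plane coordinates, y up
    rho = np.hypot(x, y)
    c = np.arctan(rho)                       # angular distance from the view center
    safe_rho = np.where(rho == 0.0, 1.0, rho)
    lat = np.arcsin(np.clip(np.cos(c) * np.sin(lat0)
                            + y * np.sin(c) * np.cos(lat0) / safe_rho, -1.0, 1.0))
    lon = lon0 + np.arctan2(x * np.sin(c),
                            rho * np.cos(lat0) * np.cos(c)
                            - y * np.sin(lat0) * np.sin(c))
    lat = np.where(rho == 0.0, lat0, lat)    # exact center maps to (lon0, lat0)
    # Sphere coordinates back to ERP pixel indices (nearest-neighbor sampling).
    col = np.floor((lon / (2.0 * np.pi) + 0.5) * W).astype(int) % W
    row = np.clip(np.floor((0.5 - lat / np.pi) * H).astype(int), 0, H - 1)
    return erp[row, col]
```

A function like this could serve as the `extract_viewport` in the earlier sketch; it is exactly the stage, with its sampling choices and resampling cost, that the viewport-unaware model drops.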

If this is right

  • Quality assessment pipelines for omnidirectional images lose one processing stage and its associated parameters.
  • The same trained network can be applied unchanged to both spherical and flat images.
  • Cross-dataset performance improves because the model no longer depends on viewport sampling choices.
  • Computational cost at inference time decreases by eliminating viewport rendering.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Viewing behavior diversity may matter less for quality prediction than the field previously assumed.
  • The same direct-input principle could be tested on other spherical media such as 360-degree video or light-field data.
  • If the unification holds, existing large-scale 2D image quality datasets could be used to pre-train omnidirectional models.
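
The last bullet is directly testable with standard tooling. Below is a hedged PyTorch sketch of that pre-train-then-fine-tune recipe, assuming a generic ImageNet backbone and user-supplied DataLoaders yielding (image, MOS) pairs; none of these choices come from the paper.

```python
# A sketch of the pre-training idea floated above: reuse a large planar
# IQA dataset to initialize an ERP-input quality regressor, then
# fine-tune on omnidirectional MOS. The backbone and loss are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

class ErpQualityRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
        backbone.fc = nn.Linear(backbone.fc.in_features, 1)  # scalar quality head
        self.net = backbone

    def forward(self, x):           # x: (B, 3, H, W), ERP or planar alike
        return self.net(x).squeeze(-1)

def fit(model, loader, epochs=1, lr=1e-4):
    """One training recipe; the same loop serves 2D pre-training and 360° fine-tuning."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()          # rank or PLCC-based losses are common alternatives
    for _ in range(epochs):
        for images, mos in loader:  # loader yields (image batch, MOS batch)
            opt.zero_grad()
            loss = loss_fn(model(images), mos)
            loss.backward()
            opt.step()
    return model

# Usage (assuming koniq_loader / oiqa_loader are DataLoaders you construct):
# model = fit(ErpQualityRegressor(), koniq_loader)   # pre-train on 2D IQA
# model = fit(model, oiqa_loader)                    # fine-tune on 360° MOS
```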

Load-bearing premise

That the observed equivalence between omnidirectional and planar quality assessment without viewport generation continues to hold for all possible image content, viewing patterns, and future datasets.

What would settle it

A new omnidirectional image database where human quality scores correlate significantly better with a viewport-based predictor than with any direct equirectangular predictor on the same images.
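One hedged way to operationalize that decision rule: bootstrap the gap in Spearman rank correlation (SROCC) between the two predictor families on the shared image set. The function below is a sketch of the statistical test, not a protocol from the paper; `mos`, `viewport_pred`, and `direct_pred` are assumed per-image NumPy score arrays.

```python
# Bootstrap CI for the SROCC gap between a viewport-based predictor and
# a direct ERP predictor evaluated on the same images.
import numpy as np
from scipy.stats import spearmanr

def srocc_gap_ci(mos, viewport_pred, direct_pred, n_boot=10_000, seed=0):
    """95% bootstrap CI for SROCC(viewport predictor) - SROCC(direct predictor)."""
    rng = np.random.default_rng(seed)
    n = len(mos)
    gaps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample images with replacement
        gaps[b] = (spearmanr(mos[idx], viewport_pred[idx]).correlation
                   - spearmanr(mos[idx], direct_pred[idx]).correlation)
    lo, hi = np.percentile(gaps, [2.5, 97.5])
    return lo, hi   # an interval entirely above 0 would favor the viewport-based model
```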

Figures

Figures reproduced from arXiv: 2604.23953 by Jiayu Zhang, Jiebin Yan, Jingwen Hou, Kangcheng Wu, Pengfei Chen, Yuming Fang.

Figure 1. The performance comparison of BOIQA models (MC360IQA [8], …
Figure 2. The framework of the proposed VUGA. It consists of four parts: (a) backbone with frozen parameters for extracting hierarchical features, (b) cross …
Figure 3. The geometry deformation in omnidirectional images and the …
Figure 4. The architecture of spatial distortion-aware attention module. The …
Figure 5. The architecture of channel-aware enhancement module. The …
Figure 6. gMAD competition results between MTAOIQA [32] and VUGA. The MOS of each image is shown in the bracket. (a) Fixed MTAOIQA at the …
Figure 7. gMAD competition results between OIQAND [10] and VUGA. The MOS of each image is shown in the bracket. (a) Fixed OIQAND at the low-quality …
Original abstract

Blind omnidirectional image quality assessment (BOIQA) presents a great challenge to the visual quality assessment community, due to different storage formats and diverse user viewing behaviors. The main paradigm of BOIQA models includes two steps, i.e., viewport generation and quality prediction, which brings an extra computational burden and is hard to generalize to other visual contents (e.g., 2D planar images). Thus, in this paper, we make an attempt to solve these issues. First, we experimentally find that BOIQA can be formulated as a blind (2D planar) image quality assessment (BIQA) problem, i.e., the first step, viewport generation, is no longer needed, which narrows the natural gap between BOIQA and BIQA. Then, we present a new BOIQA approach, which has three merits: viewport-unaware, i.e., it accepts an omnidirectional image in the widely used equirectangular projection format as input without any transformation; unified, i.e., it can also be applied to BIQA; and generalized, i.e., it shows better generalizability against other competitors. Finally, we validate its promise by held-out test, cross-database validation, and the well-established gMAD competition.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that blind omnidirectional image quality assessment (BOIQA) can be reformulated as a standard blind image quality assessment (BIQA) problem by directly processing equirectangular projection (ERP) images without viewport generation. It introduces a unified, viewport-unaware model that accepts ERP inputs, applies to both BOIQA and BIQA, and shows improved generalizability, supported by held-out tests, cross-database validation, and gMAD competition results.

Significance. If the empirical finding that quality-relevant features are preserved and learnable directly from ERP holds beyond the tested regimes, the result would simplify BOIQA pipelines by eliminating viewport simulation and user-behavior modeling, enabling a single model for planar and omnidirectional content. This could reduce computational overhead and improve generalization in 360° image assessment, with the unified formulation offering a practical bridge between the two subfields.

major comments (2)
  1. §4 (Experiments): The central claim that BOIQA reduces to BIQA without viewport generation rests on the held-out, cross-database, and gMAD results. These sections must explicitly report the degree of scene-content overlap, distortion-pipeline similarity, and viewing-angle coverage between training and test sets; without such controls, the equivalence may be an artifact of the evaluation regime rather than a general property.
  2. §3 (Proposed Method): The architecture is described as accepting raw ERP input, yet the manuscript provides no analysis (geometric or ablation) showing that spherical distortions are handled implicitly by the network rather than by learned approximations to viewport mapping. This detail is load-bearing for the 'viewport-unaware' claim.
minor comments (2)
  1. Abstract: The phrase 'we experimentally find' is used without any quantitative performance deltas or baseline comparisons; adding one or two key metrics would make the summary more informative.
  2. Introduction (notation): ERP is introduced without an explicit definition on first use; a brief parenthetical expansion would aid readers outside the omnidirectional imaging community.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help clarify the presentation of our central claims. We address each major comment below and indicate the revisions we will make to the manuscript.

Point-by-point responses
  1. Referee: §4 (Experiments): The central claim that BOIQA reduces to BIQA without viewport generation rests on the held-out, cross-database, and gMAD results. These sections must explicitly report the degree of scene-content overlap, distortion-pipeline similarity, and viewing-angle coverage between training and test sets; without such controls, the equivalence may be an artifact of the evaluation regime rather than a general property.

    Authors: We agree that explicit quantification of these factors strengthens the interpretation of the results. In the revised manuscript we will add a dedicated paragraph (or short subsection) in §4 that reports: (i) scene-content overlap via dataset-level descriptions and, where feasible, semantic similarity statistics between training and test splits; (ii) distortion-pipeline similarity, noting that cross-database experiments already employ entirely independent acquisition and distortion pipelines; and (iii) viewing-angle coverage for the held-out and gMAD protocols. These additions will make clear that the observed equivalence is not an artifact of uncontrolled overlap. revision: yes

  2. Referee: §3 (Proposed Method): The architecture is described as accepting raw ERP input, yet the manuscript provides no analysis (geometric or ablation) showing that spherical distortions are handled implicitly by the network rather than by learned approximations to viewport mapping. This detail is load-bearing for the 'viewport-unaware' claim.

    Authors: The viewport-unaware claim rests on the fact that the model receives unaltered ERP images and performs no explicit viewport sampling or spherical-to-planar transformation at inference time. While the original submission did not contain a dedicated geometric analysis or ablation isolating implicit handling of spherical distortion, the empirical evidence (particularly the unified performance on both omnidirectional and planar content, together with strong cross-database and gMAD results) indicates that the network learns the necessary mappings directly from ERP. In the revision we will insert a concise discussion of this point and add a targeted ablation that compares the current architecture against variants that explicitly incorporate viewport-like operations, thereby providing the requested supporting analysis. revision: yes
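For the overlap statistic promised in point (i) of the first response, one simple candidate is the mean nearest-neighbor cosine similarity between training and test image embeddings. The sketch below assumes the embeddings come from some frozen semantic encoder (e.g., an ImageNet backbone); neither the statistic nor the encoder choice is specified by the authors.

```python
# Mean nearest-neighbor cosine similarity between train and test embeddings,
# a rough proxy for scene-content overlap between splits.
import numpy as np

def split_overlap(train_emb, test_emb):
    """train_emb: (N, D), test_emb: (M, D) feature arrays from a frozen encoder."""
    a = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    b = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    sims = b @ a.T                         # (M, N) cosine similarities
    return float(sims.max(axis=1).mean())  # values near 1 flag content leakage
```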

Circularity Check

0 steps flagged

No significant circularity detected; core equivalence claim is an empirical observation, not a derivation that reduces to inputs or self-citations.

Full rationale

The paper states its central finding as an experimental observation: 'we experimentally find that BOIQA can be formulated as a blind (2D planar) image quality assessment (BIQA) problem, i.e., the first step, viewport generation, is no longer needed.' This is supported by held-out tests, cross-database validation, and gMAD competition rather than any parameter-free derivation, geometric proof, or self-referential equations. No load-bearing steps reduce by construction to fitted parameters, self-citations, or ansatzes imported from prior author work. The model is presented as viewport-unaware and unified, with generalization claims resting on empirical results across datasets, not on renaming or self-definitional structures. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit mathematical formulation, so no free parameters, axioms, or invented entities can be identified; the contribution is framed as an empirical reformulation rather than a derivation.

pith-pipeline@v0.9.0 · 5527 in / 1069 out tokens · 54647 ms · 2026-05-08T04:59:44.321946+00:00 · methodology


Reference graph

Works this paper leans on

59 extracted references · 2 canonical work pages · 1 internal anchor

  1. [1] Z. Wang and A. Rehman, “Begin with the end in mind: A unified end-to-end quality-of-experience monitoring, optimization and management framework,” in SMPTE Annual Technical Conference and Exhibition, 2017, pp. 1–11.
  2. [2] Y. Fang, J. Yan, L. Li, J. Wu, and W. Lin, “No reference quality assessment for screen content images with both local and global feature representation,” IEEE Transactions on Image Processing, vol. 27, no. 4, pp. 1600–1610, 2017.
  3. [3] Y. Fang, J. Yan, J. Liu, S. Wang, Q. Li, and Z. Guo, “Objective quality assessment of screen content images by uncertainty weighting,” IEEE Transactions on Image Processing, vol. 26, no. 4, pp. 2016–2027, 2017.
  4. [4] J. Yan, Y. Fang, R. Du, Y. Zeng, and Y. Zuo, “No reference quality assessment for 3D synthesized views by local structure variation and global naturalness change,” IEEE Transactions on Image Processing, vol. 29, pp. 7443–7453, 2020.
  5. [5] Y. Fang, H. Zhu, Y. Zeng, K. Ma, and Z. Wang, “Perceptual quality assessment of smartphone photography,” in IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 3677–3686.
  6. [6] J. Yan, Y. Fang, Y. Yao, and X. Sui, “A survey on recent advances in video quality assessment,” Chinese Journal of Computers, vol. 46, no. 10, pp. 2196–2224, 2023.
  7. [7] J. Yan, Z. Liu, Z. Wang, Y. Fang, and H. Liu, “Towards scalable and efficient full-reference omnidirectional image quality assessment,” IEEE Signal Processing Letters, vol. 32, pp. 2459–2463, 2025.
  8. [8] W. Sun, X. Min, G. Zhai, K. Gu, H. Duan, and S. Ma, “MC360IQA: A multi-channel CNN for blind 360-degree image quality assessment,” IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 1, pp. 64–77, 2019.
  9. [9] T. Wu, S. Shi, H. Cai, M. Cao, J. Xiao, Y. Zheng, and Y. Yang, “Assessor360: Multi-sequence network for blind omnidirectional image quality assessment,” in Advances in Neural Information Processing Systems, 2024, pp. 64957–64970.
  10. [10] J. Yan, J. Rao, X. Liu, Y. Fang, Y. Zuo, and W. Liu, “Subjective and objective quality assessment of non-uniformly distorted omnidirectional images,” IEEE Transactions on Multimedia, vol. 27, pp. 2695–2707, 2025.
  11. [11] J. Yan, K. Wu, J. Chen, Z. Tan, Y. Fang, and W. Liu, “Viewport-unaware blind omnidirectional image quality assessment: A flexible and effective paradigm,” ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 21, no. 5, 2025.
  12. [12] Z. Pan, F. Yuan, J. Lei, Y. Fang, X. Shao, and S. Kwong, “VCRNet: Visual compensation restoration network for no-reference image quality assessment,” IEEE Transactions on Image Processing, vol. 31, pp. 1613–1627, 2022.
  13. [13] C. Chen, J. Mo, J. Hou, H. Wu, L. Liao, W. Sun, Q. Yan, and W. Lin, “TOPIQ: A top-down approach from semantics to distortions for image quality assessment,” IEEE Transactions on Image Processing, vol. 33, pp. 2404–2418, 2024.
  14. [14] S. Su, Q. Yan, Y. Zhu, C. Zhang, X. Ge, J. Sun, and Y. Zhang, “Blindly assess image quality in the wild guided by a self-adaptive hyper network,” in IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 3667–3676.
  15. [15] S. Chen, Y. Zhang, Y. Li, Z. Chen, and Z. Wang, “Spherical structural similarity index for objective omnidirectional video quality assessment,” in IEEE International Conference on Multimedia and Expo, 2018, pp. 1–6.
  16. [16] Y. Zhou, M. Yu, H. Ma, H. Shao, and G. Jiang, “Weighted-to-spherically-uniform SSIM objective quality evaluation for panoramic video,” in IEEE International Conference on Signal Processing, 2018, pp. 54–57.
  17. [17] V. Zakharchenko, K. P. Choi, and J. H. Park, “Quality metric for spherical panoramic video,” in Optics and Photonics for Information Processing X, vol. 9970, 2016, pp. 57–65.
  18. [18] Y. Sun, A. Lu, and L. Yu, “Weighted-to-spherically-uniform quality evaluation for omnidirectional video,” IEEE Signal Processing Letters, vol. 24, no. 9, pp. 1408–1412, 2017.
  19. [19] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
  20. [20] X. Liu, J. Yan, L. Huang, Y. Fang, Z. Wan, and Y. Liu, “Perceptual quality assessment of omnidirectional images: A benchmark and computational model,” ACM Transactions on Multimedia Computing, Communications and Applications, vol. 20, no. 6, 2024.
  21. [21] X. Sui, K. Ma, Y. Yao, and Y. Fang, “Perceptual quality assessment of omnidirectional images as moving camera videos,” IEEE Transactions on Visualization and Computer Graphics, vol. 28, no. 8, pp. 3022–3034, 2021.
  22. [22] X. Sui, Y. Fang, H. Zhu, S. Wang, and Z. Wang, “ScanDMM: A deep Markov model of scanpath prediction for 360° images,” in IEEE Conference on Computer Vision and Pattern Recognition, 2023, pp. 6989–6999.
  23. [23] H. Li and X. Zhang, “MFAN: A multi-projection fusion attention network for no-reference and full-reference panoramic image quality assessment,” IEEE Signal Processing Letters, vol. 30, pp. 1207–1211, 2023.
  24. [24] H. Jiang, G. Jiang, M. Yu, Y. Zhang, Y. Yang, Z. Peng, F. Chen, and Q. Zhang, “Cubemap-based perception-driven blind quality assessment for 360-degree images,” IEEE Transactions on Image Processing, vol. 30, pp. 2364–2377, 2021.
  25. [25] Q. Cai, M. Li, D. Ren, J. Lyu, H. Zheng, J. Dong, and Y.-H. Yang, “Spherical pseudo-cylindrical representation for omnidirectional image super-resolution,” in AAAI Conference on Artificial Intelligence, vol. 38, no. 2, 2024, pp. 873–881.
  26. [26] D. Chen, T. Wu, K. Ma, and L. Zhang, “Toward generalized image quality assessment: Relaxing the perfect reference quality assumption,” in IEEE Conference on Computer Vision and Pattern Recognition, 2025, pp. 12742–12752.
  27. [27] L. Kang, P. Ye, Y. Li, and D. Doermann, “Convolutional neural networks for no-reference image quality assessment,” in IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1733–1740.
  28. [28] M. Song, C. Chen, W. Song, and Y. Fang, “UNI-IQA: A unified approach for mutual promotion of natural and screen content image quality assessment,” IEEE Transactions on Circuits and Systems for Video Technology, pp. 1–1, 2025.
  29. [29] M. Yu, H. Lakshman, and B. Girod, “A framework to evaluate omnidirectional video coding schemes,” in IEEE International Symposium on Mixed and Augmented Reality, 2015, pp. 31–36.
  30. [30] J. Xu, W. Zhou, and Z. Chen, “Blind omnidirectional image quality assessment with viewport oriented graph convolutional networks,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 5, pp. 1724–1737, 2020.
  31. [31] Y. Fang, L. Huang, J. Yan, X. Liu, and Y. Liu, “Perceptual quality assessment of omnidirectional images,” in AAAI Conference on Artificial Intelligence, vol. 36, no. 1, 2022, pp. 580–588.
  32. [32] J. Yan, J. Rao, J. Chen, Z. Tan, W. Liu, and Y. Fang, “Multitask auxiliary network for perceptual quality assessment of non-uniformly distorted omnidirectional images,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 3, pp. 2782–2793, 2025.
  33. [33] Z. Liu, H. Hu, Y. Lin, Z. Yao, Z. Xie, Y. Wei, J. Ning, Y. Cao, Z. Zhang, L. Dong et al., “Swin transformer v2: Scaling up capacity and resolution,” in IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 12009–12019.
  34. [34] X. Zhu, H. Hu, S. Lin, and J. Dai, “Deformable ConvNets v2: More deformable, better results,” in IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 9308–9316.
  35. [35] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861, 2017.
  36. [36] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
  37. [37] L. Agnolucci, L. Galteri, and M. Bertini, “Quality-aware image-text alignment for opinion-unaware image quality assessment,” arXiv preprint arXiv:2403.11176, 2024.
  38. [38] J. Yan, Z. Tan, Y. Fang, J. Rao, and Y. Zuo, “Max360IQ: Blind omnidirectional image quality assessment with multi-axis attention,” Pattern Recognition, vol. 162, p. 111429, 2025.
  39. [39] W. Sun, K. Gu, S. Ma, W. Zhu, N. Liu, and G. Zhai, “A large-scale compressed 360-degree spherical image database: From subjective quality evaluation to objective model comparison,” in IEEE 20th International Workshop on Multimedia Signal Processing, 2018, pp. 1–6.
  40. [40] H. Duan, G. Zhai, X. Min, Y. Zhu, Y. Fang, and X. Yang, “Perceptual quality assessment of omnidirectional images,” in IEEE International Symposium on Circuits and Systems, 2018, pp. 1–5.
  41. [41] L. Yang, M. Xu, X. Deng, and B. Feng, “Spatial attention-based non-reference perceptual quality prediction network for omnidirectional images,” in IEEE International Conference on Multimedia and Expo, 2021, pp. 1–6.
  42. [42] H. Duan, X. Min, W. Sun, Y. Zhu, X.-P. Zhang, and G. Zhai, “Attentive deep image quality assessment for omnidirectional stitching,” IEEE Journal of Selected Topics in Signal Processing, vol. 17, no. 6, pp. 1150–1164, 2023.
  43. [43] J. Yan, Z. Tan, Y. Fang, J. Chen, W. Jiang, and Z. Wang, “Omnidirectional image quality captioning: A large-scale database and a new model,” IEEE Transactions on Image Processing, vol. 34, pp. 1326–1339, 2024.
  44. [44] L. Yang, H. Duan, L. Teng, Y. Zhu, X. Liu, M. Hu, X. Min, G. Zhai, and P. Le Callet, “AIGCOIQA2024: Perceptual quality assessment of AI generated omnidirectional images,” in IEEE International Conference on Image Processing, 2024, pp. 1239–1245.
  45. [45] H. Lin, V. Hosu, and D. Saupe, “KADID-10k: A large-scale artificially distorted IQA database,” in Eleventh International Conference on Quality of Multimedia Experience, 2019, pp. 1–3.
  46. [46] N. Ponomarenko, O. Ieremeiev, V. Lukin, K. Egiazarian, L. Jin, J. Astola, B. Vozel, K. Chehdi, M. Carli, F. Battisti, and C.-C. J. Kuo, “Color image database TID2013: Peculiarities and preliminary results,” in European Workshop on Visual Information Processing, 2013, pp. 106–111.
  47. [47] H. Sheikh, M. Sabir, and A. Bovik, “A statistical evaluation of recent full reference image quality assessment algorithms,” IEEE Transactions on Image Processing, vol. 15, no. 11, pp. 3440–3451, 2006.
  48. [48] V. Hosu, H. Lin, T. Sziranyi, and D. Saupe, “KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment,” IEEE Transactions on Image Processing, vol. 29, pp. 4041–4056, 2020.
  49. [49] VQEG, “Final report from the video quality experts group on the validation of objective models of video quality assessment,” http://www.vqeg.org, 2000, accessed: 2021-06-17.
  50. [50] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
  51. [51] W. Zhang, K. Ma, J. Yan, D. Deng, and Z. Wang, “Blind image quality assessment using a deep bilinear convolutional neural network,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 1, pp. 36–47, 2020.
  52. [52] J. Ke, Q. Wang, Y. Wang, P. Milanfar, and F. Yang, “MUSIQ: Multi-scale image quality transformer,” in IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 5148–5157.
  53. [53] G. Qin, R. Hu, Y. Liu, X. Zheng, H. Liu, X. Li, and Y. Zhang, “Data-efficient image quality assessment with attention-panel decoder,” in AAAI Conference on Artificial Intelligence, vol. 37, no. 2, 2023, pp. 2091–2100.
  54. [54] W. Zhang, G. Zhai, Y. Wei, X. Yang, and K. Ma, “Blind image quality assessment via vision-language correspondence: A multitask learning perspective,” in IEEE Conference on Computer Vision and Pattern Recognition, 2023, pp. 14071–14081.
  55. [55] K. Xu, L. Liao, J. Xiao, C. Chen, H. Wu, Q. Yan, and W. Lin, “Boosting image quality assessment through efficient transformer adaptation with local feature enhancement,” in IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 2662–2672.
  56. [56] J. Yan, Z. Tan, J. Rao, L. Wu, Y. Zuo, and Y. Fang, “Computational analysis of degradation modeling in blind panoramic image quality assessment,” ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 21, no. 5, pp. 1–23, 2025.
  57. [57] K. Ma, Z. Duanmu, Z. Wang, Q. Wu, W. Liu, H. Yong, H. Li, and L. Zhang, “Group maximum differentiation competition: Model comparison with few samples,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 4, pp. 851–864, 2018.
  58. [58] Z. Wang and E. P. Simoncelli, “Maximum differentiation (MAD) competition: A methodology for comparing computational models of perceptual quantities,” Journal of Vision, vol. 8, no. 12, pp. 8–8, 2008.
  59. [59] J. Yan, Y. Zhong, Y. Fang, Z. Wang, and K. Ma, “Exposing semantic segmentation failures via maximum discrepancy competition,” International Journal of Computer Vision, vol. 129, pp. 1768–1786, 2021.