Dividing a Screen Content Image (SCI) with complex components into pictorial and textual regions before predicting a quality score is a common approach to Screen Content Image Quality Assessment (SCIQA). However, how to efficiently leverage pictorial and textual features to predict quality scores in the no-reference setting remains an open problem. In addition, statistical analysis reveals that the quality labels of SCIs follow a distribution, so both the distribution of predicted quality scores and the distribution of labels should be considered in SCIQA. This paper proposes a no-reference SCIQA method that unifies pictorial and textual features. One contribution is a dual-branch extraction module with a parameter-free attention convolution block, paired with a joint prediction module: the dual-branch module generates efficient pictorial and textual features, and the joint prediction module maps them to quality scores. Another contribution is a joint distribution loss, which drives the distribution of predicted quality scores as close as possible to the distribution of the labels. Experiments on SCIQA datasets show that the proposed method achieves excellent SCIQA performance and generalization ability.
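The abstract does not specify how the joint distribution loss is formulated. As a minimal sketch of the general idea, assuming the distributions are compared by histogramming scores and measuring KL divergence (the function names and binning scheme below are illustrative assumptions, not the authors' implementation):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions given as probability lists.

    A small eps keeps the logarithm finite when a bin is empty.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def joint_distribution_loss(pred_scores, label_scores, bins=5, lo=0.0, hi=1.0):
    """Illustrative distribution-matching loss (an assumption, not the paper's).

    Histogram both score sets over [lo, hi] and penalize the KL divergence
    from the label distribution to the predicted-score distribution, pushing
    the predicted quality scores toward the label distribution.
    """
    def histogram(scores):
        counts = [0] * bins
        width = (hi - lo) / bins
        for s in scores:
            idx = min(int((s - lo) / width), bins - 1)  # clamp s == hi into last bin
            counts[idx] += 1
        total = len(scores)
        return [c / total for c in counts]

    return kl_divergence(histogram(label_scores), histogram(pred_scores))
```

When predicted scores and labels share the same distribution the loss is zero, and it grows as the two histograms diverge; a practical implementation would use differentiable (soft) histograms so the loss can be backpropagated.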