A Web Demo Interface for Explainable Image Aesthetic Evaluation Using Vision-Language Models
Conference proceedings article
Publication Details
Author list: Viriyavisuthisakul, S.; Yoshida, S.; Sanguansat, P.; Yamasaki, T.
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication year: 2025
Start page: 547
End page: 550
Number of pages: 4
ISBN: 9798350351422; 9798331594657
ISSN: 2770-4327
Languages: English-Great Britain (EN-GB)
Abstract
Image aesthetic assessment (IAA) is the task of evaluating the aesthetic quality of images. It is challenging because aesthetic quality is subjective. To enable automated IAA, a machine needs to understand and explain aesthetics-related composition. Recently, CLIP-IQA was proposed to evaluate image quality using aesthetics-related antonym prompt pairs. Although the model achieves a high correlation with human aesthetic judgment, the reasons behind its scores remain unclear. In this study, we propose an integration of frameworks to analyze in depth the features that influence the aesthetic score. A Light Gradient Boosting Machine (LightGBM) is applied as a regressor to predict the quality score. SHapley Additive exPlanations (SHAP) values are used to evaluate the contribution of each targeted prompt pair. Multimodal large language models (MLLMs) are applied to generate linguistic explanations. The results show that the correlation coefficient increases. Our demo system works with any input image, displaying the SHAP values along with text explanations based on the features users focus on. © 2025 IEEE.
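The pipeline described in the abstract (prompt-pair scores, a regressor, per-feature attributions) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The prompt pairs, similarity values, temperature, baseline, and `toy_regressor` are all hypothetical. A fixed linear function stands in for the trained LightGBM model, and exact Shapley values computed by subset enumeration stand in for the SHAP library.

```python
import math
from itertools import combinations


def pair_score(sim_pos, sim_neg, temperature=0.01):
    """Softmax over the image-text similarities of one antonym prompt pair.

    Returns the weight on the positive prompt, in (0, 1).
    The temperature value is an assumption, not taken from the paper.
    """
    a, b = sim_pos / temperature, sim_neg / temperature
    m = max(a, b)  # subtract the max for numerical stability
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb)


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating all feature subsets.

    Features outside a subset are replaced by their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(subset):
        return model([x[i] if i in subset else baseline[i] for i in range(n)])

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical prompt pairs with made-up (positive, negative) cosine similarities.
PAIRS = ["good/bad photo", "bright/dark photo", "sharp/blurry photo"]
sims = [(0.27, 0.24), (0.25, 0.25), (0.22, 0.26)]
features = [pair_score(pos, neg) for pos, neg in sims]


def toy_regressor(z):
    """Hypothetical stand-in for a trained regressor (not LightGBM)."""
    return 1.0 + 5.0 * z[0] + 2.0 * z[1] + 3.0 * z[2]


baseline = [0.5, 0.5, 0.5]  # assumed "neutral" score for every prompt pair
phi = shapley_values(toy_regressor, features, baseline)

for name, contribution in zip(PAIRS, phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the image and the prediction at the baseline, which is the property that lets each prompt pair be read as pushing the score up or down. In a real system, the similarities would come from a CLIP model and the attributions from a tree-specific explainer applied to the trained regressor.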