A Web Demo Interface for Explainable Image Aesthetic Evaluation Using Vision-Language Models

Conference proceedings article

Authors/Editors

SUPATTA VIRIYAVISUTHISAKUL

Strategic Research Themes

Modeling Design and Optimization (Computational Science and Engineering)

Publication Details

Author list: Viriyavisuthisakul, S.; Yoshida, S.; Sanguansat, P.; Yamasaki, T.

Publisher: Institute of Electrical and Electronics Engineers Inc.

Publication year: 2025

Start page: 547

End page: 550

Number of pages: 4

ISBN: 9798350351422; 9798331594657

ISSN: 2770-4327

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-105025045597&doi=10.1109%2FMIPR67560.2025.00092&partnerID=40&md5=9b92bc8f9ee0e636dbf1ae9f4ce1fe86

Languages: English-Great Britain (EN-GB)


Abstract

Image aesthetic assessment (IAA) is a technique for evaluating the aesthetics of images. It is a challenging task because predicting the aesthetic quality is subjective. To enable automated IAA, the machine needs to understand and explain aesthetic-related composition. Recently, CLIP-IQA was proposed to evaluate image quality based on aesthetic antonym prompt pairs. Although the model achieves a high correlation with human aesthetic judgment, the reasons behind these scores remain unclear. In this study, we propose the integration of frameworks to deeply analyze the features that influence the aesthetic score. To predict the quality score, Light Gradient Boosting Machine (LightGBM) is applied as a regressor. SHapley Additive exPlanations (SHAP) scores are used to evaluate the contribution of each targeted prompt pair. For generating linguistic explanations, multimodal large language models (MLLMs) are applied. The results show that the correlation coefficient increases. Our demo system can work with any input images, displaying the SHAP value along with text explanations based on the features users focus on. © 2025 IEEE.
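The antonym prompt pair idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly described CLIP-IQA formulation, in which an image embedding is compared with the embeddings of a positive and a negative prompt and the two similarities are passed through a two-way softmax. The function names, the toy three-dimensional vectors, and the `scale` temperature are hypothetical placeholders chosen for illustration; they are not taken from the paper, and a real system would obtain the embeddings from a CLIP model. One such score per prompt pair would form the feature vector that a regressor such as LightGBM consumes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def antonym_pair_score(image_emb, pos_emb, neg_emb, scale=100.0):
    """Score in [0, 1]: how strongly the image leans towards the positive prompt.

    `scale` is an assumed temperature applied to the similarities before the
    softmax; it is an illustrative choice, not a value from the paper.
    """
    pos_logit = scale * cosine(image_emb, pos_emb)
    neg_logit = scale * cosine(image_emb, neg_emb)
    # Two-way softmax over (pos, neg), written as a sigmoid of the difference.
    return 1.0 / (1.0 + math.exp(neg_logit - pos_logit))


# Toy embeddings (hypothetical); real ones would come from a CLIP encoder.
image = [0.9, 0.1, 0.2]
good_photo = [1.0, 0.0, 0.0]  # stands in for a positive prompt embedding
bad_photo = [0.0, 1.0, 0.0]   # stands in for its antonym prompt embedding

score = antonym_pair_score(image, good_photo, bad_photo)
print(f"pair score: {score:.3f}")
```

Swapping the positive and negative prompts yields the complementary score, so each pair contributes a single interpretable feature whose contribution can later be attributed with SHAP.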
