A Feature Engineering Approach to Improve Clustering-Based Persona Generation

Conference proceedings article


Publication Details

Author list: Wongabut, T., Ninrutsirikun, U., Nukoolkit, C., Lavangnananda, P., Warasup, K., and Arpnikanondt, C.

Publication year: 2025

Start page: 111

End page: 117

Number of pages: 7

URL: https://www.seai.org/

Languages: English-United States (EN-US)


Abstract

Personas are widely recognized as essential tools in user-centered design and human-computer interaction, enabling designers to deeply understand target users’ behaviors, goals, and needs. With the increasing complexity and scale of digital systems, automated persona generation has emerged as a promising solution to streamline persona creation by leveraging user data, clustering algorithms, and large language models. Despite its potential, current methods face several limitations, including inadequate feature engineering, a lack of context-specific customization, and limited validation of persona relevance in real-world applications. This study aims to enhance the effectiveness of automated persona generation within the context of educational digital services by proposing a feature engineering-driven clustering approach. Using K-means clustering combined with dimension-based feature construction, we evaluate the clusters through silhouette analysis and assess the quality of personas based on cluster representativeness. The results demonstrate improved clustering cohesion and more representative persona profiles compared to baseline methods. The study contributes a structured methodology for generating data-driven personas tailored to educational environments, which benefits UX designers. However, limitations include the reliance on survey-based datasets and the scope confined to higher education in Thailand. Future research will explore the generalizability of the proposed approach across different domains, conduct cross-cultural validation of persona models, further assess persona quality through expert evaluations, and investigate the use of alternative large language models to enhance the quality and relevance of generated personas.
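
As an illustration only, the pipeline named in the abstract (dimension-based feature construction, K-means clustering, silhouette analysis) might be sketched as follows. This is not the authors' implementation: the survey responses are synthetic, the dimension names `motivation` and `tech_use` are hypothetical, and averaging the items within each dimension is an assumed construction scheme that the abstract does not specify.

```python
import math
import random

def build_dimension_features(responses, dimensions):
    """Collapse raw survey items into one averaged feature per dimension.
    Averaging is an assumption; the abstract does not give the scheme."""
    return [
        [sum(row[i] for i in items) / len(items) for items in dimensions.values()]
        for row in responses
    ]

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm with random initial centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(
                [sum(col) / len(members) for col in zip(*members)]
                if members else centroids[c]
            )
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids

def silhouette_score(points, labels):
    """Mean silhouette coefficient; assumes at least two non-empty clusters."""
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if j != i and labels[j] == labels[i]]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Synthetic 5-point Likert responses: four items, six respondents.
responses = [
    [5, 4, 1, 2], [4, 5, 2, 1], [5, 5, 1, 1],
    [1, 2, 5, 4], [2, 1, 4, 5], [1, 1, 5, 5],
]
# Hypothetical mapping from dimension name to item indices.
dimensions = {"motivation": [0, 1], "tech_use": [2, 3]}

features = build_dimension_features(responses, dimensions)
labels, centroids = kmeans(features, k=2)
score = silhouette_score(features, labels)
print("labels:", labels)
print("silhouette:", round(score, 3))
```

Each resulting centroid could then serve as the numeric profile from which a persona description is drafted. In practice, a library implementation such as scikit-learn's `KMeans` and `silhouette_score` would normally replace the hand-written functions above.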


Keywords

ChatGPT; Data clustering; Human-Computer Interaction; Large language models; User Persona


Last updated on 2025-18-07 at 18:05