DISCOVERING LEARNER PERSONAS IN AI-ASSISTED ENGLISH LANGUAGE LEARNING USING COSINE-BASED CLUSTERING: IMPLICATIONS FOR PERSONALIZED SUPPORT IN GCC CONTEXTS
Abstract
The rapid expansion of artificial intelligence (AI)-assisted English language learning tools has introduced substantial variability in learner outcomes, owing to differences in behavioural patterns, task engagement, and usage strategies among non-native learners, particularly within Omani educational contexts. This heterogeneity creates a methodological challenge: identifying consistent learner profiles without relying on predefined or subjective labels. This study investigates the effectiveness of unsupervised cosine-based clustering in identifying distinct learner personas in AI-assisted English learning environments. It uses a dataset of 15,000 learner interaction records obtained from Kaggle, incorporating demographic attributes, behavioural features, task modalities, and learning outcome indicators. A structured experimental methodology is employed, beginning with baseline Euclidean K-Means clustering, followed by dimensionality reduction using Singular Value Decomposition (SVD), and then clustering with cosine similarity across multiple algorithms, including K-Means, Gaussian Mixture Models, Agglomerative Clustering, and BIRCH. The results reveal that cosine-based K-Means clustering (k = 6) achieves a Silhouette Score of 0.678 compared to 0.10 for baseline Euclidean clustering, an absolute improvement of 0.578 and a score roughly 6.8 times the baseline. Compared to SVD-based Euclidean clustering (Silhouette = 0.41), cosine similarity raises the Silhouette Score by approximately 65%, while the Davies–Bouldin Index decreases to 0.56 and the Calinski–Harabasz Index increases to 33,074. The findings indicate that cosine-based unsupervised modelling effectively identifies distinct learner personas, demonstrating that learning-gain variations are driven by behavioural interaction patterns rather than usage intensity alone.
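The three-stage comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes scikit-learn, substitutes synthetic data from `make_blobs` for the Kaggle dataset, and approximates cosine-based K-Means by L2-normalizing rows before standard K-Means, a common stand-in that the abstract does not confirm. The helper name `cluster_and_score`, the feature count, and `n_components=5` are hypothetical choices. The last lines only redo the arithmetic on the scores reported in the abstract.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler, normalize


def cluster_and_score(X, k, cosine, seed=0):
    """Run K-Means and return (validity scores, labels).

    Assumption: "cosine" K-Means is approximated by scaling each row to
    unit length, so Euclidean distance between rows tracks cosine distance.
    """
    if cosine:
        X = normalize(X)  # L2-normalize each row
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    metric = "cosine" if cosine else "euclidean"
    scores = {
        "silhouette": silhouette_score(X, labels, metric=metric),
        "davies_bouldin": davies_bouldin_score(X, labels),
        "calinski_harabasz": calinski_harabasz_score(X, labels),
    }
    return scores, labels


# Synthetic stand-in for the learner interaction records (hypothetical sizes).
X_raw, _ = make_blobs(n_samples=600, n_features=12, centers=6, random_state=0)
X_scaled = StandardScaler().fit_transform(X_raw)

# Stage 2: dimensionality reduction with SVD (component count is illustrative).
X_svd = TruncatedSVD(n_components=5, random_state=0).fit_transform(X_scaled)

baseline, _ = cluster_and_score(X_scaled, k=6, cosine=False)   # Euclidean baseline
svd_euclid, _ = cluster_and_score(X_svd, k=6, cosine=False)    # SVD + Euclidean
svd_cosine, labels = cluster_and_score(X_svd, k=6, cosine=True)  # SVD + cosine

for name, scores in [("baseline", baseline),
                     ("svd+euclidean", svd_euclid),
                     ("svd+cosine", svd_cosine)]:
    print(name, {key: round(float(val), 3) for key, val in scores.items()})

# Arithmetic on the Silhouette Scores reported in the abstract.
reported = {"euclidean": 0.10, "svd_euclidean": 0.41, "svd_cosine": 0.678}
abs_gain = reported["svd_cosine"] - reported["euclidean"]   # about 0.578
ratio = reported["svd_cosine"] / reported["euclidean"]      # about 6.78
rel_gain = (reported["svd_cosine"] - reported["svd_euclidean"]) / reported["svd_euclidean"]  # about 0.65
```

The scores printed for the synthetic data will not match the reported values, since they depend entirely on the data. The sketch only shows how the three configurations and the three validity indices fit together.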
JEL Classification Codes: G32, F65, L66, L25, M41.
This work is licensed under a Creative Commons Attribution 4.0 International License.