Automatic Assessment of Short Answers Using Latent Semantic Analysis

  • Teodora Mihajlov, University of Belgrade


Implementing technology in a modern-day classroom is an ongoing challenge. In this paper, we present a system for the automatic assessment of student answers using Latent Semantic Analysis (LSA), a method built on the assumption that words with similar meanings appear in the same contexts. The system is intended for use within digital lexical flashcards for L2 vocabulary acquisition in a CLIL classroom. The results indicate that while LSA performs well in creating semantic spaces for longer texts, it falls somewhat short in detecting topics in answers and word definitions. The answers were classified using k-nearest neighbors (KNN), for both binary and multinomial classification. The KNN classification results are as follows: precision P=0.73, recall R=1.00, and F1=0.85 for binary classification, and P=0.50, R=0.47, and F1=0.46 for the multinomial classifier. These results should be interpreted with caution, given the small training and test datasets.
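
For readers less familiar with the reported metrics, the following minimal sketch shows how precision, recall, and F1 are derived from confusion counts. The function name `precision_recall_f1` and the counts used below are hypothetical: the counts are not taken from the paper and were chosen only because they round to the binary figures quoted above.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw confusion counts."""
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (NOT from the paper), for illustration only.
p, r, f1 = precision_recall_f1(tp=11, fp=4, fn=0)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")  # prints "P=0.73 R=1.00 F1=0.85"
```

Note that F1 computed from the rounded values P=0.73 and R=1.00 would come out at about 0.84, so a reported 0.85 is consistent with F1 having been computed from unrounded values. For the multinomial classifier, an F1 below both P and R would be consistent with per-class averaging, though the abstract does not state which averaging scheme was used.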

How to Cite
MIHAJLOV, Teodora. Automatic Assessment of Short Answers Using Latent Semantic Analysis. Infotheca - Journal for Digital Humanities, [S.l.], v. 23, n. 1, p. 78-106, sep. 2023. ISSN 2217-9461. Available at: <>. Date accessed: 22 apr. 2024. doi: