The Power of Normalised Word Vectors for Automatically Grading Essays

Robert Williams
InSITE 2006  •  Volume 6  •  2006
Latent Semantic Analysis, when used for automated essay grading, scores essays against domain knowledge using document word count vectors. Words in the domain knowledge documents and essays are counted, and Singular Value Decomposition is applied to reduce the dimensions of the semantic space. Near neighbour vector cosines and other variables are then used to calculate an essay score. This paper discusses a technique for computing word count vectors in which the words are first normalised using thesaurus concept index numbers. This approach yields a vector space of 812 dimensions, does not require Singular Value Decomposition, and reduces the computational load. The cosine between the vectors for the student essay and a model answer proves to be a very powerful independent variable when used in regression analysis to score essays. An example of its use in practice is discussed.
Automated Essay Grading, Latent Semantic Analysis, Singular Value Decomposition, Normalised Word Vectors, Electronic Thesaurus, Multiple Regression Analysis.
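The normalisation and cosine steps described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the thesaurus entries, the function names concept_vector and cosine, and the sample sentences are all invented for illustration, and the toy thesaurus has 4 concepts rather than the 812 reported in the paper. The regression step that turns the cosine into an essay score is not shown.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus mapping each word to a concept index number.
# These entries are assumptions for illustration only.
TOY_THESAURUS = {
    "big": 1, "large": 1, "huge": 1,
    "dog": 2, "hound": 2,
    "run": 3, "sprint": 3,
    "fast": 4, "quick": 4,
}


def concept_vector(text, thesaurus=TOY_THESAURUS):
    """Count concept index numbers for the words in text.

    Words missing from the thesaurus are skipped (an assumption of this sketch).
    """
    words = text.lower().split()
    return Counter(thesaurus[w] for w in words if w in thesaurus)


def cosine(u, v):
    """Cosine between two sparse count vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector has no direction; treat as no similarity
    return dot / (norm_u * norm_v)


model_answer = "the big dog can run fast"
student_essay = "a huge hound will sprint quick"

# Different surface words, same concepts, so the two vectors coincide.
score_feature = cosine(concept_vector(model_answer), concept_vector(student_essay))
print(score_feature)
```

Because synonyms collapse onto the same concept index, the two sentences share no words yet produce identical concept vectors. In the approach the abstract describes, a cosine of this kind would serve as one independent variable in a regression model.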