Automated Essay Scoring Versus Human Scoring: A Comparative Study

Authors

  • Jinhao Wang, South Texas College
  • Michelle Stallone Brown, Texas A&M University–Kingsville

Keywords:

Automated essay scoring, human raters, group mean scores, WritePlacer, Texas Higher Education Assessment, One-Way Repeated Measures ANOVA, Paired Samples t test, technology, computer

Abstract

The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by AES and by human raters. Data collection included two standardized writing tests: WritePlacer Plus and the Texas Higher Education Assessment (THEA) writing test. The research sample of 107 participants was drawn from a Hispanic-serving institution in South Texas. A one-way repeated-measures ANOVA and follow-up paired-samples t tests were conducted to examine the group mean differences. Results indicated that the mean score assigned by IntelliMetric™ was significantly higher than the faculty human raters’ mean score on the WritePlacer Plus test, and that the IntelliMetric™ mean score was also significantly higher than the THEA mean score assigned by human raters from National Evaluation Systems. A statistically significant difference also existed between the human raters’ mean score on WritePlacer Plus and the human raters’ mean score on THEA. These findings did not corroborate previous studies that reported non-significant mean score differences between AES and human scoring.
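
As a rough illustration of the analysis described in the abstract (not the authors' actual code), the sketch below runs a one-way repeated-measures ANOVA across three scoring conditions and follow-up paired-samples t tests in Python with NumPy and SciPy. The score arrays are hypothetical placeholders; the study's real data are not reproduced here.

    # Minimal sketch of the reported analysis: one-way repeated-measures ANOVA
    # plus follow-up paired-samples t tests. Scores below are simulated
    # placeholders, not the study's data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 107  # sample size reported in the abstract

    # Hypothetical scores for the same participants under three scoring conditions.
    intellimetric_wp = rng.normal(7.0, 1.2, n)  # IntelliMetric, WritePlacer Plus
    human_wp = rng.normal(6.5, 1.2, n)          # faculty human raters, WritePlacer Plus
    human_thea = rng.normal(6.2, 1.2, n)        # NES human raters, THEA writing test

    def repeated_measures_anova(*conditions):
        """One-way repeated-measures ANOVA for k conditions on the same subjects."""
        data = np.column_stack(conditions)  # shape (n_subjects, k)
        n_subj, k = data.shape
        grand_mean = data.mean()

        ss_conditions = n_subj * ((data.mean(axis=0) - grand_mean) ** 2).sum()
        ss_subjects = k * ((data.mean(axis=1) - grand_mean) ** 2).sum()
        ss_total = ((data - grand_mean) ** 2).sum()
        ss_error = ss_total - ss_conditions - ss_subjects

        df_conditions = k - 1
        df_error = (k - 1) * (n_subj - 1)
        f_stat = (ss_conditions / df_conditions) / (ss_error / df_error)
        p_value = stats.f.sf(f_stat, df_conditions, df_error)
        return f_stat, p_value

    f_stat, p_value = repeated_measures_anova(intellimetric_wp, human_wp, human_thea)
    print(f"Repeated-measures ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

    # Follow-up paired-samples t tests for each pair of scoring conditions.
    pairs = [
        ("IntelliMetric vs. human raters (WritePlacer)", intellimetric_wp, human_wp),
        ("IntelliMetric vs. human raters (THEA)", intellimetric_wp, human_thea),
        ("Human raters (WritePlacer) vs. human raters (THEA)", human_wp, human_thea),
    ]
    for label, a, b in pairs:
        t_stat, p = stats.ttest_rel(a, b)
        print(f"{label}: t = {t_stat:.2f}, p = {p:.4f}")

In practice, the three pairwise comparisons would only be interpreted after a significant omnibus F, and a multiple-comparison correction (e.g., Bonferroni) would typically be applied to the paired tests.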

Published

2007-10-18

How to Cite

Wang, J., & Brown, M. S. (2007). Automated Essay Scoring Versus Human Scoring: A Comparative Study. The Journal of Technology, Learning and Assessment, 6(2). Retrieved from https://ejournals.bc.edu/index.php/jtla/article/view/1632

Issue

Vol. 6 No. 2 (2007)

Section

Articles