Les Perelman, former director of undergraduate writing at MIT, has been a persistent critic of machine-scored writing on tests. He has previously demonstrated that students can outwit the machines and game the system. He created a machine called BABEL, or Basic Automatic B.S. Essay Language Generator. He says that the computer cannot distinguish between gibberish and lucid writing.
He wrote the following as a personal email to me, and I post it with his permission.
Measurement Inc., which uses Ellis Page’s PEG (Project Essay Grade) software to grade papers, all but concedes that students in classrooms where the software has been used have been using the BABEL generator or something like it to game the program. Neither vendor mentions that the same software is also being used to grade high-stakes state tests and, in the case of Pearson, is being considered by PARCC to grade Common Core essays.
What is meant by a “good faith” essay?
It is important to note that although PEG software is extremely reliable in terms of producing scores that are comparable to those awarded by human judges, it can be fooled. Computers, like humans, are not perfect.
PEG presumes “good faith” essays authored by “motivated” writers. A “good faith” essay is one that reflects the writer’s best efforts to respond to the assignment and the prompt without trickery or deceit. A “motivated” writer is one who genuinely wants to do well and for whom the assignment has some consequence (a grade, a factor in admissions or hiring, etc.).
Efforts to “spoof” the system by typing in gibberish, repetitive phrases, or off-topic, illogical prose will produce illogical and essentially meaningless results.
Also, both PEG Writing and Pearson’s WriteToLearn concede in buried FAQs that their probabilistic grammar checkers don’t work very well.
PEG Writing by Measurement Inc.
PEG’s grammar checker can detect and provide feedback for a wide variety of syntactic, semantic and punctuation errors. These errors include, but are not limited to, run-on sentences, sentence fragments and comma splices; homophone errors and other errors of word choice; and missing or misused commas, apostrophes, quotation marks and end punctuation. In addition, the grammar checker can locate and offer feedback on style choices inappropriate for formal writing.
Unlike commercial grammar checkers, however, PEG only reports those errors for which there is a high degree of confidence that the “error” is indeed an error. Commercial grammar checkers generally implement a lower threshold and, as a result, may report more errors. The downside is that they also report a higher number of “false positives” (errors that aren’t errors). Because PEG factors these error conditions into scoring decisions, we are careful not to let “false positives” prejudice an otherwise well-constructed essay.
Pearson WriteToLearn
The technology that supports grammar check features in programs such as Microsoft Word often returns false positives. Since WriteToLearn is an educational product, the creators of this program have decided, in an attempt to not provide students with false positives, to err on the side of caution. Consequently, there are times when the grammar check will not catch all of a student’s errors.
MS Word used to produce a significant number of false positives, but in its current versions Microsoft appears to have raised the probabilistic threshold, so that the checker now underreports errors.
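The threshold trade-off described in both FAQs can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the candidate phrases, the confidence scores, the `Candidate` type, and the `report` function are invented for illustration and do not reflect how PEG, WriteToLearn, or Microsoft Word actually work. It only shows that raising the confidence threshold removes false positives at the cost of missing real errors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical flag raised by a probabilistic grammar checker."""
    text: str
    confidence: float    # checker's estimated probability that this is an error
    is_real_error: bool  # ground truth, as a human editor would judge it


# Invented examples; the scores are assumptions, not measurements.
CANDIDATES = [
    Candidate("their going home", 0.95, True),
    Candidate("me and him went", 0.70, True),
    Candidate("could of been", 0.55, True),
    Candidate("to boldly go", 0.60, False),
    Candidate("data is", 0.40, False),
]


def report(candidates, threshold):
    """Return (false_positives, missed_errors) for a given threshold."""
    flagged = [c for c in candidates if c.confidence >= threshold]
    false_positives = sum(1 for c in flagged if not c.is_real_error)
    missed_errors = sum(
        1 for c in candidates if c.is_real_error and c.confidence < threshold
    )
    return false_positives, missed_errors


if __name__ == "__main__":
    for threshold in (0.3, 0.5, 0.9):
        fp, missed = report(CANDIDATES, threshold)
        print(f"threshold={threshold}: false positives={fp}, missed errors={missed}")
```

With these invented scores, a low threshold flags everything, including two non-errors, while a high threshold flags only the most obvious mistake and silently passes the other two.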