The College English Test Band 4 (CET4) is a national English proficiency test in China. Thousands of college students take the examination each year. In addition to multiple-choice questions, the test includes an essay-writing task, so scoring the essays effectively and efficiently has become an active research topic. To address this problem, LI Yali and YAN Yonghong of the Institute of Acoustics, Chinese Academy of Sciences, carried out a series of studies and developed an automated essay scoring system for CET4.
The system scores an essay on several components: surface features, grammar checking, sentence quality, and whether the essay is off-topic. The surface features include the number of words, the number of sentences, the average word length, and the average sentence length. For grammar checking, the researchers use two bigram models trained on the reference corpus, one over words and one over part-of-speech tags. The sentence scoring component uses the proportion of short part-of-speech tag sequences that match the reference corpus, together with rule-based sentence error detection.
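As a rough illustration of the surface features and the word-level bigram check described above, the following minimal sketch may help. It is not the researchers' implementation: the function names, the regular-expression tokenization, and the add-one smoothing are all assumptions made for the example, and the part-of-speech bigram model is omitted because it would need a tagger.

```python
import math
import re
from collections import Counter


def surface_features(essay):
    """Compute the four surface features named in the text (hypothetical helper)."""
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = re.findall(r"[A-Za-z']+", essay)
    n_words, n_sentences = len(words), len(sentences)
    return {
        "num_words": n_words,
        "num_sentences": n_sentences,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / n_sentences if n_sentences else 0.0,
    }


def train_bigram(reference_sentences):
    """Count word bigrams in a reference corpus (assumed: whitespace tokens)."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in reference_sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])            # every token that starts a bigram
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def avg_log_prob(sentence, model):
    """Average add-one-smoothed bigram log-probability; higher means more fluent."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    total = sum(
        math.log((bigrams[p] + 1) / (contexts[p[0]] + vocab_size)) for p in pairs
    )
    return total / len(pairs)
```

Under these assumptions, a sentence whose word order resembles the reference corpus receives a higher average log-probability than a scrambled one, which is the kind of signal a grammar-checking component could pass on to the final scoring step.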
To detect off-topic essays, they use two approaches: one simply compares the keywords of the topic with those of the essay, and the other is a content vector analysis model. Finally, the researchers use linear regression to combine the component scores into a final score. On real CET4 data, the system achieves a precision of 70.125% when a deviation of up to two points from the human score is tolerated, and an average deviation of 1.955 from the human scores.
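The two off-topic checks mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the stopword list, the thresholds, the use of raw term-frequency vectors with cosine similarity, and the rule for combining the two checks are all hypothetical choices, not details reported for the CET4 system.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "of", "a", "an", "and", "is", "to", "in", "for", "i", "my"}


def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def keyword_overlap(topic, essay):
    """Fraction of topic keywords that also occur in the essay."""
    topic_words = set(tokenize(topic))
    if not topic_words:
        return 0.0
    return len(topic_words & set(tokenize(essay))) / len(topic_words)


def cosine_similarity(text_a, text_b):
    """Cosine similarity between term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def is_off_topic(topic, essay, references, overlap_min=0.2, cosine_min=0.1):
    """Flag an essay only when both checks fail (assumed decision rule)."""
    best = max((cosine_similarity(essay, r) for r in references), default=0.0)
    return keyword_overlap(topic, essay) < overlap_min and best < cosine_min
```

Here `references` stands for a small set of on-topic essays. A real system would likely weight terms (for example by inverse document frequency) and tune the thresholds on scored data.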