Statistician Mikhail Simkin has made a great discovery. In a much-publicized paper titled "Scientific evaluation of Charles Dickens," published in the Journal of Quantitative Linguistics, he argues there's no real difference between Charles Dickens, widely considered one of the greatest English-language writers, and his lesser contemporary Edward Bulwer-Lytton, most famous for writing the sentence "It was a dark and stormy night."
Simkin bases his assertion on a quiz he created and posted online. He describes the test and its results as follows:
Edward Bulwer-Lytton is the worst writer in [the] history of letters. An annual wretched writing contest was established in his honor. In contrast, Charles Dickens is one of the best writers ever. Can one tell the difference between their prose? To check this I wrote the "Great prose or not?" quiz. It consists of a dozen of [sic] representative literary passages, written either by Bulwer-Lytton or by Dickens. The takers are to choose the author of each quote. ... The average score is 5.78 or 48.2% correct.
Because of the low accuracy of these results ("our quiz-takers lost to a monkey" is how Simkin puts it), our statistician concludes that the works of "the worst writer" and "one of the best writers" are substantially the same.
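The arithmetic behind the "monkey" comparison is easy to check. The sketch below is only an illustration, not Simkin's own calculation: it assumes each of the twelve passages is a two-way choice that a pure guesser gets right half the time, and the names `percent_correct`, `chance_expectation`, and `chance_std` are hypothetical helpers introduced here.

```python
import math

QUESTIONS = 12      # "a dozen" passages, per the quoted description
MEAN_SCORE = 5.78   # average score reported in the quoted passage
P_GUESS = 0.5       # assumption: a blind guess between two authors is right half the time


def percent_correct(mean_score, questions):
    """Convert an average raw score into a percentage of questions answered correctly."""
    return 100.0 * mean_score / questions


def chance_expectation(questions, p):
    """Expected number of correct answers for a taker who guesses every question."""
    return questions * p


def chance_std(questions, p):
    """Standard deviation of a single guesser's score (binomial model)."""
    return math.sqrt(questions * p * (1 - p))


reported = percent_correct(MEAN_SCORE, QUESTIONS)
expected = chance_expectation(QUESTIONS, P_GUESS)
spread = chance_std(QUESTIONS, P_GUESS)

print(f"reported accuracy: {reported:.1f}%")
print(f"score expected from guessing: {expected:.1f} of {QUESTIONS}")
print(f"gap below chance: {expected - MEAN_SCORE:.2f} questions")
print(f"spread of one guesser's score: about {spread:.2f} questions")
```

Under these assumptions the reported average sits about a fifth of a question below the six a guesser would expect. The quoted passage does not say how many people took the quiz, so the sketch makes no claim about whether that gap is statistically meaningful.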
The paper is somewhat cheekily written, and we can offer Simkin the benefit of the doubt and assume he undertook this project merely as a thought experiment (though he appears to be making a career out of this sort of thing: In a previous paper, he "proved" there was no difference between Mozart and Salieri). Even so, given the attention it has received in the British press, the paper needs a serious debunking, and, more importantly, so does its underlying, apparently widespread assumption about the literary canon.
First, hardly anyone argues that Edward Bulwer-Lytton was the worst writer of all time. That someone could even think of making that contention in the age of Twilight and Fifty Shades of Grey boggles the mind. Mediocre he may have been, but a joke contest inspired by seven words that he wrote cannot stand as the sole proof that he was the most awful author ever.
The method of collecting data seems just as shaky. For instance, here's how Simkin backs up his claim that "even educated people can't tell Dickens from Bulwer":
We can address the education issue in a scientific way. Fortunately, the quizzing script records taker's [sic] IP address. From it, one can infer where their computers were located. I selected a subset of scores, which were received by people coming from English-speaking ... universities. ... The average score is 5.76 or 48.0% correct.
So we are meant to accept that this subset is educated because of where the computers were located, even though we have no idea whether the person sitting at any given computer was a professor, a student, or a janitor.
But does the quiz itself have any merit? It's still available online. It's worth noting that only descriptive passages are used; there's virtually nothing involving plot or characterization, even though, as Simkin admits in his paper, these tend to be essential to novels.
This omission puts Dickens, known for strong and unique characterization, at a distinct disadvantage. If any passages had involved the brutal Wackford Squeers, or the grandiloquent Wilkins Micawber, or the grotesque Miss Havisham, quiz-takers just might have done better. Dickens's characters are very hard to mistake for anyone else's characters.
Nonetheless, I took the quiz, and scored 92 percent. Here's how: