What Happens to English When You Subtract the Letter 'E'
One 1939 book written entirely without the language's most common character uses more "G" words and fewer prepositions.
Some novels are known for their creative plots and compelling characters. And then there is Gadsby, a 50,000-word novel written in 1939 by American author Ernest Vincent Wright. It’s mainly known for having been written without “e,” the most common letter in the English language.
David Taylor, a blogger with an interest in language and data analysis, has gone through the full text of Gadsby to see how dropping “e” affects the way English is written.
Removing “e” is a major challenge. You can’t use “the.” Many pronouns, including “he,” “she,” and “they,” are unavailable. Here’s the first paragraph of the book, which gives a hint of just how awkward and difficult things get without English’s workhorse letter:
If youth, throughout all history, had had a champion to stand up for it; to show a doubting world that a child can think; and, possibly, do it practically; you wouldn’t constantly run across folks today who claim that “a child don’t know anything.” A child’s brain starts functioning at birth; and has, amongst its many infant convolutions, thousands of dormant atoms, into which God has put a mystic possibility for noticing an adult’s act, and figuring out its purport.
To see how this kind of writing changes the language, Taylor compared Gadsby to the Brown University Corpus, a linguistics storehouse containing about 1 million words of American text printed in 1961. He calculated the “log-likelihood keyness” for words in the novel, a statistical measure linguists use to check that a difference in word frequency between two texts is not simply random. Using that test, he identified the words that punch highest above their weight, and those whose usage suffers the most, when “e” is taken out of the equation:
Word Frequency in Gadsby Versus the Brown University Corpus
Taylor explains that the words at the bottom of that chart, like “of,” “to,” and “in,” are used much less frequently because they are often used next to “e” words, in particular “the.” Think of “most of the time,” “to the store,” or “in the house.”
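For readers curious about the arithmetic behind a keyness score, here is a minimal sketch of the log-likelihood statistic as it is commonly defined in corpus linguistics. It compares how often a word appears in each text with how often it would be expected to appear if both texts used it at the same rate. Whether Taylor computed it in exactly this form is an assumption, and the function names and the counts in the usage lines are invented for illustration; they are not figures from his analysis.

```python
import math


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a: occurrences in the study text (here, the novel)
    b: occurrences in the reference corpus
    c: total words in the study text
    d: total words in the reference corpus
    """
    # Expected counts if the word were used at the same rate in both texts.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # A zero count contributes nothing (0 * ln 0 is treated as 0).
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def direction(a, b, c, d):
    """Report whether the word is over- or under-used in the study text."""
    return "overused" if a / c > b / d else "underused"


# Made-up counts, purely for illustration: a word seen 300 times in a
# 50,000-word text and 20,000 times in a 1,000,000-word reference corpus.
score = log_likelihood(300, 20_000, 50_000, 1_000_000)
print(round(score, 1), direction(300, 20_000, 50_000, 1_000_000))
```

The score itself is always zero or positive, so it says how surprising a difference is but not which way it points. That is why the sketch reports the direction separately.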
He also calculated the relative frequency of the other 25 letters as they are used in Gadsby, compared with the Brown corpus.
How Removing "E" Changes the Letters Used
Of all the letters in the alphabet, “g” gains the most ground while “f” falls furthest. “Of,” the biggest single-word loser, certainly has something to do with the drop in “f” frequency. Other common “f” words, like “for,” “if,” and “from,” also tend to sit next to “e” words.
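A letter-by-letter comparison like this one can be sketched in a few lines. The version below works out each letter's share of all the letters in a text, then divides the share in one text by the share in another, so a value above 1 means the letter gained ground. The function names are hypothetical, and dividing raw shares is an assumption about the method; Taylor may have normalized the counts differently.

```python
from collections import Counter


def letter_shares(text):
    """Return each letter's share of all a-z letters in the text."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}


def share_ratio(study_text, reference_text, skip="e"):
    """Share in the study text divided by share in the reference text.

    Letters missing from the reference text are left out, as is the
    skipped letter (by default "e", which the novel never uses).
    """
    study = letter_shares(study_text)
    reference = letter_shares(reference_text)
    return {
        letter: study.get(letter, 0.0) / share
        for letter, share in reference.items()
        if letter != skip
    }


# Tiny illustrative strings; a real comparison would load the full texts.
ratios = share_ratio("a child can think", "the child can think of the store")
for letter, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(letter, round(ratio, 2))
```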
And here are the top “g” words. You might already have guessed the first:
"G" Words That Appear Most in Gadsby
In short, it’s so difficult to craft any unit of words without using our most common linguistic symbol, that doing so dramatically impacts vocabulary.